Advancing Life & Work

MLCommons™ for Machine Learning Benchmarking Launches with HPE as a Founding Member

As of December 3, 2020, the leading machine learning (ML) benchmark MLPerf™ operates under a new non-profit organization called MLCommons™. MLCommons™ is an open engineering consortium that brings together academia and industry to develop the MLPerf™ benchmarks, best practices, and publicly available datasets. Benchmarks like MLPerf™ serve a critical function by creating a shared understanding of performance and progress. These standardized benchmarks let consumers and manufacturers compare how various products perform on a level playing field, improving competition and innovation in the marketplace and helping the whole industry focus on the right problems, moving everyone forward.


As a founding member of MLCommons™, HPE is strategically positioned to help the marketplace set clear benchmarks for how machine learning performance is measured and to help our customers make more informed decisions about their AI infrastructure. Previously, no such standards existed, and consumers were left with many questions, including:

  • What is the best hardware and software to run these workloads?
  • Is storage important and when do CPUs become a bottleneck?
  • What is the role of memory and do I need to buy the most expensive GPU?
  • Do I need an ultra-fast interconnect between GPUs to run typical deep learning workloads?

MLPerf™ was established in 2018, building on earlier benchmarking efforts across industry and academia. A collaboration among a large number of companies, including HPE, start-ups, and universities produced multiple standardized deep learning benchmarks that are now widely recognized in the market. The creation of MLCommons™ is the next step in this evolution: creating even better benchmarks for the marketplace.

“HPE joined MLPerf™ as a supporting organization and became a founding member of MLCommons™ because of our expertise in creating hardware optimized for deep learning workloads,” says Sergey Serebryakov, Hewlett Packard Labs senior research engineer. “HPE benchmark and performance engineers have been running deep learning benchmarks and optimizing our systems for many years, and we would like to help shape future benchmarks that represent the real-world workloads of our customers.”

Serebryakov has been working with the MLPerf™ Best Practices working group on MLCube, announced today, which reduces friction in machine learning by making models easily portable and reproducible (for example, across different clouds, or between cloud and on-premises stacks). Jacob Balma, an HPC AI engineering researcher at HPE and co-chair of the MLPerf™ HPC working group, has helped develop deep learning benchmarks for high performance computing (HPC) systems that expose file system I/O, communication bottlenecks, and convergence differences at scale across hardware platforms.

The MLPerf™ HPC benchmark suite includes two benchmarks that capture key characteristics of scientific ML workloads on HPC systems, such as volumetric (3D), multi-channel scientific data and dataset sizes in the 5–9 TB range. The first results, announced on November 18, 2020, included submissions from several organizations whose systems appear on the TOP500 supercomputer list.

In the ever-changing world of machine learning, artificial intelligence (AI), and HPC, HPE and MLCommons™ will continue to work closely together to support common, like-for-like benchmarks and to develop and share public datasets and best practices that will accelerate innovation, lift all boats, and increase ML’s positive impact on society.


Curt Hopkins
Hewlett Packard Enterprise

About the Author


Managing Editor, Hewlett Packard Labs