HPE delivers several world records in latest MLPerf® Inference benchmarks
HPE achieved top results in MLPerf®1 Inference v5.1 benchmarks across various AI workloads with HPE ProLiant and HPE Cray servers, showcasing our commitment to AI innovation and benchmarking excellence.
MLCommons MLPerf® Inference v5.1 is out, and Hewlett Packard Enterprise demonstrated leadership in AI inferencing with fourteen #1 results across multiple scenarios in the Datacenter and Edge categories, spanning workloads from computer vision (object detection) to speech recognition and LLM text generation.
HPE is committed to benchmarking excellence, including through its membership in MLCommons, an independent engineering consortium that offers an objective way of measuring performance across technology vendors through standardized AI benchmarks.
The latest MLCommons results1, 2 are further proof that HPE solutions deliver the performance our customers need to address the demanding requirements of AI workloads—no matter the size of the model or whether the customer is doing AI training, fine-tuning, or inferencing. We offer a robust set of AI inferencing solutions, spanning from compact, versatile systems for any edge environment all the way to at-scale data centers.
Figure 1. #1 HPE results in MLPerf®1 Inference v5.1 benchmarks
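For readers curious how these numbers are produced: MLPerf® Inference results are driven by MLCommons' LoadGen, which exercises the system under test with scenario-specific traffic patterns (Offline measures batch throughput; Server measures latency-bounded queries per second). Below is a minimal illustrative harness using the mlperf_loadgen Python bindings from the MLCommons inference repository. It is a sketch, not HPE's submission code: the placeholder callbacks, sample counts, and duration are our assumptions, and a real submission wires in the actual model, dataset, and accuracy mode.

```python
# Minimal MLPerf LoadGen harness sketch (illustrative only).
# Requires the mlperf_loadgen package built from the MLCommons
# inference repository; the "model" here is a placeholder.
import array
import mlperf_loadgen as lg

def issue_queries(query_samples):
    # LoadGen hands us query samples; a real SUT runs the model here.
    for qs in query_samples:
        result = array.array("B", [0])           # placeholder inference output
        addr, _ = result.buffer_info()
        lg.QuerySamplesComplete([lg.QuerySampleResponse(qs.id, addr, len(result))])

def flush_queries():
    pass  # a real SUT would drain any batching queues here

def load_samples(indices):
    pass  # a real QSL stages dataset samples into memory

def unload_samples(indices):
    pass

settings = lg.TestSettings()
settings.scenario = lg.TestScenario.Offline      # or lg.TestScenario.Server
settings.mode = lg.TestMode.PerformanceOnly
settings.min_duration_ms = 1000                  # keep the sketch short

sut = lg.ConstructSUT(issue_queries, flush_queries)
qsl = lg.ConstructQSL(1024, 1024, load_samples, unload_samples)
lg.StartTest(sut, qsl, settings)                 # writes mlperf_log_* result files
lg.DestroyQSL(qsl)
lg.DestroySUT(sut)
```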
Superior performance for AI-driven recommendation and speech recognition with HPE ProLiant Compute servers
With eight #1 rankings across various categories, the HPE ProLiant Compute portfolio has once again demonstrated exceptional results, particularly with the HPE ProLiant Compute DL380a Gen12, HPE ProLiant DL385 Gen11, and HPE ProLiant Compute DL384 Gen12 servers. These results reaffirm HPE’s unwavering commitment to AI innovation and delivering groundbreaking performance for modern data workloads.
The HPE ProLiant Compute DL380a Gen12 emerged as the standout performer, achieving seven #1 rankings and reinforcing its position as a benchmark champion. Notably, the server excelled in MLPerf®1 Inference v5.1 Deep Learning Recommendation Model (DLRM) benchmarks, setting the standard for performance in AI-driven recommendation systems. Among its accomplishments, the DL380a secured four #1 spots among servers pairing Intel® Xeon® processors with NVIDIA GPUs, as shown in this chart:
Figure 2. #1 HPE ProLiant DL380a Gen12 results3 in MLPerf®1 Inference v5.1 benchmarks
In addition, the DL380a achieved two overall #1 spots in the DLRM-v2-99 and DLRM-v2-99.9 benchmarks (Server scenario) with 65,021 and 41,357 queries/second per GPU, respectively4. This builds on the earlier success of the DL380a Gen12 in MLPerf®1 Inference v5.0 Datacenter DLRM benchmarks, where it performed 57% better than the next-best submission in the DLRM-v2-99 Offline scenario.5 In this new round, v5.1, it continued to dominate, outperforming the next-best competitor by 29% in the DLRM-v2-99 Server test and showcasing consistent excellence across multiple benchmark iterations.
Building on its exceptional performance in DLRM benchmarks, the DL380a Gen12 further demonstrated its versatility and leadership in large language model (LLM) workloads, which are critical for generative AI applications such as natural language processing and conversational AI. In the MLPerf®1 Inference v5.1 Llama3.1 8B test (Server scenario), the DL380a Gen12 claimed the top spot among systems with eight PCIe-based GPUs, delivering an impressive 46,060.0 tokens/second6 and outperforming Cisco's UCS C845A M8, which posted 38,696.9 tokens/second7, by 19%.

Additionally, in the MLPerf®1 Inference v5.0 Llama2 70B benchmarks (Offline scenario), the DL380a Gen12 secured #1 rankings in both the 99 and 99.9 accuracy tests.8 Both the DL380a Gen12 and Dell's PowerEdge XE7745, each equipped with eight NVIDIA L40S GPUs, competed in this benchmark, with the DL380a delivering 3655.89 tokens/second9 to Dell's 3481.53 tokens/second.10 These results highlight the DL380a's ability to excel across diverse AI inference tasks, setting a high standard for performance and reliability in LLM workloads.
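The percentage margins quoted in this post are simple ratios of the published throughput figures. A quick sanity check using the numbers cited above (the helper function is ours, not part of MLPerf):

```python
# Verify relative-performance margins from the published throughput figures
# (numbers taken from this post's citations).
def margin(ours: float, theirs: float) -> float:
    """Percent by which `ours` exceeds `theirs`."""
    return (ours / theirs - 1.0) * 100.0

# Llama3.1 8B, Server scenario: DL380a Gen12 vs. Cisco UCS C845A M8
print(f"Llama3.1 8B Server: +{margin(46060.0, 38696.9):.0f}%")   # ~19%

# Llama2 70B, Offline scenario: DL380a Gen12 vs. Dell PowerEdge XE7745
print(f"Llama2 70B Offline: +{margin(3655.89, 3481.53):.0f}%")   # ~5%
```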
Making its debut in MLPerf®1 Inference benchmarks, the HPE ProLiant DL385 Gen11 immediately secured a #1 spot, delivering the best per-GPU performance for a PCIe-based system in the new speech-to-text Whisper benchmark, with 3962.78 samples/second per GPU,11 when equipped with NVIDIA H200 NVL 141GB GPUs. The DL385 Gen11 can scale GPU capacity without compromising performance, making it a cost-effective choice for emerging AI workloads in modern data centers.
Unique GPU configurations
In addition to the eight #1 results achieved by HPE ProLiant Compute servers, HPE submitted three unique GPU configurations for benchmark tests, as follows:
- HPE was the only company to submit a server utilizing the NVIDIA RTX PRO 6000 Blackwell Server Edition GPU in this round of MLPerf®1 Inference testing. The HPE ProLiant Compute DL380a Gen12, equipped with this innovative GPU, demonstrates HPE’s commitment to leveraging cutting-edge technology to deliver exceptional AI inference performance.
- The HPE ProLiant DL384 Gen12, equipped with NVIDIA GH200 NVL2 accelerators, stood out in the DLRM-v2-99 benchmark, achieving 161,030 queries per second12 (Server scenario) and 174,456 samples per second (Offline scenario),12 making HPE the only vendor to submit results with this advanced configuration.
- In the edge scenario, the HPE ProLiant ML30 Gen11 displayed strong throughput performance in RetinaNet benchmark tests, delivering 258.352 samples/second and a MultiStream latency of 29.96 ms13 on a single NVIDIA RTX 4000 Ada 20GB GPU. The HPE ProLiant ML30 Gen11, combined with the NVIDIA RTX 4000 Ada 20GB, gives engineers and designers the ability to work efficiently with inference applications such as object detection, image segmentation, and point painting.
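A note on the per-GPU numbers quoted above (for example, the Whisper and DLRM results): MLPerf® publishes system-level throughput, and per-accelerator figures are derived by dividing by the submission's GPU count. A trivial sketch of that normalization follows; the 4-GPU system and its throughput are hypothetical values chosen to illustrate the arithmetic, not data from the submission logs.

```python
# Per-GPU normalization sketch. MLPerf reports system-level throughput;
# per-accelerator figures divide that by the accelerator count.
def per_gpu(system_throughput: float, gpu_count: int) -> float:
    return system_throughput / gpu_count

# Hypothetical example: a 4-GPU system at 15,851.12 samples/s overall
print(f"{per_gpu(15851.12, 4):.2f} samples/s per GPU")  # 3962.78
```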
For all the details on these leading results, check out this infographic.
Versatile, scalable HPE Cray XD670 achieves top results in object detection, Q&A, LLM text generation, and speech recognition
Specially optimized for service providers and large AI model builders, HPE Cray XD670, featuring eight NVIDIA H200 SXM Tensor Core GPUs, delivered six #1 results14 in these latest MLPerf®1 Inference v5.1 tests, compared to other systems featuring NVIDIA H200 SXM GPUs, in the following areas:
- Computer vision—object detection: #1 in RetinaNet (Offline) with 14,996.6 samples per second15, reaffirming this platform’s performance leadership in object detection, following a #1 result in this same model during the prior round of benchmark tests.16
- Language—LLM—chat Q&A: #1 in the new Llama3.1-8B (Server and Offline), with 64,914.6 and 66,036.8 tokens per second17, respectively. These results are 12% (Server) and 16% (Offline) higher than the nearest competitor's entry using the same number of NVIDIA H200 SXM GPUs.
- Language—LLM—text generation (question answering, math, and code generation): #1 in Mixtral-8x7B (Server and Offline), delivering 60,955.1 and 62,108.4 tokens per second, respectively.18
- Language—Speech to text: #1 in the new Whisper (Offline) with 34,450.7 samples per second.19
- Language—LLM—chain of thought (CoT) reasoning: the only system featuring NVIDIA H200 GPUs to publish DeepSeek-R1 Offline performance, delivering 8904.83 tokens per second in the Datacenter-Open category.20
Figure 3. #1 HPE Cray XD670 results in MLPerf®1 Inference v5.1 benchmarks
These achievements further reinforce HPE's position as a leader in delivering high-performance compute solutions tailored for AI-driven workloads. In addition to the technology, the HPE AI performance engineering team running the benchmarks has been pivotal to achieving these results. Customers can leverage that same deep understanding of system capabilities, architectural and storage nuances, and fine-tuning to size AI workloads and tune applications for optimal performance in their own environments.
Empowering AI innovation with HPE
Building on the strong results we achieved in the MLPerf®1 Inference v5.0 round, where the HPE ProLiant Compute DL380a Gen12 was a new participant, we look forward to submitting MLPerf®1 benchmark results for yet another new entrant in future tests: the HPE ProLiant Compute XD685. Designed for service providers and large enterprises building and training their own large AI models, this server supports eight NVIDIA Blackwell or NVIDIA H200 SXM GPUs, and we recently announced support for AMD Instinct MI355X GPUs.
HPE is the essential technology partner for customers to unlock their AI ambitions, and the MLPerf®1 benchmark results continue to underscore our leading foundational technology and commitment to innovation. For more details about the MLPerf®1 Inference v5.1 benchmark results, visit the MLCommons website.
Learn more about HPE ProLiant solutions for AI | Infographic
For specific product details, go to HPE ProLiant Compute DL380a Gen12, HPE ProLiant DL385 Gen11, HPE ProLiant Compute XD685, or HPE Cray XD670, or contact your HPE representative.
Footnotes:
1 The MLPerf name and logo are registered and unregistered trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use strictly prohibited. See www.mlcommons.org for more information.
2 MLCommons Releases New MLPerf Inference v5.1 Benchmark Results, MLCommons, September 2025
3 MLPerf® Inference: Datacenter v5.1 Closed. DLRM-v2-99 and DLRM-v2-99.9 benchmarks based on systems utilizing Intel Xeon 6787P processors and NVIDIA H200-NVL-141GB GPUs. Submission IDs 5.1-0050 and 5.1-0080
4 MLPerf® Inference: Datacenter v5.1 Closed. DLRM-v2-99 and DLRM-v2-99.9 benchmarks. Submission ID 5.1-0050
5 MLPerf® Inference: Datacenter v5.0 Closed. DLRM-v2-99 Offline benchmark. Submission ID 5.0-0043
6 MLPerf® Inference: Datacenter v5.0 Closed. Llama3.1-8B benchmark. Submission ID 5.0-0051
7 MLPerf® Inference: Datacenter v5.0 Closed. Llama3.1-8B benchmark. Submission ID 5.0-0011
8 MLPerf® Inference: Datacenter v5.0 Closed. Llama2 70B benchmark. Submission ID 5.0-0038
9 MLPerf® Inference: Datacenter v5.0 Closed. Llama2 70B benchmark. Submission ID 5.0-0046
10 MLPerf® Inference: Datacenter v5.0 Closed. Llama2 70B benchmark. Submission ID 5.0-0018
11 MLPerf® Inference: Datacenter v5.1 Closed. Whisper benchmark. Submission ID 5.1-0052
12 MLPerf® Inference: Datacenter v5.1 Closed. DLRM-v2-99 benchmark. Submission ID 5.1-0053
13 MLPerf® Inference: Datacenter v5.1 Closed. RetinaNet benchmark. Submission ID 5.1-0054
14 MLPerf® Inference: Datacenter v5.1 Closed. Benchmark Suite Results, MLCommons.
15 MLPerf® Inference: Datacenter v5.1 Closed. RetinaNet benchmark (Offline). Submission ID 5.1-0049
16 HPE Cray XD670 achieved the top RetinaNet Offline result in MLPerf® Inference: Datacenter v5.0 Closed tests, compared to other systems featuring eight NVIDIA H100 SXM GPUs. April 2025. Submission IDs 5.0-0039, 5.0-0040
17 MLPerf® Inference: Datacenter v5.1 Closed. Llama3.1-8B benchmark (Server and Offline). Submission ID 5.1-0049
18 MLPerf® Inference: Datacenter v5.1 Closed. Mixtral-8x7B benchmark (Server and Offline). Submission ID 5.1-0049
19 MLPerf® Inference: Datacenter v5.1 Closed. Whisper benchmark (Offline). Submission ID 5.1-0049
20 MLPerf® Inference: Datacenter v5.1 Open. DeepSeek-R1 benchmark (Offline). Submission ID 5.1-0375
By Diana Cortes
Connect with Diana on LinkedIn: linkedin.com/in/diana-cortes-0261631/