Tech Insights

AI insights: Get ready to accelerate time to value

Learn how HPE, NVIDIA, WekaIO, and Mellanox have designed a deep learning architecture that accelerates AI insights.

Deep learning (DL) architectures offer organizations a way to accelerate AI insights, enabling them to process hundreds of millions of data points and generate AI-based analytics without slowing down their systems.

At HPE, we're offering our technology and expertise to data scientists, solution builders, and IT personnel who recognize the need to successfully implement AI projects. We understand the unique needs of organizations that might hesitate to build the complex IT infrastructure needed to deliver AI insights—which is why we've designed solutions that make it easy for them.

A scalable, shared AI storage solution

To accelerate AI training and inferencing, we've collaborated with our partners to develop a scalable, shared storage solution that serves neural network workloads. Engineers from HPE, NVIDIA, WekaIO, and Mellanox designed a DL architecture that provides high performance for DL training and validation workflows.

We built a benchmark configuration that can handle wide-ranging use cases, such as autonomous-vehicle development, fraud detection, video-surveillance services, and tissue classification in medical images. What we've learned is that successful AI projects working with petascale data require not only the right supporting hardware and software, but also the right methodology. This methodology involves multiple key steps, such as cleansing and preprocessing the data so that it is ready for training a DL model.

Once the model is trained, data scientists must validate it to ensure it meets production inference requirements. In the development phase, the model must also be tested in a batch inferencing or simulated environment.

What HPE DL infrastructure can offer businesses

It's important that organizations make the right decisions when it comes to balancing components, such as the number of GPUs, servers, network interconnects, and storage types (local or shared). With our partners, HPE has integrated several technologies into a DL infrastructure solution that helps optimize those organizational decisions.

This DL architecture includes the HPE Apollo 6500 Gen10 system, which provides GPUs, fast GPU interconnects, high-bandwidth fabric, and a configurable GPU topology to handle different workloads. The HPE Apollo 6500 Gen10 system supports up to eight NVIDIA Tesla V100 SXM2 32 GB GPU modules. Powered by NVIDIA Volta architecture, the Tesla V100 is a GPU designed to accelerate AI, high-performance computing, and graphics.

For networking capabilities, Mellanox switches, cables, and network adapters provide high performance for an HPE Apollo 6500 Gen10 system in a DL solution. Mellanox offers Ethernet and InfiniBand interconnects for high-performance GPU clusters used for DL workloads and storage interconnects.

To reduce storage compute time, HPE engineers chose WekaIO Matrix, which includes MatrixFS, a flash-optimized parallel file system. For AI architectures, MatrixFS distributes both data and metadata to avoid the hotspots and bottlenecks encountered by traditional scale-out storage solutions.

Along with technology, we also provide guidance from resources such as the HPE Deep Learning Cookbook and the HPE Deep Learning Performance Guide. Additionally, we offer consulting services to help clients develop their AI solutions.

Performance testing for real-world AI projects

To exemplify how our performance tests can speed AI development efforts, HPE and our partners created a test bed for running training and inference workloads using a single HPE Apollo 6500 Gen10 system with eight NVIDIA Tesla V100 SXM2 16 GB GPUs. To compare storage requirements under training and inference scenarios, tests were performed on these two storage configurations:

  • A single NVMe SSD local to the HPE Apollo 6500 Gen10 system, using the XFS file system
  • Eight HPE ProLiant DL360 Gen10 Servers running WekaIO Matrix and containing a total of 32 NVMe SSDs, accessed through the Matrix POSIX client, with the HPE Apollo 6500 connected to the cluster over Mellanox 100 Gbps EDR InfiniBand
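
A comparison like this comes down to how quickly each storage target can deliver data to the GPU server. The following is a minimal, illustrative sketch of a sequential-read timer in Python. It is not the HPE Deep Learning Benchmark Suite, and the function name and any paths passed to it are hypothetical.

```python
import time


def measure_read_throughput(path, block_size=1 << 20):
    """Read `path` sequentially and return (bytes_read, bytes_per_second).

    Illustrative only: a single-threaded read loop, not a full benchmark.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 bypasses Python's own buffering; the OS page cache
    # can still serve repeated reads and inflate the result.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    rate = total / elapsed if elapsed > 0 else float("inf")
    return total, rate
```

Running the same function against a file on the local XFS volume and against the same file on the shared file system mount would give a rough first comparison. A realistic test would also issue parallel reads and use a data set larger than system memory.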

Both storage configurations used a modified ImageNet data set. The images were decomposed to tensors of 16-bit floating-point format. Each tensor file contained 20,000 images.
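
To get a feel for what this format means for storage, the sketch below estimates the size of one tensor file. The 20,000 images per file and the 16-bit values come from the description above. The 3 x 224 x 224 image shape is an assumption (a common ImageNet input size), since the exact tensor shape is not stated here.

```python
import math

IMAGES_PER_FILE = 20_000        # from the test description
BYTES_PER_VALUE = 2             # 16-bit floating point
ASSUMED_SHAPE = (3, 224, 224)   # ASSUMPTION: channels x height x width


def tensor_file_bytes(images_per_file=IMAGES_PER_FILE,
                      shape=ASSUMED_SHAPE,
                      bytes_per_value=BYTES_PER_VALUE):
    """Return the payload size in bytes of one tensor file."""
    values_per_image = math.prod(shape)
    return images_per_file * values_per_image * bytes_per_value


if __name__ == "__main__":
    size = tensor_file_bytes()
    print(f"{size:,} bytes per file (~{size / 1e9:.2f} GB)")
```

Under that assumed shape, each image occupies 301,056 bytes and each tensor file holds about 6.0 GB, so the workload reads large, sequential files rather than many small images.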

For the inference runtime, TensorRT 3.0.4 was used along with the HPE Deep Learning Benchmark Suite, which automates benchmarking, collects performance measurements, and supports various models such as VGGs, ResNets, AlexNet, and GoogLeNet. Numerous tests were performed with various batch sizes to measure I/O demands, storage requirements, and the bandwidth delivered during the training and inference stages.
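
The link between model throughput and I/O demand can be approximated with simple arithmetic: the read bandwidth needed is the number of GPUs times the images each GPU consumes per second times the bytes per image. The sketch below works through that calculation. The eight GPUs and the 100 Gbps link come from the test bed described above. The per-GPU image rate and the bytes per image are hypothetical placeholders, not measured results.

```python
NUM_GPUS = 8                                  # from the test bed
LINK_GBPS = 100                               # from the test bed
ASSUMED_IMAGES_PER_SEC_PER_GPU = 1_000        # HYPOTHETICAL placeholder
ASSUMED_BYTES_PER_IMAGE = 3 * 224 * 224 * 2   # ASSUMPTION: fp16, 3x224x224


def required_read_bandwidth(num_gpus, images_per_sec_per_gpu, bytes_per_image):
    """Return the aggregate read demand in bytes per second."""
    return num_gpus * images_per_sec_per_gpu * bytes_per_image


def link_capacity(gbps):
    """Return the raw line rate of a link in bytes per second."""
    return gbps * 1_000_000_000 // 8


if __name__ == "__main__":
    demand = required_read_bandwidth(
        NUM_GPUS, ASSUMED_IMAGES_PER_SEC_PER_GPU, ASSUMED_BYTES_PER_IMAGE
    )
    capacity = link_capacity(LINK_GBPS)
    print(f"demand:   {demand / 1e9:.2f} GB/s")
    print(f"capacity: {capacity / 1e9:.2f} GB/s (raw line rate)")
    print(f"share of link used: {demand / capacity:.1%}")
```

Substituting measured images-per-second figures for a given model and batch size shows whether a storage target and interconnect can keep the GPUs fed. Protocol overhead means usable bandwidth is lower than the raw line rate.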

The results: faster time to development and AI insights

Large enterprise organizations eager to use AI insights to gain a competitive advantage in their fields should know that, based on the tests performed, the HPE DL architecture reduces overall AI development time. Other test highlights include:

  • The inference benchmarks showed that running on the local NVMe drive created an I/O bottleneck, because the I/O requirements of inference were significantly more demanding than those of training. By contrast, the shared storage solution more than doubled performance at scale.
  • The combination of WekaIO Matrix and parallel I/O to the cluster of NVMe drives and Mellanox 100 Gbps InfiniBand interconnect provided the network bandwidth to service the I/O demands of the inference workload.
  • WekaIO Matrix's performance meets or exceeds that of a local file system as the number of GPUs scales, and is higher than that of traditional NFS file systems. We expect even greater gains when training data sets increase in capacity and clusters scale out beyond a single GPU client.

As the pressure to establish AI projects increases and the demand to gain AI insights becomes a greater business imperative, solutions like our DL architecture will become more critical. Our system is easy to install and can meet the demands of large organizations seeking to better manage their AI projects and mitigate the troubles of building, running, and testing AI data sets.

Take the first step to DL success

For a comprehensive summary of what our DL solutions can do, read more about how this infrastructure can accelerate the time to value of your AI projects and drive AI insights into your business.

Pankaj Goyal
Vice President, HPE AI Business
Hewlett Packard Enterprise

About the Author


Pankaj is building HPE’s Artificial Intelligence business. He is excited by the potential of AI to improve our lives, and believes HPE has a huge role to play. In his past life, he has been a computer science engineer, an entrepreneur, and a strategy consultant. Reach out to him to discuss everything AI @HPE.