Starting your AI journey with a solid partnership: HPE Apollo, WekaFS™ and NVIDIA GDS®
The competitive stakes are high in the race for artificial intelligence (AI) innovations. If you haven’t already started your AI journey, starting now is the key to keeping pace with the competition. While embarking on any journey requires advanced preparation, if your final destination is important enough, ultimately you just have to begin. So how do you get going?
Guest blog by Ken Grohe, President & CRO, WekaIO
No matter what your AI initiative covers, whether autonomous driving, cures for disease, or vaccines for life-changing pandemics like COVID-19, making your AI setup effective and competitive requires two things: a solid foundation of data science talent within your organization, and three essential elements that work together to arm your data scientists with the tools they need for the quick wins they want. What are these essential elements?
- A server platform providing unprecedented performance through accelerated compute with NVIDIA® GPUs, think HPE Apollo 6500
- A fast network, think NVIDIA® Mellanox®
- A modern parallel file system like WekaIO™ to manage the data, implemented by HPE as the HPE Solution for Weka
This solution leverages that fast network and the balanced performance and flexibility of HPE ProLiant DL360 servers to provide an infrastructure on which Weka is, as HPE described in this blog on AI data stores, “superfast” and can best “feed the beast” of data-hungry GPUs. The result is the best utilization of the Apollo 6500s and of the engineers and scientists who use them. These elements can be depicted as a triangle, with all sides fitting together and sitting firmly on your organization’s foundation of talent. (See Figure 1.)
Where AI meets HPC
Let’s step back from the triangle for a moment and look at the changing landscape. Historically, high-performance computing (HPC) and AI were two distinct markets, but now there is a convergence of HPC and AI. Whereas HPC traditionally was focused on a relatively small number of large organizations on the edges of enterprise computing that led crazy-big research projects and used enormous data clusters, these days AI, machine learning (ML), and deep learning (DL) have become HPC in the enterprise. It’s speculated that by 2022 the commercial mainstream market will be in full production for AI/ML. Admittedly, many organizations are just in the budding phases of their initiatives, but most will be putting real resources and energy behind their AI efforts by 2022, and that’s just around the corner.
GPUs are the workhorses of computing
To elaborate upon the first essential element in our isosceles triangle, let’s talk about what's happening in HPC and AI enterprises now. The modern buyer journey involves investing in server platforms that can leverage compute acceleration technologies, like GPUs. AI needs a powerful compute infrastructure to explore, extract, and examine the data to gain deep insights and deliver breakthrough results, and GPUs are at the heart of modern supercomputing.
As the quintessential workhorses and multitaskers, GPUs easily manage the most complex data sets in AI workloads. Platforms like the Apollo 6500 provide the balance of accelerated compute, memory, and high-speed NVLink interconnects to process those workloads with unprecedented performance. This yields not only faster results but also higher quality, since more iterations mean more accuracy.
“GPU performance has continued to grow, data movement becomes increasingly important, and WekaIO has pioneered an impressive modern parallel file system that delivers important capabilities to accelerate AI and workloads at scale.” – Jeff Herbst, Vice President, Business Development, NVIDIA
Bigger, better, faster, stronger
Our second essential element is a fast network. Data centers carry heavy loads as they try to keep up with the growth that’s necessary to stay competitive in the world of AI. Everything is getting bigger: application size, data size, cluster size, compute size, and more. The networking portion is arguably the most difficult.
Nevertheless, networking continues to improve, with high-speed, low-latency solutions replacing aging Fibre Channel and Ethernet links to speed data transfers between your network, servers, and storage systems. We now have 100 Gb/s and 200 Gb/s networking, and that’s where Mellanox lives. HPE is an OEM for Mellanox switches and NICs, so customers can obtain the full solution from HPE. Plus, we hear more and more about long-range plans to create datacenter-scale computing architectures in which the network will become part of the computing fabric. HPE, WekaIO, and NVIDIA Mellanox, along with their partners, have strong relationships built to support each other in meeting the need for networking speed.
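To put those link speeds in perspective, here is a rough back-of-the-envelope sketch in Python. The function name `transfer_seconds` and the 1 TB data set size are illustrative assumptions, not figures from HPE, NVIDIA, or WekaIO, and the calculation ignores protocol overhead, so real transfers would take somewhat longer.

```python
def transfer_seconds(dataset_bytes: float, link_gbps: float) -> float:
    """Idealized time to move a data set over one link.

    Assumes the link runs at its full line rate with no protocol
    overhead, which is an optimistic simplification.
    """
    if dataset_bytes < 0 or link_gbps <= 0:
        raise ValueError("dataset_bytes must be >= 0 and link_gbps must be > 0")
    bits_to_move = dataset_bytes * 8          # bytes -> bits
    bits_per_second = link_gbps * 1e9         # Gb/s -> b/s
    return bits_to_move / bits_per_second


# Hypothetical 1 TB (10**12 bytes) training data set
one_terabyte = 1e12
for speed in (100, 200):
    print(f"{speed} Gb/s link: {transfer_seconds(one_terabyte, speed):.0f} s")
```

Under these assumptions, doubling the link speed halves the ideal transfer time, which is why the network is one full side of the triangle.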
WekaFS™ for the win
If you have great networking, and you have powerhouse compute platforms with workhorse GPUs, you might think you’re set. Think again. A modern file system, our third essential element, is required to get the most out of the other two elements in our triangle. When companies put their GPU technology into production, often they haven’t considered the ability of their storage infrastructures to support their data-hungry beasts. GPUs sit idle because existing storage infrastructures can't get the data to the application servers fast enough. Yes, organizations are spending billions of dollars on their IT infrastructures, but some are implementing the same technologies they’ve bought for years, and with traditional storage file systems layered on top, there’s a bottleneck.
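A simple way to picture the idle-GPU problem is to compare what the GPUs want to consume with what storage can deliver. The Python sketch below is a deliberately simplified model: the function `gpu_utilization` and every number in it are hypothetical and are not measurements of any HPE, NVIDIA, or Weka product. It assumes data delivery is the only limit on GPU activity.

```python
def gpu_utilization(required_gb_per_s_per_gpu: float,
                    num_gpus: int,
                    storage_gb_per_s: float) -> float:
    """Upper bound on the fraction of time GPUs can stay busy.

    Simplified model: if storage delivers less data than the GPUs
    collectively consume, the GPUs idle for the remainder.
    """
    if required_gb_per_s_per_gpu <= 0 or num_gpus <= 0:
        raise ValueError("GPU demand and GPU count must be positive")
    if storage_gb_per_s < 0:
        raise ValueError("storage throughput cannot be negative")
    total_demand = required_gb_per_s_per_gpu * num_gpus
    return min(1.0, storage_gb_per_s / total_demand)


# Hypothetical server: 8 GPUs, each able to consume 2 GB/s of training data
for storage in (4.0, 16.0, 32.0):
    busy = gpu_utilization(2.0, 8, storage)
    print(f"storage at {storage:>4} GB/s -> GPUs busy at most {busy:.0%}")
```

In this toy model, storage that delivers a quarter of the aggregate demand caps the GPUs at a quarter of their potential, no matter how fast the compute or the network is.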
Admittedly, some existing products are fine within standard swim lanes, but modern workloads can require a high-performance, scalable parallel file system that solves today’s biggest storage problems and accelerates modern IO-intensive workloads. They need high bandwidth and metadata-dense performance (mixed workloads with billions of small files). For Global 2000 customers that actively work in AI and ML at scale, WekaFS™ is the only file system to consider because it breaks the bottleneck imposed by current storage file systems. Weka touches revenue-generating enterprise applications, providing first-to-market competitive advantages and moving our customers’ top line by reducing their time to market.
WekaFS is now incorporated into a solution by HPE Complete, HPE’s one-stop shop for validated HPE and third-party partner end-to-end infrastructure solutions. Customers can obtain a bundled solution incorporating WekaFS on the HPE ProLiant DL360 Gen10 server, the 2P/1U dense compute standard with exceptional flexibility and unmatched expandability for multi-workload environments. This solution includes not only the GPU-attached AI/ML solutions being discussed here, but also workloads in Life Sciences and Financial Services.
The HPE Solution for Weka offers customers a total solution, which includes a DL360 server populated with fast NVMe drives, Mellanox networking, and Weka software that manages WekaFS and object tiering. (See Figure 2 for a deployment architecture.)
Let’s face it. If AI is mainstream by 2022, no organization can afford to ignore the bottleneck of its storage file system when it needs access to an extreme amount of data at a high rate of speed. WekaIO’s message of being the choice “for those who solve big problems” resonates with customers because WekaFS delivers storage that’s an order of magnitude faster across mixed workloads at exabyte scale, so it will undoubtedly resonate with future AI enterprise markets as they go to production and as they scale. Moreover, as WekaFS matures to include additional enterprise data management services, the enterprise can "have its cake and eat it too." Customers get their fast performance and their required features. It’s a win-win.
“This high-performance combination enables our customers to accelerate extracting value from their data. By combining our high performing, secure and versatile HPE ProLiant DL360 Gen 10 server with WekaFS, we deliver a compelling storage solution that meets the performance demands of AI/ML, NLP and other GPU-attached workloads.” – Chris Powers, VP, Collaborative Platform Development, HPE Storage and Big Data, HPE
Effectively arm data scientists for success
The triangle is a strong architectural element that has been used since ancient Greek times. Its simple yet powerful design provides a strong structure when built upon a solid foundation. For our discussion, it helps to illustrate any organization’s need to employ three essential elements when embarking on an AI journey and designing an architecture for success.
If you are just beginning your AI journey or if you are looking for a performance file system that effectively arms your data scientists with the tools they need for the quick wins they want, contact HPE and Weka to get started!
Meet guest blogger Ken Grohe, President & CRO, WekaIO. A highly seasoned veteran of the industry, Ken Grohe previously served as President & CRO at Stellus, President of SignNow, SVP and GM of Barracuda Networks, and CRO of Virident, a Western Digital Company. He also had an impressive 25-year career at Dell EMC, finishing as VP and GM with a focus on the global flash business.