HPE Ezmeral: Uncut
Chad_Smykay

Optimize resources by converging HPC and big data

The explosion of data and the ability to spread high-powered computers wherever a company desires has led to new possibilities combining high performance computing (HPC) and big data.

In the past, HPC had its own data pipelines, crafted to prepare data and make it available to workloads that required large numbers of processors. Meanwhile, big data was developed mostly to process, in batch mode, the huge volumes of data that had suddenly become available and could lead to novel insights. The goal was simply to process the data in batches, not to get the job done quickly or efficiently.

Now we have entered an era where we are no longer bound by methods that are slow and do not scale. New technologies allow us to retrieve big data at high speed from object storage and the other forms in which it is stored. These advances have opened the door to companies exploring the idea of combining the massive computing power of HPC workloads with that of big data workloads.

With the right software platform like HPE Ezmeral, companies with extensive HPC and big data workloads can now reallocate their hardware portfolio across the workloads in a flexible way, including allocating GPU resources to workloads as needed. This type of optimization offers the most efficient resource utilization. In other words, you can squeeze every last drop of value out of your computing landscape.

Read on to learn what is now possible for HPC and big data workloads and why companies should take advantage of this new flexibility.

Why HPC and big data have not been combined before now

Until now, HPC and big data workloads have been separate domains unto themselves, often with good reason. The two workloads utilize network resources in contrasting ways. HPC workloads, as their name suggests, require high performance, whereas big data workloads don’t necessarily require the same. HPC workloads typically demand far more CPUs and need a faster network. In the past, combining HPC and big data workloads was also difficult because the HPC side lacked tools for orchestration and load management.

While the broader software world has moved forward and widely adopted popular open-source technologies like Kubernetes, adoption of these technologies in the HPC realm has been slower. But that is changing. With Kubernetes, companies now have a resource scheduler that can work with HPC workloads. This capability offers a level of automation that hasn’t been possible for HPC before and allows companies to move workloads and data easily.

With a containerized infrastructure, a business can take any workload (even one created for a big data processing engine like Spark) and move it to run on HPC infrastructure. The reverse is also true; HPC workloads can run with the same resources that are running that big data Spark workload.
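To make the idea concrete, here is a minimal sketch in Python of how an HPC job and a Spark job could be described as Kubernetes Pod manifests that target the same pool of nodes. It is an illustration under stated assumptions, not HPE Ezmeral's own API: the image names, the `pool` node label, and the resource sizes are hypothetical, and the `nvidia.com/gpu` resource name assumes the NVIDIA device plugin is installed on the cluster.

```python
import json


def pod_manifest(name, image, cpu, memory, gpus=0, node_pool="shared"):
    """Build a minimal Kubernetes Pod manifest as a plain dict.

    The "pool" node label is a hypothetical label chosen for this sketch.
    """
    limits = {"cpu": str(cpu), "memory": memory}
    if gpus:
        # Extended resource name exposed by the NVIDIA device plugin
        # (assumes that plugin is deployed on the cluster).
        limits["nvidia.com/gpu"] = str(gpus)
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Both workload types select the same pool of nodes.
            "nodeSelector": {"pool": node_pool},
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Requests equal limits, which Kubernetes requires
                    # for extended resources such as GPUs.
                    "resources": {"requests": dict(limits), "limits": limits},
                }
            ],
        },
    }


# Hypothetical images and sizes, for illustration only.
hpc_pod = pod_manifest(
    "hpc-solver", "registry.example.com/hpc-solver:latest",
    cpu=32, memory="128Gi", gpus=2,
)
spark_pod = pod_manifest(
    "spark-etl", "registry.example.com/spark-etl:latest",
    cpu=8, memory="32Gi",
)

hpc_json = json.dumps(hpc_pod, indent=2)
```

Because `kubectl apply -f -` accepts JSON as well as YAML, the string in `hpc_json` could be piped to it directly. The point of the sketch is that the only differences between the two manifests are the container image and the resource sizes; the scheduler decides where each one runs.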

Thus, the true benefit of containerized HPC infrastructure comes from the ability to move workloads to where they can be run most efficiently. Companies can reuse their resources with this approach and thus achieve significant cost savings. Workloads are now portable in a way they hadn’t been up until this point.

A solution to combine workloads

This type of containerized approach is now supported with HPE Ezmeral Runtime Enterprise, which in turn supports InfiniBand networks where HPC workloads often run. HPE Ezmeral allows companies to abstract hardware away from the workload, so companies no longer need dedicated hardware for HPC or dedicated hardware for big data workloads. They can have one set of integrated hardware that can be reconfigured within seconds to run either as HPC workloads or big data workloads.

Part of the reason these new capabilities are so promising is that they enable companies to overcome the competing requirements of running HPC and big data workloads. With the ability for the same hardware to do both, companies benefit from simpler infrastructure, efficiencies, and cost savings. HPE Ezmeral gives companies the flexibility to spin up or down hardware resources such as processor, memory, GPU, and networking capabilities as they see fit. This can be done on a minute-by-minute, hourly, or daily basis, which means companies can examine all their workloads and compute resources, then optimize resource utilization across all types of jobs: HPC, big data, and AI/ML.
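A back-of-the-envelope sketch in Python shows why a shared pool can need less hardware than two dedicated clusters. The demand figures below are invented purely for illustration and do not come from any customer: each list holds the number of nodes a workload type needs in four consecutive time windows.

```python
def nodes_needed_siloed(hpc_demand, bigdata_demand):
    """Dedicated clusters: each one must be sized for its own peak."""
    return max(hpc_demand) + max(bigdata_demand)


def nodes_needed_shared(hpc_demand, bigdata_demand):
    """One shared pool: sized for the peak of the combined demand."""
    return max(h + b for h, b in zip(hpc_demand, bigdata_demand))


# Hypothetical node demand per time window (illustrative numbers only).
hpc = [10, 40, 40, 10]
bigdata = [30, 5, 5, 30]

siloed = nodes_needed_siloed(hpc, bigdata)  # 40 + 30 = 70
shared = nodes_needed_shared(hpc, bigdata)  # max(40, 45, 45, 40) = 45
```

In this toy case the two workloads peak at different times, so the shared pool needs 45 nodes where the siloed design needs 70. If both workloads peaked in the same window, the two figures would be equal, so the saving depends entirely on how the demand curves overlap.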

How this works in the real world


An example of what this looks like in practice comes from the energy sector, where an HPE Ezmeral client has taken the first step on their journey to combine such resources. Previously, the business had a rigid HPC infrastructure that did not allow HPC resources to be shared across the customized HPC applications it offers to its customers. Resources were siloed, so users had to reassign hardware manually, and shifting from one customer workload to another required days of planning. In partnership with HPE Ezmeral Professional Services and Engineering teams, the company was able to port their application into a container while supporting their CPU, GPU, and memory needs, as well as their current InfiniBand networking needs. The business can now move workloads with a couple of clicks or a few API calls, reallocating resources for HPC workloads in seconds to minutes instead of days or weeks.

Why this matters

With HPE Ezmeral, companies can containerize their HPC applications seamlessly and not lose functionality by doing so. In fact, companies will experience significant advantages with this approach because Kubernetes allows users to run workloads wherever and however they want. By abstracting their big data workloads, HPC workloads, and hardware, they can purchase less hardware, storage, and memory, and put their existing resources to work more efficiently and effectively. As a result, the limits of the past will become a distant memory. It’s time to stop treating HPC and big data as separate computing environments.

Watch the HPE Ezmeral Runtime Enterprise video to learn how companies can support traditional big data applications and modernize HPC applications while sharing compute resources.

Chad 

Hewlett Packard Enterprise
twitter.com/HPE_Ezmeral

linkedin.com/showcase/hpe-ezmeral
hpe.com/software

About the Author

Chad_Smykay

Chad Smykay has an extensive background in operations from his time at USAA as well as at Rackspace (a world-class support organization), where he built shared services solutions. More recently, he has helped implement many production big data and data lake solutions. As an early adopter of Kubernetes in the application space coupled with data analytics use cases, he brings a breadth of experience in application modernization for business use cases.