Accelerate data pipelines to improve data science productivity

Joann Starke · 03-22-2022

Five decades ago, the first single-chip central processing unit (CPU) was introduced, and CPUs quickly became the workhorse of both business and consumer computing. The age of insights and analytics has changed that. With data growing at double-digit rates year over year and the adoption of analytics, AI, and ML technologies, the phrase “I need more power, Scotty” is more than a demand from Captain Kirk in a Star Trek movie.

Present-day data lakes, machine learning, model training, and production environments are dense and complex, with datasets that span different technologies and frameworks. In the age of analytics, speed is everything, and processing data across that collection of technologies and frameworks can constrain system performance.

Graphics processing units, or GPUs, were originally designed for rapid calculations, primarily for video rendering. But as open-source tools such as Apache Spark™ were developed and analytic models grew in both size and density, GPUs quickly became the preferred source of computational power for processing trillions of objects or millions of database rows. On CPUs, processing these volumes could take months.

HPE Ezmeral is a hybrid analytics and data science platform designed to drive the modernization critical to becoming a data-driven organization. NVIDIA and HPE Ezmeral have teamed up to create a powerful solution that lowers risk and reduces time to insight. Watch this video to learn more. NVIDIA RAPIDS Accelerator and Triton Inference Server are a part of the HPE GreenLake Marketplace, which provides customers with a choice of certified partner solutions such as Dremio, Presto, and Starburst.

In this age of advanced analytics, Apache Spark has become the industry standard for developers and data science teams to create the machine learning models data-driven organizations need. The combination of Apache Spark, HPE Ezmeral, and NVIDIA RAPIDS Accelerator allows customers to quickly act on untapped value by simplifying data collection and analytics. Simplifying and accelerating data pipelines translates into greater productivity for analytics and data science teams. It means they can get to the projects that have been sitting on the “to do” list, projects that could make your business more competitive or deliver the next business innovation.

Recently, Enterprise Strategy Group (ESG) evaluated how HPE Ezmeral Runtime Enterprise and NVIDIA RAPIDS Accelerator can increase the performance of Apache Spark workloads. In this technical brief, ESG demonstrated how HPE Ezmeral Runtime Enterprise, NVIDIA RAPIDS Accelerator, and A100 GPU-based clusters can accelerate Apache Spark workloads by 29X. This type of performance translates into better fraud detection, faster financial transactions, or medical research that could cure diseases such as diabetes or Alzheimer’s.
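For readers who want a feel for what turning on the RAPIDS Accelerator looks like in a Spark job, here is a minimal sketch in Python that assembles a `spark-submit` command. The configuration keys are ones documented by Apache Spark and the RAPIDS Accelerator for Apache Spark; the values, the jar path, the script name `etl_job.py`, and the helper `build_spark_submit` are hypothetical and are not the setup ESG tested. Depending on the cluster manager, further settings (for example a GPU discovery script) may also be needed.

```python
import shlex

# Settings documented by Apache Spark and the RAPIDS Accelerator.
# The values are illustrative assumptions, not a tested configuration.
RAPIDS_CONF = {
    "spark.plugins": "com.nvidia.spark.SQLPlugin",  # load the accelerator plugin
    "spark.rapids.sql.enabled": "true",             # let SQL/DataFrame operations run on the GPU
    "spark.executor.resource.gpu.amount": "1",      # assume one GPU per executor
    "spark.task.resource.gpu.amount": "0.25",       # assume four tasks share each GPU
}


def build_spark_submit(app_path, plugin_jar, conf=None):
    """Return a spark-submit argument list (hypothetical helper)."""
    conf = RAPIDS_CONF if conf is None else conf
    args = ["spark-submit", "--jars", plugin_jar]
    # Sort the keys so the generated command is stable between runs.
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    args.append(app_path)
    return args


if __name__ == "__main__":
    # Both paths below are placeholders.
    cmd = build_spark_submit("etl_job.py", "/opt/jars/rapids-4-spark.jar")
    print(shlex.join(cmd))
```

The point of the sketch is that the Spark application itself does not change: acceleration is requested through configuration, and operations the plugin does not support continue to run on the CPU.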

Figure 1: Processing time for 5 TB file

Data science technologies enable companies to leverage analytics, AI, and ML to improve profitability or drive business performance. Customers tell us they need a better approach to harnessing data to accelerate business outcomes.

HPE Ezmeral is a hybrid, open platform that unifies data across different silos into a single, logical data store, enabling companies not only to harness their data but also to increase its quality. On top of that data store, companies can layer best-in-class open-source tools and certified partner solutions, such as those from NVIDIA. These tools increase the productivity of data engineers and data scientists, resulting in insights that drive faster business outcomes.

Register now for Session #42462 at NVIDIA GTC during the week of March 21, 2022 to see how HPE Ezmeral and NVIDIA work together to accelerate time to insights.

Joann Starke

Hewlett Packard Enterprise

twitter.com/HPE_Ezmeral
linkedin.com/showcase/hpe-ezmeral
hpe.com/software
