Optimizing Spark for Cost Savings on HPE Ezmeral
Co-authored by Manuel Hoffmann at Pepperdata and Ka Wai Leung at HPE
HPE Ezmeral Runtime Enterprise provides all the tools needed to build, modernize, deploy, monitor, and manage a wide range of AI and analytics workloads to unleash data’s full potential. The solution is powerful, secure, flexible, and has been widely adopted to drive digital transformation and analytics. HPE also offers Spark Operator 3 as a value-added component on HPE Ezmeral, based on an enhanced downstream version of Apache Spark. HPE combines the power and versatility of Apache Spark with the robust, enterprise-grade HPE Ezmeral Runtime Enterprise to support running analytics at scale against large data sources.
But what do you do once adoption scales to the point where dozens or hundreds of data scientists and data analysts are running massive numbers of Spark applications?
According to a recent survey, a third of enterprises report exceeding their big data IT budgets by 40% or more. Combined with an economic climate in which everyone is asked to do more with less, this creates a tremendous need to increase productivity and lower costs. One way to address the challenge is to ensure applications are not over-provisioned and consume only the resources they actually need. But understanding and predicting an application's resource usage is more art than science and often requires trial and error. When you run hundreds, thousands, or even hundreds of thousands of applications daily, those inefficiencies add up quickly. The answer is autonomous optimization.
HPE is partnering with Pepperdata to bring detailed observability to Spark and to deliver automated, near real-time, autonomous optimization for cluster container resources. This solution is accomplished without the need for Spark developers to change a single line of code.
Developers tend to request excess resources when submitting Spark jobs. While understandable, this habit typically leads to seriously under-utilized Spark clusters: Pepperdata found that only 29.8% of the resources allocated to Spark applications across its customer base are actually used. Accurately estimating Spark resource consumption without automation is not easy, and getting it wrong results in idle resources and a costly cloud bill.
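The gap between allocated and used resources can be made concrete with a small sketch. The application names and memory figures below are hypothetical, chosen only to illustrate how aggregate utilization ends up near the ~30% that Pepperdata reports:

```python
# Hypothetical per-application memory figures (GB), illustrating how
# allocated capacity can dwarf what applications actually consume.
apps = [
    {"name": "etl-daily",   "allocated_gb": 64,  "used_gb": 18},
    {"name": "model-train", "allocated_gb": 128, "used_gb": 40},
    {"name": "adhoc-query", "allocated_gb": 32,  "used_gb": 9},
]

total_allocated = sum(a["allocated_gb"] for a in apps)
total_used = sum(a["used_gb"] for a in apps)
utilization = total_used / total_allocated

# In this sketch, utilization lands just under 30%.
print(f"Cluster memory utilization: {utilization:.1%}")
```

The remaining ~70% of allocated memory sits idle yet is still billed, which is exactly the waste autonomous optimization targets.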
Pepperdata Platform Spotlight helps reduce idle capacity by showing both the allocated resources and the used resources for a given Spark job execution.
The example pictured above demonstrates that very few of the allocated resources are being used.
Understanding Spark applications and the cluster resources they use is one step in the right direction. Pepperdata’s Capacity Optimizer takes it one step further: it minimizes container resource waste by autonomously tuning Spark containers in the background, completely transparent to developers. This frees them to focus on what they were hired to do: developing applications that support business goals.
Capacity Optimizer pairs with the HPE Ezmeral Kubernetes scheduler, telling it how much more load each host in the cluster can handle. Ordinarily, the scheduler compares requested allocations against a host's nominal capacity to decide whether the host can take on more executors. But if applications are using less than 30% of their allocated resources, for example, the host can typically accommodate many more of them.
In addition to allocated resources, Capacity Optimizer also considers used resources and other metrics to make intelligent decisions about a host’s true capacity. If it finds that a host is full in terms of allocated resources, but not full in terms of used resources, it will tell the HPE Ezmeral scheduler to schedule additional Spark executors on it.
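The scheduling idea described above can be sketched as a simple admission check. This is not Pepperdata's or Kubernetes' actual logic; the function name, host fields, and the 90% headroom threshold are all illustrative assumptions:

```python
def can_schedule_executor(host, executor_request_gb, headroom=0.9):
    """Decide whether a host can accept another executor.

    An allocation-only scheduler rejects the executor once nominal
    capacity is exhausted. A usage-aware check (in the spirit of
    Capacity Optimizer) also admits it when real consumption leaves
    room, keeping some headroom against the host's physical capacity.
    """
    allocated_ok = host["allocated_gb"] + executor_request_gb <= host["capacity_gb"]
    used_ok = host["used_gb"] + executor_request_gb <= headroom * host["capacity_gb"]
    return allocated_ok or used_ok

# A host that is "full" on paper (250 of 256 GB allocated) but lightly
# used (75 GB) can still take a 16 GB executor under the usage-aware check.
host = {"capacity_gb": 256, "allocated_gb": 250, "used_gb": 75}
print(can_schedule_executor(host, executor_request_gb=16))  # True
```

The key design point is that the decision consults measured usage, not just the resource requests developers declared at submission time.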
An actual customer screenshot below shows that Capacity Optimizer delivered a 40% savings in instance hours, a peak container uplift of 92%, and an average container uplift of 41%. This equates to significant IT cost savings.
Learn more about HPE Ezmeral and how the HPE Ezmeral ecosystem can help customers take their digital transformation to the next level for their different workloads.
For a free trial of Pepperdata on HPE, please email info@pepperdata.com. Also, check out the Pepperdata interactive demo and Pepperdata webinars and videos.
About the authors:
Manuel Hoffmann leads Pepperdata Partnerships and Business Development. Prior to joining Pepperdata, Manuel was Sr. Director, Strategic Alliances and Partner Development at FICO, where he created the FICO Cloud Center of Excellence dramatically reducing AWS expenses. Prior to FICO, Manuel led global business development, channel sales and marketing functions at early-stage companies. A Swiss native, Manuel holds a BS in electro-mechanical engineering from ECAM (Belgium), a degree in business administration from the University of Leuven (Belgium), and a certificate of international marketing from the University of California, Santa Cruz.
Ka Wai Leung is part of the HPE Software Business Unit’s Partner Enablement team.