Solving the HPC storage challenge

BillMannel

The enterprise is rapidly adopting HPC, which means affordable HPC storage is critical. Learn how the HPE Data Management Framework keeps costs under control without sacrificing performance.

In today's world of digital services, intelligent analytics, and broadly distributed cloud infrastructure, data loads that were once the purview of scientific research and massive government number-crunching are now common in the world of business. Where high-performance computing (HPC) was once a luxury for the enterprise, it's now an important initiative.

But deploying adequate HPC storage poses challenges for the typical enterprise. Scaling up compute and network resources is relatively easy thanks to virtualization, containerization, and other abstraction technologies, but storage is different. For one thing, storage media cannot be virtualized in the conventional sense: every bit of data must occupy physical space on disk, tape, or a flash memory chip, and virtualization can only address network bottlenecks and management tasks. When data volumes grow, so must physical storage capacity.

Managing all that data in an HPC environment also requires increasingly sophisticated tools. Not only must your HPC storage warehouse all this information; it must also categorize it, track it, and retrieve it as quickly as possible. The challenge compounds as automated systems use data to continuously generate more data.

Balancing cost versus accessibility

Ideally, the enterprise can instantly access its data at all times, but this isn't a realistic expectation. Flash memory is the fastest storage medium, but building an all-flash architecture would be prohibitively expensive for even the largest, most well-funded enterprise. Disk and tape storage are cheaper, but they're slower.

The key challenge for the enterprise, then, is to strike the right balance between cost and accessibility. This is the guiding principle behind the HPE Data Management Framework (DMF). More than just a simple management platform, DMF integrates a number of key capabilities designed to maintain peak storage performance at the lowest possible cost in dynamic, high-scale environments.

The key components of HPE DMF drive HPC storage efficiency at a granular level. These include:

  • A file system namespace that can be accessed by users and applications
  • Core server nodes and data mover nodes that ingest file system events, store metadata, and manage the transfer of data to back-end storage systems
  • Object- and cloud-based storage built around standard APIs
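To make that division of labor concrete, here is a minimal Python sketch of the data-mover idea: a component that ingests file system events, records metadata, and copies finished files to back-end storage. It is not DMF's actual implementation or API; every name in it (FileEvent, BACKEND_ROOT, process_events) is hypothetical and the back end is stubbed as a local path.

```python
import shutil
import time
from dataclasses import dataclass
from pathlib import Path

# Hypothetical back-end target; a real deployment would use object or
# cloud storage behind standard APIs rather than a local directory.
BACKEND_ROOT = Path("/backend/object-store")

@dataclass
class FileEvent:
    """A simplified file system event as a data mover might receive it."""
    path: Path
    kind: str        # e.g. "close-write" signals that new data is ready
    timestamp: float

def record_metadata(catalog: dict, event: FileEvent) -> None:
    """Core-server role: keep metadata so files can be located later."""
    stat = event.path.stat()
    catalog[str(event.path)] = {
        "size": stat.st_size,
        "mtime": stat.st_mtime,
        "migrated_at": time.time(),
    }

def move_to_backend(event: FileEvent) -> Path:
    """Data-mover role: copy file contents to back-end storage."""
    target = BACKEND_ROOT / event.path.name
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(event.path, target)
    return target

def process_events(events: list[FileEvent], catalog: dict) -> None:
    """Ingest events, capture metadata, and migrate completed writes."""
    for event in events:
        if event.kind == "close-write":
            record_metadata(catalog, event)
            move_to_backend(event)
```

The point of the sketch is the separation of concerns: the namespace stays visible to users and applications, while metadata capture and data movement happen behind it.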

DMF provides a highly flexible way to manage data and ensure that storage resources are used efficiently without diminishing performance. One of the ways it does this is through effective tiering.

Cost isn't the only reason an all-flash HPC storage environment is impractical; it's also overkill, because not all data needs to be instantly available. The value of any given data set can decline for any number of reasons: program objectives change, or the set is rendered obsolete by newer data. This is why most organizations adopt a three-tiered storage ecosystem, using flash for the most in-demand data, disk for data that's accessed frequently but not constantly, and tape for long-term archiving.

The initial challenge with tiered storage is to determine what data belongs in each tier. In an HPC storage setting, this task is monumental given the size of the data load and the speed required. Many organizations are finding that real-time storage tiering is essential—even for petabyte-scale volumes.

Tier management for efficient HPC storage

HPE DMF addresses this problem by defining tiers and automating the process of stratifying data. Using a built-in policy engine that lets users categorize data across a wide range of parameters (including age, subject, and frequency of access), the system can quickly and continuously pinpoint the appropriate medium. At the same time, it leverages historical analysis to apply metadata to existing volumes so that these data sets can be quickly retrieved from their assigned tiers. The system can also declutter storage by identifying and removing old and inactive data.
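To illustrate how a policy engine might turn parameters like age and access frequency into a tier decision, here is a minimal Python sketch. The thresholds and the assign_tier function are invented for illustration and do not reflect DMF's actual policy syntax or defaults.

```python
import time
from enum import Enum

class Tier(Enum):
    FLASH = "flash"   # hottest data: fastest and most expensive
    DISK = "disk"     # accessed frequently, but not constantly
    TAPE = "tape"     # long-term archive

# Hypothetical policy thresholds; a real policy engine would expose
# many more parameters (subject, project, owner, and so on).
HOT_AGE_SECONDS = 7 * 24 * 3600        # touched within the last week
HOT_ACCESS_PER_DAY = 10                # read often enough to stay on flash
ARCHIVE_AGE_SECONDS = 180 * 24 * 3600  # untouched for roughly six months

def assign_tier(last_access: float, accesses_per_day: float) -> Tier:
    """Pick a tier from age and access frequency, the two example
    parameters called out above."""
    age = time.time() - last_access
    if age <= HOT_AGE_SECONDS and accesses_per_day >= HOT_ACCESS_PER_DAY:
        return Tier.FLASH
    if age >= ARCHIVE_AGE_SECONDS:
        return Tier.TAPE
    return Tier.DISK
```

For example, a file read fifty times a day and touched an hour ago lands on flash, while one untouched for a year falls through to tape; everything in between stays on disk.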

In addition, the system offers a zero-watt function that consolidates data on as few resources as possible and powers down unused devices. This allows the enterprise to reduce the drain on operating budgets due to energy consumption, management tasks, and other factors while maintaining large amounts of storage.
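As a rough illustration of the consolidation idea, the toy sketch below packs data sets onto as few devices as possible with a greedy first-fit pass and reports the devices left idle as power-down candidates. This is not how the zero-watt feature actually makes placement decisions; Device, consolidate, and the capacity model are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Device:
    """A storage device with a fixed capacity, in gigabytes."""
    name: str
    capacity_gb: float
    used_gb: float = 0.0
    datasets: list[str] = field(default_factory=list)

    def fits(self, size_gb: float) -> bool:
        return self.used_gb + size_gb <= self.capacity_gb

def consolidate(datasets: dict[str, float], devices: list[Device]) -> list[Device]:
    """Greedy first-fit placement: pack data onto as few devices as
    possible, largest data sets first, and return the devices left
    empty as candidates for powering down."""
    for name, size_gb in sorted(datasets.items(), key=lambda kv: -kv[1]):
        for device in devices:
            if device.fits(size_gb):
                device.used_gb += size_gb
                device.datasets.append(name)
                break
    return [d for d in devices if not d.datasets]  # idle -> power down
```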

In essence, DMF creates a frictionless storage environment in which data is moved seamlessly from one type of medium to another based on the needs of the user. It solves the cost-versus-accessibility problem by ensuring that the most active, critical data is assigned to the highest-performance tier and relegating older, less urgent files to less costly resources.

In this age of rapid data movement and high-scale infrastructure, the old methods of managing data cannot keep up. Enterprise performance and profitability will become highly dependent on leveraging data to its utmost potential and driving inefficiency out of IT infrastructure.

We have designed HPE DMF to meet these goals, giving all enterprises the ability to compete effectively in the new economy.


Bill Mannel
VP & GM, HPC & AI Segment Solutions

twitter.com/Bill_Mannel
linkedin.com/in/billmannel/
hpe.com/servers

About the Author

BillMannel

As the Vice President and General Manager of HPC and AI Segment Solutions, I lead worldwide business and portfolio strategy and execution for the fastest-growing market segments in HPE's Data Center Infrastructure Group, which includes the recent SGI acquisition and the HPE Apollo portfolio.
