The tale of different data architectures for AI/ML and analytics

Learn how the innovative, two-tier architecture of HPE Data Node for AI provides the maximum level of architectural flexibility to support continuously evolving artificial intelligence and machine learning requirements.


As enterprise digital transformation moves from one stage to another, artificial intelligence (AI) and machine learning (ML) data architectures must evolve accordingly. Companies need a flexible and effective data platform architecture that keeps pace with their evolving AI/ML and analytics needs. HPE Data Node for AI is a data store solution that provides this adaptive approach to AI/ML data architecture.


This post continues the series of blogs dedicated to data stores for AI and advanced analytics. In case you missed them, here are the previous blogs: Choosing the right platform, and HPE and WekaIO provide the superfast data platform you need to train AI models.


A fair amount of debate centers on whether data store infrastructures for AI/ML and analytics should be in the public cloud, on premises, or hybrid. That debate belongs to the past. These scenarios are not in opposition to each other; they are complementary. Data is spreading, and will continue to spread, across clouds, data centers, and even edge devices. Consequently, any data store platform, especially one supporting AI/ML and analytics environments, must be capable of running across all these scenarios together.

The real selection criterion for an AI/ML data platform should not be where the data is, but rather how to obtain the best balance between capacity (cost per GB stored) and performance (cost per GB/s of throughput). Indeed, the specific nature of AI/ML/DL algorithms and GPU-based computers demands huge, cost-effective capacity (training a neural network can require petabytes of data) as well as high throughput (a GPU can crunch data at 10 GB/s or even more). Data platforms for AI/ML/DL workloads therefore have to provide high performance and huge capacity while remaining cost-effective.
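
To make that trade-off concrete, here is a minimal back-of-the-envelope sketch in Python. All capacities, prices, and throughput figures are illustrative assumptions, not HPE or vendor pricing; the point is simply how splitting a data set between a flash performance tier and an object capacity tier changes the cost picture while the performance tier must still sustain the aggregate GPU throughput.

```python
# Back-of-the-envelope comparison of an all-flash vs. a two-tier data platform.
# All figures below are illustrative assumptions, not vendor pricing.

DATASET_TB = 1000              # total data set size: 1 PB
HOT_FRACTION = 0.10            # share of data actively streamed to GPU nodes
GPU_NODES = 8
THROUGHPUT_PER_NODE_GBS = 10   # each GPU node assumed to consume ~10 GB/s

FLASH_COST_PER_TB = 400        # assumed $/TB for the all-flash performance tier
OBJECT_COST_PER_TB = 60        # assumed $/TB for the scale-out object tier

def all_flash_cost(dataset_tb):
    """Everything lives on the performance tier."""
    return dataset_tb * FLASH_COST_PER_TB

def two_tier_cost(dataset_tb, hot_fraction):
    """Only the active working set lives on flash; the rest on object storage."""
    hot = dataset_tb * hot_fraction
    cold = dataset_tb - hot
    return hot * FLASH_COST_PER_TB + cold * OBJECT_COST_PER_TB

required_throughput = GPU_NODES * THROUGHPUT_PER_NODE_GBS
print(f"Aggregate throughput the performance tier must sustain: {required_throughput} GB/s")
print(f"All-flash platform: ${all_flash_cost(DATASET_TB):,.0f}")
print(f"Two-tier platform:  ${two_tier_cost(DATASET_TB, HOT_FRACTION):,.0f}")
```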

But how can you effectively balance all these competing needs?

As shown in my previous blog, data pipeline architecture complexity can range from a single, fast, multi-purpose scale-out file system (SDS) to a combination of a hyper-fast parallel file system and a data lake or object store. But, in general, we can identify three main architectural categories:

  • Integrated multipurpose architecture—A single data store platform providing both a fast file system (e.g. all-flash NFS) and large capacity. This kind of solution is viable and cost/performance effective only for hybrid, small-to-medium-size AI/DL and traditional workloads.

Figure 1. Disaggregated data architecture for AI and analytics

  • Disaggregated (or two-tier) architecture—This architecture combines high-performance and high-capacity data stores. The first platform is a fast scale-out parallel file system containing all active data used for front-end operations (e.g. streaming data to the AI compute nodes). The second platform is a scale-out object store or a data lake, responsible for keeping all the data. Data movement between the two stores, to make the right data available where and when it is needed, is managed manually or through data management systems (a minimal sketch of such a movement follows Figure 2 below). This architecture is very flexible and can handle AI/ML workloads of any size, but the physical separation of the components makes operations more complicated.
  • Integrated (or single-system) architecture—In this architecture, the high-performance parallel data store is logically and physically integrated with a large-capacity store, generally an object-based data store. (Scale-out object stores are usually preferred as the capacity store because of their scalability, rich metadata, and competitive cost.) A single system simplifies infrastructure and operations by hiding data movements internally (e.g. an internal tiering mechanism) or by providing other techniques to manage data movement. Single-system architectures are easier to expand, tune, and manage, but they lack the flexibility of two-tier systems in terms of hybrid installations.

Figure 2. Single-system data architecture for AI/ML and analytics
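
In a disaggregated deployment, moving data between the capacity and the performance tier can be as simple as scripting against the object store's S3-compatible API. The sketch below is a minimal illustration in Python with boto3; the endpoint URL, bucket name, credentials, and mount point are hypothetical placeholders, and in a production environment a data management system would typically orchestrate these transfers.

```python
# Minimal sketch: stage a cold data set from the fast file system to an
# S3-compatible object store, and pull it back before a training run.
# Endpoint, credentials, bucket, and paths are hypothetical placeholders.
import os
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://ring.example.local:8000",   # assumed S3 endpoint of the capacity tier
    aws_access_key_id=os.environ["S3_ACCESS_KEY"],
    aws_secret_access_key=os.environ["S3_SECRET_KEY"],
)

BUCKET = "training-data"          # hypothetical bucket on the capacity tier
FAST_TIER = "/mnt/fast-fs"        # assumed mount point of the parallel file system

def archive(relative_path):
    """Copy a file from the performance tier to the object store."""
    s3.upload_file(os.path.join(FAST_TIER, relative_path), BUCKET, relative_path)

def stage_in(relative_path):
    """Bring a file back to the performance tier before it is needed."""
    local = os.path.join(FAST_TIER, relative_path)
    os.makedirs(os.path.dirname(local), exist_ok=True)
    s3.download_file(BUCKET, relative_path, local)

archive("imagenet/shard-0001.tar")
stage_in("imagenet/shard-0001.tar")
```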

The best of both worlds: HPE AI Data Node—integrated two-tier data storage for deep learning and high-performance computing

  • HPE Data Node for AI is an HPE solution that combines HPE Scalable Object Storage with Scality RING and the WekaFS parallel file system, running on the HPE Apollo 4200 Gen10 Server. HPE Data Node for AI offers an integrated data platform that provides both the capacity tier for huge data volumes and the performance tier supporting the throughput requirements of GPU-based servers.
  • HPE Apollo 4200 Gen10 Server is designed for big data and analytics, software-defined storage, and other storage-intensive workloads that demand density. It provides an optimized hardware platform for both the Weka flash-optimized parallel file system and the Scality RING object store. This configuration combines both software stacks into a single cluster of nodes that has been tested and optimized by HPE labs.
  • WekaFS is a super-fast, flash-optimized parallel file system for AI and technical compute workloads. WekaFS provides automatic tiering of your cold data to Scality RING object storage to deliver low cost and limitless scale.
  • HPE Scalable Object Storage is an HPE solution composed of Scality RING running on the HPE Apollo 4000 System. It combines object storage software and industry-standard servers to provide low-cost, reliable, flexible, centralized management for huge-scale unstructured data. HPE Scalable Object Storage has a lower TCO than traditional SAN and NAS storage at petabyte scale and provides greater data protection for current and future large-scale storage needs.

With HPE Data Node for AI, you get a high-performance, petabyte-scale storage solution with integrated data lifecycle management, providing tiered management by the file system under a single namespace. The data lifecycle management features built into the Weka file system automatically identify colder data and tier it to S3-compatible Scality RING object storage. The entire data set is protected with a distributed data protection scheme across a cluster of servers. This integrated solution is a full-function, high-performance AI file store with an integrated, durable, low-cost object storage tier, offering savings of up to half the infrastructure and operational costs of traditional solutions that deploy two separate storage clusters.
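
Because tiering happens inside the file system under a single namespace, applications simply read files through the POSIX mount: data that has been tiered to the object store is brought back transparently on access. The sketch below illustrates that view from a training job; the mount point and directory layout are assumptions, and PyTorch is used purely as an example consumer.

```python
# Minimal sketch: a training-side view of the single namespace.
# The mount point and directory layout are hypothetical; files that the file
# system has tiered to the object store are retrieved transparently on read.
import glob
import torch
from torch.utils.data import Dataset, DataLoader

MOUNT = "/mnt/weka/datasets/images"   # assumed WekaFS mount on the GPU node

class TensorFileDataset(Dataset):
    """Loads pre-serialized tensors; hot or cold, the path looks the same."""
    def __init__(self, root):
        self.files = sorted(glob.glob(f"{root}/**/*.pt", recursive=True))

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        return torch.load(self.files[idx])   # cold files are rehydrated by the file system

loader = DataLoader(TensorFileDataset(MOUNT), batch_size=64, num_workers=8)
for batch in loader:
    pass  # feed the GPU; sustained throughput comes from the performance tier
```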

HPE Data Node for AI can be implemented in both integrated and disaggregated architectures:

  • Classic two-tier architecture—The HPE DL360 Gen10 servers are the hardware infrastructure tier dedicated to the high-performance, all-flash WekaFS. The HPE Apollo 4200 or 4510 Gen10 servers form the second tier, where HPE Scalable Object Storage runs.

Figure 3. AI Data Node disaggregated (or two-tier) solution

 

  • Integrated architecture—Here, both tier elements are combined into a single, scalable Apollo 4200 cluster. The Apollo 4200 is provisioned with both NVMe flash capacity for the WekaFS tier and HDD scale-out bulk storage for the Scality RING S3 tier, all in a single clustered system. A converged solution minimizes rack footprint, complexity, and power and cooling, and allows a faster deployment.

Figure 4. AI Data Node aggregated solution

 

HPE Data Node for AI: an innovative, two-tier integrated data store solution

HPE Data Node for AI provides the maximum level of architectural flexibility to support continuously evolving AI/ML requirements. Enterprises can quickly start with the integrated HPE Data Node for AI and evolve later to a disaggregated approach when they need more flexibility between performance and capacity stores, or when they need hybrid cloud environments. HPE Data Node for AI allows companies to painlessly align their data platform architecture with the evolution of their AI/ML needs.

Stay tuned for more blogs in this series for further discussions on the HPE data store solutions for AI.

Learn more now

Resources and additional links

Weka file system

Scality RING

HPE Apollo 4200


Andrea Fabrizi
Hewlett Packard Enterprise

twitter.com/HPE_Storage
linkedin.com/showcase/hpestorage/
hpe.com/storage

twitter.com/hpe_hpc
linkedin.com/showcase/hpe-ai/
hpe.com/info/hpc

About the Author


Andrea is Senior Product Manager for Big Data and Analytics Solutions at HPE.
