Re: Seamlessly scaling HPC and AI initiatives with HPE leading-edge technology

BillMannel · ‎06-13-2019

Accelerate your HPC and AI workloads with new products, advanced technologies, and services from HPE.

Key takeaways

Enable on-demand and elastic provisioning of GPUs, allocate right-sized GPU resources for multiple workloads, and deliver significant cost savings with new GPU-as-a-Service from BlueData
Gain business efficiencies, improve deployment, and reduce time and resources with the latest advancements in HPC infrastructure
Solve complex problems faster and more efficiently with HPE next-gen multi-node systems

A growing number of commercial businesses are implementing HPC solutions to derive actionable business insights, to run higher performance applications and to gain a competitive advantage. In fact, according to Hyperion Research, the HPC market exceeded expectations with 6.8% growth in 2018 with continued growth expected through 2023.¹

Complexities abound as HPC becomes more pervasive across industries and markets, especially as you adopt, scale and optimize HPC and AI workloads. HPE is in lockstep with you along your AI journey. We help you get started with your AI transformation and scale more quickly, saving time and resources.

HPE and our global partners collaborate, build, validate, and deliver software and hardware solutions that enable you to accelerate compute in the way that works best for your infrastructure and application choices.

At the upcoming International Supercomputing Conference (ISC) in Frankfurt, June 17-20, HPE and our partners will showcase some of our hardware and software solutions that reduce complexity in HPC and AI. Let’s take a closer look.

Allocate and provision GPUs dynamically with GPUaaS

Typically, AI workloads demand a lot of compute horsepower, requiring IT teams to look for new ways to boost performance and accelerate their compute. Numerous server technologies can help accelerate performance, including FPGAs, coprocessors and GPUs, but they all come with inherent complexities and challenges. For example, it can be difficult for IT teams to meet the growing demand for GPUs from multiple data science teams addressing disparate AI applications and use cases. A productivity issue also surfaces when IT administrators must manage a job scheduling queue for access to the GPUs simply because they provisioned and deployed a GPU server for a dedicated application.

To address this challenge, HPE BlueData announced a new GPU-as-a-Service (GPUaaS) offering. GPUaaS delivers on-demand and elastic provisioning for GPU-accelerated applications while sharing and allocating GPU infrastructure resources across multiple applications. HPE enables GPUaaS to increase business agility, optimize GPU utilization, and increase ROI for GPU infrastructure by combining HPE Apollo Systems and BlueData software for a cloud-like experience on premises.

The Living Heart Digital Twin enabled by Hybrid HPC

As part of data center modernization, many businesses look to Hybrid HPC to become more agile and less exposed to upfront investment—while having the ability to scale up or down as needed. In order for Hybrid HPC to integrate effectively into your organization’s HPC infrastructure, HPE provides the substantial expertise needed to ramp it up, optimize configurations, and troubleshoot for your unique set of workloads.

HPC systems from HPE provide a choice of servers that incorporate Intel HPC foundation components, ensuring a balanced, high-performance, and scalable HPC environment. Scaling from small (HPE Apollo 2000) to medium (HPE Apollo 6000) to large supercomputers (HPE SGI 8600), these systems leverage key technology components such as 2nd Generation Intel® Xeon® Scalable processors, Intel OA switches, Intel Optane SSDs, and Intel 3D NAND SSDs.

One Hybrid HPC cloud-enabled solution is the Living Heart Digital Twin. Here we are redefining the Life Sciences value chain and calling for a “cyber infrastructure” that provides all stakeholders with the on-demand compute and storage capacity needed to accelerate medical innovation and discovery. At the core of this “cyber infrastructure” sit models like the Living Heart on the 3DEXPERIENCE platform which, combined with Hybrid HPC capabilities from HPE and UberCloud, can enable compute-intensive technological breakthroughs.

Truly solving the world’s most complex problems

HPE SGI 8600 System brings petaflop speed and scalability to thousands of nodes in an efficient, dense, and easy-to-manage proven architecture that currently supports some of the most powerful supercomputers in the TOP500.

Utilizing industry-standard commodity building blocks, such as 2nd Generation Intel® Xeon® Scalable processors, Mellanox InfiniBand fabric, Intel® Omni-Path Architecture fabric, and open source Linux operating systems, lets HPE ride prevailing technology curves and keep costs down.

The HPE SGI 8600 System is a liquid-cooled, tray-based, high-density clustered computer system designed to deliver the utmost in performance, scale, and density. The basic building block of this system is the E-cell, a sealed unit that uses closed-loop cooling technology and does not exhaust heated air into the data center. A direct-attached, liquid-cooled “cold sink” provides efficient heat removal from high-power devices via an auxiliary cooling distribution unit (CDU).

By concentrating innovations where they matter most, HPE is able to deliver a standards-based high performance compute cluster in an exceptionally dense, efficient, and easily managed solution.

Supercomputing performance just got better!

While compute power is the engine that drives HPC and AI, the supporting infrastructure must form a holistic solution. Here are a few new enhancements:

Accelerated distributed computing with the new Mellanox HDR 200 Gb/s fabric offering, the fastest available interconnect for compute and storage²
Cost effective, higher fabric density with Mellanox HDR 100 Gb/s fabric
Enhanced storage density with support for the new HPE D8000 106 drive disk array in the Scalable Storage for Lustre Solution
Shorten time to revenue with higher throughput using Intel FPGA PAC D5005 acceleration platform (available in Aug 2019 on the HPE ProLiant DL380 Gen10).

In-memory HPC with next-gen HPE Superdome Flex

HPE Superdome Flex is an advanced SMP (symmetric multiprocessing) system within the HPE HPC product family that enables scientists and engineers to solve complex, data-intensive problems holistically at unparalleled scale and with single-system simplicity. From genomics, fraud detection and CAE workloads, to risk management, large-scale data visualization, and running next-generation in-memory databases, Superdome Flex alleviates delays due to jobs competing for fat nodes, enabling more jobs to be completed in less time. It also frees scientists and engineers from cluster administration and balancing workloads so they can focus on discovery.

Featuring a unique, modular, scale-up architecture and designed with memory-driven computing principles, our next-generation Superdome Flex scales from 4 to 32 2nd Generation Intel® Xeon® Scalable processors in 4-socket increments, and provides from 768 GB to 48 TB of shared memory for better performance and faster analytics. With unbounded I/O, optimum compute, storage, and network flexibility, and extreme availability, Superdome Flex equips HPC teams to go further, faster, and with optimum cost-efficiency.
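To make the scaling steps concrete, here is a minimal Python sketch that enumerates the processor counts implied by "4 to 32 in 4-socket increments". The socket range, the increment, and the 48 TB ceiling come from the paragraph above. The per-socket figure is only an illustration that assumes the 48 TB maximum applies to the full 32-socket configuration, which the article does not state. The function names are hypothetical.

```python
# Figures taken from the article text.
MIN_SOCKETS = 4
MAX_SOCKETS = 32
SOCKET_STEP = 4
MAX_SHARED_MEMORY_TB = 48


def socket_configurations() -> list[int]:
    """Return every processor count reachable in 4-socket increments."""
    return list(range(MIN_SOCKETS, MAX_SOCKETS + 1, SOCKET_STEP))


def max_memory_per_socket_tb() -> float:
    """Shared memory per socket at the top end.

    Assumption (not stated in the article): the 48 TB maximum
    corresponds to the largest, 32-socket configuration.
    """
    return MAX_SHARED_MEMORY_TB / MAX_SOCKETS


if __name__ == "__main__":
    print("Socket configurations:", socket_configurations())
    print("Max shared memory per socket (TB):", max_memory_per_socket_tb())
```

Under that assumption, the largest configuration works out to 1.5 TB of shared memory per socket.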

Unlocking the power of AI

A better way to process voice data

Natural language processing (NLP) is a field of AI that helps machines understand human language as it’s naturally spoken. Current NLP tools are hindered by relying heavily on properly articulated and formatted sentences to “understand” what is in a file. In addition, the heavy processing demands of most voice solutions drive organizations to rely on the cloud to record and store voice data, making data security critically important.

HPE is working with Intelligent Voice (IV) to transform what is often chaotic and unstructured audio data into a rich set of semantic data, allowing instant insight and intelligence. With IV solutions built on HPE and NVIDIA infrastructure, we are working together to deliver speech and NLP solutions. Examples include the HPE Edgeline EL4000 Converged Edge System with NVIDIA® T4 GPUs for the edge, and HPE Apollo 6500 and HPE ProLiant DL380 servers with NVIDIA® V100 Tensor Core GPUs in the data center.

These scalable solutions can be deployed on a single edge-based server for off-network or localized use cases or in a redundant, multi-tier environment to maximize scalability and throughput. This allows your organization to have complete control over security, privacy, and jurisdiction, while allowing your workforce, customers, and partners to interact with devices with the utmost privacy, for greater confidence and a more robust user experience.

Cutting-edge solutions for AI workloads

Successful AI depends on having solutions that can scale and adapt across workloads and as work evolves. HPE expertise, technology and partners are on the cutting edge with a comprehensive portfolio of hardware, software, and services synchronized to adapt to your needs and built on a proven global AI ecosystem, ideal for strategic planning and performance optimization.

HPE has partnered with Intel in the development of the 2nd Generation Intel® Xeon® Scalable processors, and takes advantage of innovations such as Intel® Deep Learning Boost (Intel® DL Boost) to accelerate AI applications in the HPE Apollo and HPE SGI systems. These new instructions help to boost AI performance by up to 14 times⁴ compared to the previous generation at launch. With the new Intel DL Boost, you can easily run HPC, AI, and Data Analytics workloads on the same system without needing to support separate hardware clusters for each workload.

For well-defined artificial intelligence workloads that need acceleration, the HPE Apollo 6500 Gen10 System offers eight high-performance NVIDIA V100 Tensor Core GPUs per server. It provides superior performance per dollar for HPC and AI workloads, delivering up to 62 TFLOPS of double-precision and up to 1 PFLOPS of mixed-precision AI compute. Purpose-built for accelerated computing, this platform features both PCIe and NVIDIA NVLink GPU interconnects, providing the flexibility to suit a wide variety of accelerated computing requirements.
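The aggregate figures quoted for the HPE Apollo 6500 Gen10 System can be reproduced with simple multiplication. The sketch below is a back-of-the-envelope check, not an HPE sizing tool. The per-GPU peak values are assumptions taken from NVIDIA's published specifications for the NVLink version of the V100 and do not appear in this article. Only the count of eight GPUs per server comes from the text, and the function name is hypothetical.

```python
# Assumed per-GPU theoretical peaks (NVIDIA V100, NVLink version);
# these values are not stated in the article.
V100_FP64_TFLOPS = 7.8      # double precision
V100_TENSOR_TFLOPS = 125.0  # mixed precision on Tensor Cores

GPUS_PER_SERVER = 8  # from the article


def aggregate_peak_tflops(per_gpu_tflops: float,
                          gpu_count: int = GPUS_PER_SERVER) -> float:
    """Theoretical peak for one server: per-GPU peak times GPU count."""
    return per_gpu_tflops * gpu_count


if __name__ == "__main__":
    fp64 = aggregate_peak_tflops(V100_FP64_TFLOPS)
    mixed = aggregate_peak_tflops(V100_TENSOR_TFLOPS)
    print(f"Double-precision peak: {fp64:.1f} TFLOPS")
    print(f"Mixed-precision peak: {mixed / 1000:.1f} PFLOPS")
```

These are theoretical peaks. Sustained application performance depends on the workload, the interconnect (PCIe or NVLink), and memory behavior.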

AI and HPE service options for better business outcomes

HPE Pointnext services help you to explore, experiment with, and evolve your data, artificial intelligence, and HPC projects. You can get started with a one-day strategic workshop, or you can pilot a use case such as prescriptive maintenance for asset management, image-based quality assurance, autonomous cars or natural language processing.

HPE GreenLake also delivers a pay-per-use high-performance compute experience on premises. You can design your own HPC infrastructure solutions, selecting from a broad range of HPE and partner technologies, as well as optional services that can span your infrastructure, apps, and workloads.

Delivering the best HPC Infrastructure

Comprehensive and secure management

HPE Performance Cluster Manager delivers fully integrated system management for all Linux®-based HPE high performance computing systems, both on premises and in the cloud. The software provides all the functionality you need to manage clusters of any size all day, every day, reducing the time and resources spent administering your HPC systems. The new HPE Performance Cluster Manager 1.2 enhancements we will present at ISC include Cluster Health Management and Active-Active High Availability (HA) features.

Deploy quickly and confidently

HPC is focused on accelerating the time to insights. Increasing the performance of highly parallel HPC applications is about scalability across compute nodes, not just algorithmic efficiency. HPE Apollo and HPE SGI systems based on the 2nd Gen Intel Xeon® Scalable processors deliver high performance for a range of HPC applications scaled beyond four nodes. Together HPE and Intel based systems offer outstanding performance with a unique combination of compute, compute density, memory bandwidth, balanced I/O, platform technologies, and real-world performance powering the most compelling platforms available for HPC today.

HPE helps you to get up and running faster with your preferred HPC and AI applications. We make this possible with the recently announced NVIDIA NGC-Ready HPE Apollo 6500 Gen10 system for the latest GPU-optimized AI software containers. Our NGC-Ready validated systems provide a replicable way to roll out AI and HPC applications from development to production so users can deploy with confidence and gain unprecedented performance.

The NGC-Ready program has been expanded to include our bestselling server, HPE ProLiant DL380 Gen10 server³. You can now gain maximum utility for DL, ML, HPC, and virtualization with the powerful combination of the HPE ProLiant DL380 and the acceleration of the NVIDIA T4, the world’s first GPU for accelerating mainstream enterprise service.

Innovative technologies

The future of compute

The HPE ProLiant DL385 Gen10 is a secure platform that excels with virtualization, memory-intensive and HPC applications due to its high core count and large memory footprint. Customers such as the University of Notre Dame Center for Research Computing leverage these servers to drive better HPC density and faster results. The next-generation HPE ProLiant DL385 will be the base component of the Cadet project, a prototype Gen-Z hardware collaboration platform that gives participating customers and ecosystem partners early access to a performant Gen-Z testbed for software development, demonstration and exploration.

Providing powerful alternatives to the status quo

The HPE Apollo 70 Arm-based system has hit an exciting milestone since the Catalyst UK initiative was established last April, bringing together Arm, Marvell®, Mellanox and three research universities to build out the Arm ecosystem. The HPE Apollo 70 is an air-cooled, Arm-based Marvell ThunderX2® server connected with an InfiniBand network. It is used by a variety of customers such as Sandia National Labs with Astra, the world’s largest Arm-based supercomputer, which made it onto the TOP500. HPE, Marvell and Cadence also partnered to deliver an EDA solution running on HPE Apollo 70 that significantly reduces cost and improves productivity. A future concept that will be shown at ISC is a liquid-cooled Apollo 70 tray, which will allow for the use of higher-power processors.

Joining us at ISC?

If you are headed to Frankfurt, here are key highlights of what’s going to be on display at the HPE Booth D-1130 at ISC 2019. Subject-matter experts will be on hand to help you learn more about these innovative new additions to our HPC and AI portfolio. You can also attend the following speaking sessions:

Vendor Show Down, Monday, June 17th, 13:15-13:25 in Panorama 3 with Carlos Rojas, Manager HPC/AI Portfolio Strategy Team
Exhibitor Forum, Tuesday, June 18th, 13:40-14:00 in booth N-210 with Dr. Ben Bennett, Director HPC Strategy

I am looking forward to another successful ISC and the opportunity to demonstrate our leadership in innovation as we deliver leading-edge technology.

Bill Mannel
VP & GM, HPC & AI Segment Solutions

twitter.com/Bill_Mannel
linkedin.com/in/billmannel/
hpe.com/servers

¹Hyperion Research Quick Take: HPC Server Market Beats Forecast for Full-Year 2018, April 2019

²HDR adapters, cables and switches for HPE Apollo and HPE ProLiant servers will be available in August 2019, while a new hybrid HDR switch will be available in June 2019 for the HPE SGI 8600

³HPE internal: FY18 server shipments as of 2/25/19

⁴https://www.intel.com/content/www/us/en/benchmarks/server/xeon-scalable/xeon-scalable-artificial-intelligence.html

Marty Poniatowski · ‎06-17-2019

Great article Bill with a lot of HPC and AI advancements.
