Servers: The Right Compute

The unique modular architecture of HPE Superdome Flex: How it works and why it matters

ServerExperts

Is your infrastructure straining to handle demands to process ever-growing data sets? Learn how the unique modular architecture of HPE Superdome Flex delivers extreme performance, high bandwidth and consistent low latency, even at the largest configurations.

Last December, HPE announced the world’s most scalable and modular in-memory computing platform, HPE Superdome Flex, a compute breakthrough to power critical applications, enable real-time analytics and tackle data-intensive high performance computing (HPC) workloads.

In a series of three blogs, I’ll be taking an in-depth look at the HPE Superdome Flex capabilities that make it unique in the industry and explain how they can add value to your business. To get started, I’m focusing here on the platform’s modular, scalable architecture.

Scaling beyond the capabilities of Intel

Like most x86 server vendors, HPE uses the latest Intel® Xeon® Scalable processor (codename Skylake) in its latest-generation servers, including HPE Superdome Flex. Intel’s reference design for these processors uses the new Ultra Path Interconnect (UPI), which limits scaling to 8 sockets. Most vendors using these processors base their server designs on this “glueless” interconnect method. HPE Superdome Flex instead uses a unique modular architecture that scales beyond that limit, from 4 to 32 sockets in a single system.

We did this because we recognized the market need for platforms able to scale beyond Intel’s 8-socket limit, especially today when data sets are growing at an unprecedented pace. In addition, because Intel focuses UPI on 2- and 4-socket servers, 8-socket “glueless” servers become bandwidth constrained. Our design delivers high bandwidth even when you grow the system to the largest configurations.

Price/performance advantages over other systems 

The HPE Superdome Flex modular architecture is based on a 4-socket chassis and scales to 8 chassis, for a total of 32 sockets in a single system. You have many processor options to choose from, ranging from the cost-efficient Gold to the high-end Platinum “flavors” of the Xeon Scalable processor family.

This choice of Gold and Platinum processors delivers great price/performance advantages over smaller systems. For example, in a typical 6TB memory configuration, Superdome Flex can deliver a lower-cost, higher-performance solution than competitive 4-socket offerings. Why? Because of their design, other 4-socket systems are forced to use 128GB DIMMs, which are a lot more expensive than the 64GB DIMMs an 8-socket Superdome Flex can use. An 8-socket/6TB Superdome Flex delivers double the compute power, double the memory bandwidth and double the I/O capability, and it is still more cost effective than a 4-socket/6TB competitive product.
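The DIMM arithmetic behind this comparison can be sketched in a few lines of Python. This is only an illustration: it assumes 12 DIMM slots per socket (consistent with the 48 slots per 4-socket chassis quoted later in this post), that competing systems have the same slot count per socket, and that every slot holds the same DIMM size. The function name is made up for the example.

```python
GB_PER_TB = 1024
DIMM_SIZES_GB = (32, 64, 128)  # DIMM sizes mentioned in this post


def smallest_dimm_gb(target_tb, sockets, slots_per_socket=12):
    """Return the smallest listed DIMM size that reaches target_tb, or None.

    slots_per_socket=12 is an assumption (48 slots / 4 sockets per chassis).
    """
    slots = sockets * slots_per_socket
    for size in DIMM_SIZES_GB:
        if slots * size >= target_tb * GB_PER_TB:
            return size
    return None


four_socket = smallest_dimm_gb(6, sockets=4)
eight_socket = smallest_dimm_gb(6, sockets=8)
print(f"4 sockets need {four_socket} GB DIMMs for 6 TB")
print(f"8 sockets need {eight_socket} GB DIMMs for 6 TB")
```

With twice as many slots available, the 8-socket configuration reaches the same 6TB with DIMMs of half the size, which is where the cost difference comes from.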

Similarly, for a competitive 8-socket/6TB configuration, Superdome Flex can deliver a lower-cost, higher-performance 8-socket solution. How? While others are forced to use more expensive Platinum processors because of their design, an 8-socket Superdome Flex can use lower-cost Gold processors to give you the same memory capacity.

In fact, of the platforms based on Intel Xeon Scalable processors, Superdome Flex is the only one able to deliver 8 sockets using the cost-effective Gold variant (Intel’s “glueless” design supports 8 sockets only through the more expensive Platinum type). We also offer a variety of core count choices, so you can map the number of cores per processor to your workload requirements, with options ranging from as few as 4 to as many as 28 cores per processor.

Scaling up: why it matters

The ability to scale as a single system, or scale up, delivers several advantages for those vital workloads and databases HPE Superdome Flex is best suited for. These include traditional and in-memory databases, real-time analytics, ERP, CRM and other OLTP workloads. For these types of workloads, a scale-up environment is simpler and cheaper to manage than a scale-out cluster, and it also reduces latency, increasing performance.

Check out this blog post on the transaction speed when scaling up or out with SAP S/4HANA to understand why scaling up is a much better alternative than scaling out/clustering for these types of workloads. It’s all about speed and the ability to perform at the level required for these critical applications.

Consistent high performance, even at the largest configurations

The extreme scale of Superdome Flex is achieved via the unique HPE Superdome Flex ASIC chipset, which connects the individual 4-socket chassis to one another in a point-to-point fashion, as shown in Figures 1 and 2. The HPE Superdome Flex ASIC technology enables adaptive routing, which load-balances the fabric and optimizes latency and bandwidth, increasing performance and system availability. The ASIC connects the chassis together in a cache-coherent fabric and maintains coherency by tracking cache line state and ownership across all the processor sockets in a directory cache built into the ASIC itself. This coherency scheme is a critical factor in the ability of HPE Superdome Flex to scale near linearly from 4 sockets all the way up to 32 sockets. Typical glueless designs already see limited performance when scaling from 4 to 8 sockets, because of broadcast snooping.

Figure 1. HPE Superdome Flex ASICs point-to-point connections

Figure 2. HPE Superdome Flex 4-socket chassis
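To see why a directory helps at larger socket counts, here is a toy Python sketch contrasting the two coherency approaches mentioned above. It is a generic textbook-style model, not the actual Flex ASIC protocol (which this post does not describe); the class, function names and message counting are assumptions made for illustration.

```python
class Directory:
    """Toy directory: maps a cache line address to the sockets holding a copy."""

    def __init__(self):
        self.sharers = {}  # address -> set of socket ids

    def read(self, socket, addr):
        """Record that a socket now holds a copy of the line."""
        self.sharers.setdefault(addr, set()).add(socket)

    def write(self, socket, addr):
        """Invalidate the other holders; return how many messages were sent."""
        others = self.sharers.get(addr, set()) - {socket}
        self.sharers[addr] = {socket}
        return len(others)


def snoop_write_messages(total_sockets):
    """Broadcast snooping probes every other socket on a write."""
    return total_sockets - 1


directory = Directory()
directory.read(socket=3, addr=0x1000)
directory.read(socket=17, addr=0x1000)

directory_msgs = directory.write(socket=3, addr=0x1000)
snoop_msgs = snoop_write_messages(32)
print(f"directory: {directory_msgs} message(s), broadcast: {snoop_msgs} messages")
```

In this simplified model the directory only contacts sockets that actually hold the line, so coherency traffic follows the amount of sharing and not the size of the system.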

Shared memory

In a similar fashion to compute, memory capacity can grow as more chassis are added to the system. With support for 48 DDR4 DIMM slots per chassis, accommodating either 32 GB RDIMMs, 64 GB LRDIMMs, or even 128 GB 3DS LRDIMMs, the maximum per-chassis memory capacity is 6 TB. This gives a fully scaled 32-socket HPE Superdome Flex a whopping total memory capacity of 48 TB of shared memory to support the most demanding in-memory applications.
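The capacity figures above follow directly from the slot count. The short Python sketch below works them out, assuming every slot in every chassis is populated with the same DIMM size and that any chassis count from 1 to 8 is a valid configuration (both are simplifying assumptions for the example).

```python
SLOTS_PER_CHASSIS = 48  # DDR4 DIMM slots per chassis, from this post
SOCKETS_PER_CHASSIS = 4
GB_PER_TB = 1024


def capacity_tb(chassis, dimm_gb):
    """Total memory in TB with all slots filled with dimm_gb DIMMs."""
    return chassis * SLOTS_PER_CHASSIS * dimm_gb / GB_PER_TB


per_chassis_max = capacity_tb(1, 128)
full_system_max = capacity_tb(8, 128)

for chassis in range(1, 9):
    sockets = chassis * SOCKETS_PER_CHASSIS
    row = ", ".join(f"{capacity_tb(chassis, gb):g} TB" for gb in (32, 64, 128))
    print(f"{sockets:2d} sockets: {row}")
```

Each output row lists the capacity with 32 GB, 64 GB and 128 GB DIMMs, so you can read off which chassis count and DIMM size reach a given memory target.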

Extreme I/O flexibility

As for I/O, each HPE Superdome Flex chassis can be equipped with either a 16-slot or 12-slot I/O bulkhead to provide numerous stand-up PCIe 3.0 card options, giving you plenty of flexibility to support a wide variety of workloads. With either I/O bulkhead selection, the I/O design provides direct connections between the processors and the card slots, with no need for bus repeaters or retimers that can add latency or reduce bandwidth. This gives you the best per-card performance possible.

Ultra-low latency

Low latency is a key factor driving the high performance of Superdome Flex. Data resides in either local memory (directly connected to the processor) or remote memory (across chassis), and copies of that data can exist in various processor caches throughout the system. Cache coherency keeps the cached copies consistent when an operation changes the data. The round-trip latency between a processor and local memory is about 100ns. The latency of a processor accessing data from memory connected to another processor over UPI is about 130ns.

A processor accessing data that resides in memory in another chassis goes through two Flex ASICs (always a single “hop”) for a round-trip latency of under 400ns, regardless of whether a processor at the top of the rack is accessing data from memory at the bottom. As for bandwidth, Superdome Flex provides more than 210 GB/s of bisection crossbar bandwidth at 8 sockets, more than 425 GB/s at 16 sockets and over 850 GB/s at 32 sockets. That’s plenty to power the most demanding workloads. In another post, I will expand on the performance topic and share some recent Superdome Flex benchmark results.
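As a rough way to reason about these numbers, the Python sketch below computes a weighted average memory latency for a given access mix, plus the bisection bandwidth per socket at each size. The latency and bandwidth constants come from this post (treated as approximate bounds); the access mix is purely hypothetical and real workloads will differ.

```python
LOCAL_NS = 100   # processor to its own memory
UPI_NS = 130     # memory behind another processor, reached over UPI
REMOTE_NS = 400  # memory in another chassis (quoted as "under 400ns")

# Bisection bandwidth floor in GB/s by socket count (quoted as "more than").
BISECTION_GBPS = {8: 210, 16: 425, 32: 850}


def average_latency_ns(local, upi, remote):
    """Weighted average latency for access fractions that sum to 1."""
    if abs(local + upi + remote - 1.0) > 1e-9:
        raise ValueError("access fractions must sum to 1")
    return local * LOCAL_NS + upi * UPI_NS + remote * REMOTE_NS


# Hypothetical access mix: 80% local, 15% over UPI, 5% cross-chassis.
numa_aware = average_latency_ns(0.80, 0.15, 0.05)

# Bisection bandwidth per socket at each configuration size.
per_socket = {n: bw / n for n, bw in BISECTION_GBPS.items()}

print(f"average latency: {numa_aware:.1f} ns")
for sockets, gbps in per_socket.items():
    print(f"{sockets} sockets: {gbps:.2f} GB/s per socket")
```

The per-socket bandwidth comes out at roughly 26 GB/s for all three sizes, which is one way to read the claim that bandwidth stays consistent as the system grows.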

Why does this extreme modular scalability matter?

It’s no secret data is growing at an unprecedented pace–which means infrastructure strains to handle increasingly demanding requests to process and analyze critical, ever-growing data sets. But growth rates can be unpredictable.

To support the business, IT teams need systems that respond effectively and promptly to their requests, regardless of the amount of data or how fast it grows. Having a platform that keeps pace with the demands of your business will give you peace of mind—so you’ll know that you won’t run out of room to grow, but neither will you need to overprovision.

When you deploy memory-intensive workloads, you might ask: What will my next TB of memory capacity cost? With Superdome Flex, you can scale memory capacity without a forklift upgrade, as you’re not limited to the DIMM slots in a single chassis. Also, as the number of users increases, mission-critical applications require a high-performing environment regardless of size.

Today’s in-memory databases demand low-latency, high-bandwidth systems. Thanks to its innovative architecture, HPE Superdome Flex delivers extreme performance, high bandwidth and consistent low latency, even at the largest configurations. What’s more, you can get all this for your critical workloads and databases at better price/performance than on smaller systems.

One more thing: HPE Superdome Flex has been recently certified to run VMware and Oracle Linux workloads, in addition to the standard RHEL and SUSE Linux distributions. Oracle VM and Windows certifications are expected later this year.

In the second blog in this series, I cover some of the advanced and unique reliability, availability and serviceability (RAS) features of HPE Superdome Flex resulting in five nines (99.999%) single-system availability.

You might also want to check out the HPE Superdome Flex Architecture and RAS technical whitepaper or watch this short video for architecture highlights.


Meet Servers: The Right Compute Blogger Diana Cortes, Marketing Manager, Mission Critical x86 Solutions, HPE.

Diana has spent the past 20 years working with the technologies that power the world’s most demanding environments and is interested in how solutions based on those technologies impact the business. A native of Colombia, Diana holds an MBA from Georgetown University and has held a variety of regional and global roles with HPE in the US, the UK and Sweden.

About the Author

ServerExperts

Our team of Hewlett Packard Enterprise server experts helps you to dive deep into relevant infrastructure topics.

Comments
cwchan

Hello,

Former SGI customer here. The Flex ASIC is essentially NUMAlink 8? You give latency figures for local, same node, and remote node memory access, but what is NUMAlink 8's (point to point, not aggregate) bandwidth?

Ahmad A Hassan

I have some questions regarding a cluster solution using the Superdome Flex platform as the infrastructure.

What is the recommended cluster solution that has been tested and certified by HPE (Red Hat cluster or HPE Serviceguard cluster)? And what type of cluster is recommended, i.e. P.N cluster or V.cluster?

Is there any bundled cluster solution available from HPE over Red Hat?

If we are using an old HPE RISC server running HP-UX 11.33 with a Serviceguard cluster for SAP, is there any recommended solution for migrating to Superdome Flex from an OS and cluster perspective?

Is there any HPE white paper or best practice covering this subject?

Is there currently any training available from HPE for Superdome Flex? If yes, what is the training code number?

 

ServerExperts

Hi Ahmad, 

HPE Superdome Flex works with the HPE Serviceguard for Linux high availability and disaster recovery clustering solution. HPE Superdome Flex is certified with RHEL, SUSE, Oracle Linux®, VMware®, and Windows. For detailed information on the HPE Certified and Supported HPE servers for OS and Virtualization Software and the latest list of software drivers available for your environment, please see the Support Matrix at hpe.com/info/ossupport. For migration questions, contact your HPE representative. For documentation on Superdome Flex, including whitepapers, please visit:  https://www.hpe.com/us/en/product-catalog/servers/mission-critical-x86-servers/pip.hpe-superdome-flex-server.1010323140.html and click on “Documents”: https://h20195.www2.hpe.com/v2/default.aspx?cc=us&lc=en&oid=1010323140 . For details on HPE Serviceguard for Linux please visit: https://www.hpe.com/us/en/product-catalog/detail/pip.376220.html.

Thanks,

Diana Cortes

ServerExperts

Hi @cwchan,

Thank you for your question. 

The Flex ASIC is based on NUMAlink technology that continues to be enhanced by HPE. The point-to-point bandwidth is 100Gb/s per cable, in each direction.

With the topologies supported, we often use multiple links point to point, and with the adaptive routing support we have on the fabric, the hardware can automatically spread user data across multiple links to drive a higher aggregate user data rate.

Thanks,

Diana Cortes
