Servers & Systems: The Right Compute
ComputeExperts

HPE Superdome Flex family earns highest availability rating from IDC

Learn why the selection of a modern, mission critical server platform is critical to your business's success — and why analyst firm IDC gave HPE Superdome Flex servers the top AL4 rating.


Downtime is unacceptable for mission critical workloads. Jobs like enterprise resource planning, inventory management, and customer-facing applications — and the databases that power them — form the backbone of many businesses.

Downtime is also extremely costly. A recent IDC analysis of digital-first strategies found that downtime events cost the smallest 20.7% of businesses an average of $5,000 to $10,000 USD per hour. For the largest 1.4% of enterprises, the figure rose to $500,000 USD per hour.[i]
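To put those hourly figures in context, here is a minimal arithmetic sketch. The hourly costs are the IDC figures quoted above; the 99.9% availability target and the four-hour outage are illustrative assumptions, not numbers from the IDC paper.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def annual_downtime_hours(availability_pct: float) -> float:
    """Hours per year a system is down at a given availability percentage."""
    return HOURS_PER_YEAR * (100.0 - availability_pct) / 100.0


def downtime_cost(hours: float, cost_per_hour: float) -> float:
    """Direct cost of downtime for a given duration and hourly cost."""
    return hours * cost_per_hour


# Assumed example: a single four-hour outage at the largest-enterprise rate.
outage_cost = downtime_cost(4, 500_000)
print(f"${outage_cost:,.0f}")  # $2,000,000

# Assumed example: yearly exposure at 99.9% availability and $10,000 per hour.
yearly_hours = annual_downtime_hours(99.9)
yearly_cost = downtime_cost(yearly_hours, 10_000)
print(f"{yearly_hours:.2f} hours, ${yearly_cost:,.0f}")
```

Even a small change in availability moves the yearly figure noticeably, which is why the hourly cost matters when comparing platforms.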

Businesses try to protect against failures by running workloads on “mission critical” systems, designed for the highest levels of availability. High-availability servers make up 21-30% of servers at 60% of businesses, across all industries.[i]

But application downtime still happens, and servers are often the problem: 15.5% of downtime events result from server failure.[i] This might be one reason some businesses are reluctant to modernize legacy servers if those servers run without fault. However, modernization is the vital first step in digital transformation journeys that enable faster transaction and analytics processes, as well as new business models and services. Those who fail to modernize are likely to fall behind competitors.

That's why your selection of a modern, mission critical server platform is critical to business continuity and to your bottom line.

The HPE Superdome Flex family was recently named an AL4 level system by IDC, a global provider of advisory services for the IT market. AL4 is the highest Availability Level rating the organization has defined. In this blog we discuss the fault management strategy and features that enable Superdome Flex's high availability, to aid your assessment of modern mission critical platforms.

Being strategic about server faults

Faults are inevitable in IT. Resilient platforms have a strategy for handling and correcting faults, not merely trying to prevent them.

Good infrastructure fault management strategies aim to establish what went wrong, then try to prevent the fault from impacting parts of the IT stack that could cause downtime (such as the OS, database, application and data).

HPE Superdome Flex design is underpinned by a comprehensive reliability, availability and serviceability (RAS) strategy:

1. Establish what went wrong by detecting and logging errors.

2. Analyze the problem, including how to prevent the fault from reaching higher levels of the IT stack such as the operating system, database, application and data.

3. Repair faults and errors to minimize or eliminate unplanned and planned downtime.
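The three steps above can be pictured as a small pipeline. The sketch below is purely illustrative: the component labels, severity classes and action names are hypothetical, and it does not represent how HPE firmware actually implements its RAS strategy.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    CORRECTABLE = 1
    UNCORRECTABLE = 2


@dataclass(frozen=True)
class ErrorEvent:
    component: str       # hypothetical label, e.g. "dimm3"
    severity: Severity


def detect(events, log):
    """Step 1: log every reported error so evidence is kept for analysis."""
    log.extend(events)
    return log


def analyze(log):
    """Step 2: find the worst severity seen on each component."""
    worst = {}
    for event in log:
        current = worst.get(event.component, event.severity)
        worst[event.component] = max(current, event.severity)
    return worst


def repair(worst):
    """Step 3: pick an action per component (placeholder action names)."""
    actions = {}
    for component, severity in worst.items():
        if severity == Severity.UNCORRECTABLE:
            actions[component] = "isolate"
        else:
            actions[component] = "monitor"
    return actions


events = [
    ErrorEvent("dimm3", Severity.CORRECTABLE),
    ErrorEvent("link7", Severity.UNCORRECTABLE),
    ErrorEvent("dimm3", Severity.CORRECTABLE),
]
plan = repair(analyze(detect(events, [])))
print(plan)  # {'dimm3': 'monitor', 'link7': 'isolate'}
```

Keeping detection, analysis and repair as separate stages means the error log survives even when the repair action itself changes.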

HPE Superdome Flex server AL4 rating

IDC awards four different Availability Level ratings to server systems. AL1 platforms are designed without specific availability features; AL2 platforms provide virtualization and workload balancing solutions to achieve availability; and in AL3 platforms, clustering software facilitates failover to another node in a cluster.

The AL4 rating received by HPE Superdome Flex servers is reserved for fault-tolerant platforms that guarantee continuous processing under any circumstances, with an extensive set of hardware RAS features and redundancy throughout the system.

In its paper, IDC highlights a number of Superdome Flex RAS features that contribute to this maximum rating.

Error detection with unique RAS capabilities across every subsystem. Subsystem RAS features are implemented at the lowest possible level to ensure evidence is collected to detect errors, determine root causes and find correlations between errors. Memory RAS technologies enhance memory reliability and reduce memory outage rates.

Platform RAS provides adaptive routing, with the ability to route traffic around failing or failed links in the system fabric. And HPE Superdome Flex servers implement the full RAS functionality provided by Intel® Xeon® Scalable processors, including innovative error detection and retry mechanisms.
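As a rough illustration of routing around failed links, the sketch below runs a breadth-first search over a toy four-node fabric and skips any link marked as failed. The topology, node names and the `route` function are assumptions made for this example; the actual Superdome Flex fabric routing is not described in this article.

```python
from collections import deque


def route(links, failed, src, dst):
    """Return the fewest-hop path from src to dst that avoids failed links,
    or None if no such path exists. Links are undirected (a, b) pairs."""
    down = {frozenset(link) for link in failed}
    adjacency = {}
    for a, b in links:
        if frozenset((a, b)) in down:
            continue  # skip links that are failing or failed
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in adjacency.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical fabric: two independent two-hop paths between A and D.
fabric = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D")]
print(route(fabric, [], "A", "D"))            # ['A', 'B', 'D']
print(route(fabric, [("B", "D")], "A", "D"))  # ['A', 'C', 'D']
```

With the B-D link marked as failed, traffic between A and D still gets through by way of C, which is the basic idea behind adaptive routing.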

Firmware First stops errors reaching the OS and applications. Errors in memory, the CPU, or I/O channels are contained at the firmware level. Firmware can collect error data and diagnose faults, even when system processors aren't fully functional, for correctable and uncorrectable errors. This enables predictive fault analysis for system memory, CPU, I/O, and interconnect.

The Analysis Engine handles and corrects errors. The Analysis Engine constantly analyzes all hardware for faults. It can predict failures, initiate automatic recovery actions, and notify system administrators and management software about problems. Because it can initiate self-repair without operator assistance, the Analysis Engine reduces human error and increases availability.
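One common, generic way to predict failures is to watch how often correctable errors occur and flag a component once the rate crosses a threshold. The sketch below shows that idea with a sliding time window. The class name, threshold and window length are assumptions for illustration; the article does not describe the algorithms the Analysis Engine actually uses.

```python
from collections import deque


class CorrectableErrorMonitor:
    """Flags a component when more than max_errors correctable errors
    arrive within window_seconds of each other."""

    def __init__(self, max_errors: int, window_seconds: float):
        self.max_errors = max_errors
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp: float) -> bool:
        """Record one correctable error; return True if the component
        should be flagged for proactive action."""
        self._timestamps.append(timestamp)
        # Drop errors that have aged out of the window.
        while timestamp - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_errors


# Assumed policy: more than 3 correctable errors within 60 seconds.
monitor = CorrectableErrorMonitor(max_errors=3, window_seconds=60)
flags = [monitor.record(t) for t in (0, 10, 20, 30)]
print(flags)  # [False, False, False, True]
```

Errors that are spread far apart never trip the threshold, so an occasional correctable error does not trigger a repair action.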

Reduce the risk and cost of downtime

Availability is one of the most important factors in choosing a mission critical server platform. Because downtime events cost thousands of dollars per hour, and server failures are among the most common causes of downtime, many businesses choose to run their core workloads on high availability, on-premises platforms.

But not all mission critical servers are equally dependable. With unique RAS capabilities across every part of the platform, HPE Superdome Flex has been independently classified as delivering the highest level of availability.

To learn more about the HPE Superdome Flex family, its RAS capabilities, and IDC's analysis, click the following links:



Meet Diana Cortes, HPE Marketing Manager, Data Solutions

Diana has spent the past 24 years working with the technologies that power the world's most demanding IT environments and is interested in how those technologies impact business. A native of Colombia, Diana has held a variety of regional and global roles with HPE in the U.S., U.K., and Sweden. She is based in Stockholm, Sweden. She holds an MBA from Georgetown University. Connect with Diana on LinkedIn

 

Compute Experts
Hewlett Packard Enterprise

twitter.com/hpe_compute
linkedin.com/showcase/hpe-servers-and-systems/
hpe.com/servers

[i] IDC (2023, January). Mission-Critical Platforms Deliver Continuity in the Shift to "Digital First" Strategies.

About the Author

ComputeExperts

Our team of Hewlett Packard Enterprise server experts helps you to dive deep into relevant infrastructure topics.
