HPE 9000 and HPE e3000 Servers
Superdome - Single point of failure?

Hi Guys, I'm reading up on the Superdome at the minute and would like to confirm a few things. As I understand it, there are two possible single points of failure:
1. The REO Cables
2. The backplane

I need to confirm the above, and I have two related questions:
1. If one REO cable fails, will that stop the partition from working or could it bring the whole superdome down?
2. Is the backplane split for redundancy? From what I can see, there is a single backplane that caters for all 8 cells. If the backplane fails, the whole superdome will fail? Do I have this correct?


Re: Superdome - Single point of failure?

The three single points of failure in Superdome are:

The system Backplane
The HUCB board
The UGUY board

If any one of the above fails, the Superdome will go down.

Re: Superdome - Single point of failure?


1. REO (or RIO) cables run from the cell IO controller to a card cage. If one of these cables is damaged (it can't really 'fail', as it's not an active component), then you will lose all the IO to that card cage. That *might* take out your partition (nPar), but shouldn't harm any other nPars in the Superdome.

2. Assuming we are talking about an sx2000 Superdome then yes, there are actually 3 separate backplanes, and the cell controllers on each cell board are capable of link-level retry in the event of a backplane failure. This has removed the backplane as a single point of failure.

IT2007 has identified the other SPoFs for you (UGUY and HUCB)
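
To make the failure domains discussed so far a bit more concrete, here is a rough Python sketch. The component names come from the posts above; the function `impact_of_failure`, the `Chassis` enum, the result strings and the nPar name are all hypothetical, made up purely for illustration. It only encodes the claims made in this thread and is not an HP tool or API.

```python
from enum import Enum


class Chassis(Enum):
    SX1000 = "sx1000"
    SX2000 = "sx2000"


# Components this thread names as single points of failure for the whole complex.
COMPLEX_WIDE_SPOFS = {"HUCB", "UGUY"}


def impact_of_failure(component, chassis, npar=None):
    """Return the blast radius of a failure, as described in this thread.

    Result strings are illustrative only:
      "complex"      -> the whole Superdome goes down
      "degraded"     -> redundancy absorbs the failure
      "npar:<name>"  -> only the named nPar is at risk
    """
    if component in COMPLEX_WIDE_SPOFS:
        return "complex"
    if component == "backplane":
        # Per the thread: sx2000 has 3 separate backplanes plus link-level
        # retry, so the backplane is no longer a SPoF there.
        return "degraded" if chassis is Chassis.SX2000 else "complex"
    if component == "REO cable":
        # Losing the cable loses all IO to one card cage, which might take
        # out the nPar that owns it but should not harm other nPars.
        return f"npar:{npar}"
    raise ValueError(f"unknown component: {component!r}")


print(impact_of_failure("UGUY", Chassis.SX2000))
print(impact_of_failure("backplane", Chassis.SX2000))
print(impact_of_failure("REO cable", Chassis.SX2000, npar="nPar0"))
```

The REO cable case returns an nPar-level result because the cable only carries IO for one card cage; whether the partition actually survives depends on what else it has for IO.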



Bharath PSB

Re: Superdome - Single point of failure?

If there is a problem with the REO cable while the dome is up, it will surely bring the Superdome down to BCH, followed by a reboot. The REO cable connects the cell controller to the IO expansion box. If you are using an sx1000 (Pinnacles) Superdome, then you will not be able to reach BCH until you replace the REO cable. In sx2000, each cell has a built-in console, so you will be able to reach BCH even without the cell being connected to IO.
In sx1000, the backplane doesn't have redundancy logic built in. So, if one link from a cell fails, there is no way you can talk to that cell. In sx2000, you have backplane redundancy logic. Hence even if one link fails, there are 3 more links connected to that cell. :-)
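
The link redundancy point can be sketched in a few lines of Python. This is only a toy model of the statement above: the table `TOLERATED_LINK_FAILURES` and the function `cell_reachable` are hypothetical names, and the numbers are my reading of the post (no spare links on sx1000, three spare links per cell on sx2000), not figures taken from HP documentation.

```python
# Number of failed backplane links a cell can lose and still be reachable,
# as read from the post above (an assumption, not an official figure).
TOLERATED_LINK_FAILURES = {
    "sx1000": 0,  # no redundancy logic: one failed link isolates the cell
    "sx2000": 3,  # "3 more links" remain after the first failure
}


def cell_reachable(chassis, failed_links):
    """Return True if the cell can still be reached over the backplane."""
    if failed_links < 0:
        raise ValueError("failed_links must not be negative")
    return failed_links <= TOLERATED_LINK_FAILURES[chassis]


for chassis in ("sx1000", "sx2000"):
    for failed in (0, 1):
        print(chassis, failed, cell_reachable(chassis, failed))
```

With one failed link, the sx1000 cell becomes unreachable while the sx2000 cell stays reachable, which is the difference the post describes.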