HP 9000

Superdome SD3200 BPS failure not showing in parstatus

mvpel
Trusted Contributor

We had a BPS fail rather dramatically last week, with a loud pop that startled everyone in the lab despite the dull roar of the air handlers.

We're scheduled for a swap-out of the dead unit shortly, but the failure has exposed a rather odd problem: the "parstatus" command within HP-UX does not show the power supply as failed, even though it shows only five units, rather than six, in the OK column.

Likewise, the "CM PS" command shows the dead unit as simply unpopulated, as opposed to marked as failed with an asterisk in the second row of the report.

It also appears that there was no EMS alert when the BPS failed, and nothing turned up in the syslog.

There was a chassis error log recorded at the time of the failure but it was about AC mains being deleted, not a BPS failure, and there was no circuit breaker tripped. And again, no EMS alert from the chassis monitor. There has been no reboot since the failure occurred.

Could the problem be just that the BPS failed explosively, and so went straight from operational status to "unconfigured" status without any indication to the chassis or the EMS that it had failed?
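For anyone who wants to catch this kind of silent gap without relying on the failed flag, here is a minimal sketch in Python. It only does the arithmetic described above: the expected number of BPS units minus those reported OK and those marked failed. The function name is hypothetical, and the counts are assumed to be entered by hand or pulled from the report by some parser that is not shown here.

```python
def unaccounted_units(expected, ok, failed):
    """Return how many units are neither reported OK nor marked as failed."""
    if min(expected, ok, failed) < 0:
        raise ValueError("counts must be non-negative")
    # Anything left over after subtracting OK and failed units has gone missing.
    return max(expected - ok - failed, 0)


# Figures from this chassis: six BPS units expected, five OK, none marked failed.
missing = unaccounted_units(expected=6, ok=5, failed=0)
if missing:
    print(f"{missing} BPS unit(s) neither OK nor marked failed")
```

A non-zero result would be a reason to raise an alert even when no failure has been flagged, which is exactly the case here.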

We're running PDC 36.7, PDHC 7.10, GSP 7.30, CLU 7.8, PM 7.16, and CIO 7.4. OnlineDiag is at B.11.11.15.13, and HP-UX is 11.11 with the June 2003 BUNDLE11i.

Thanks for any suggestions you can offer.
5 REPLIES
Michael Steele_2
Honored Contributor

Re: Superdome SD3200 BPS failure not showing in parstatus

Hi

This is obviously a firmware problem and you have been negligent in updating. All five of my Superdomes report whenever a FRU fails, especially a power supply.

Your current state is that you are sending out no alarm messages (firmware) and have nothing listening (EMS / diags).

You also need to update your Online Diags.

Note that Superdome firmware upgrades, except for HBAs, are all CE installed. If I remember correctly they can be done without bringing down anything, but verify this with the CE.

The Online Diags install is the same, no reboot required.

Easy peasy.
Support Fatherhood - Stop Family Law
Michael Steele_2
Honored Contributor

Re: Superdome SD3200 BPS failure not showing in parstatus

One additional note: if you were to have a major outage in your current state, HP would require you to upgrade your firmware and diags before starting their root cause analysis, which they would argue would fail without current versions. So the outage would be prolonged.

By upgrading voluntarily, you have the opportunity to upgrade your dev and test machines before your production boxes, which is a requirement in most data centers.

By contrast, an outage in production followed by mandatory upgrades gives you no opportunity to test how your applications would react. And I guarantee you, all users will push back in such a situation and make your life uncomfortable.
mvpel
Trusted Contributor

Re: Superdome SD3200 BPS failure not showing in parstatus

The firmware updates have to be installed in boot-is-blocked mode.

The antiquity of the versions is tied to the level of caution exercised by the developers with respect to regression testing for the production systems outside the lab. There's been a strong resistance over the years to making any changes whatsoever unless 100% proven absolutely essential, but thankfully they're coming around to the notion of patching the OS on a more regular basis. We have a couple of systems running at 36.8 and a late OnlineDiag, but not all of them, though that should be remedied over the next several weeks.

Thanks for the info!!
Michael Steele_2
Honored Contributor

Re: Superdome SD3200 BPS failure not showing in parstatus

My five Superdomes are split two for dev / test and three for prod. I had to do the upgrades in two months instead of three because dev and test were on the same box in two cases.
mvpel
Trusted Contributor

Re: Superdome SD3200 BPS failure not showing in parstatus

.