Operating System - Tru64 Unix

DS20E Power supply failure

SOLVED
Geert Van Pamel
Regular Advisor

DS20E Power supply failure

I have a DS20E which halted with a "Power Supply failed" alert.

It has been running for 10 years 24/7 without any issue.


evmget -f "[age < 1d]" |evmsort |evmshow -t "@timestamp @@" |pg

10-Jun-2010 05:14:18 vmunix: Power event reported, 2 power supplies present,
Power Supply 1 failed, Power Supply 2 failed, system will power down under hardware control


evmget -f "[age < 1d] & [priority = 500] & [name sys.unix.syslog.kern]" |evmsort |evmshow -d |pg

Event Data Items:
Event Name : sys.unix.syslog.kern
Priority : 500
PID : 578
PPID : 1
Event Id : 10318
Timestamp : 10-Jun-2010 05:14:18
Host IP address : 192.168.14.9
Host Name : hbsitdtk
User Name : root
Format : vmunix: Power event reported, 2 power supplies
present, Power Supply 1 failed, Power Supply 2 failed,
system will power down under hardware control
Reference : cat:evmexp.cat:200
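For anyone who wants to pull the details out of this message in a script, here is a minimal Python sketch. The message text is copied from the event above; the function name `parse_power_event` and the regular expressions are my own illustration, and I am assuming other power events use the same wording, which I cannot confirm.

```python
import re

# Message text copied from the event above (the wrapped lines are joined).
MESSAGE = (
    "vmunix: Power event reported, 2 power supplies present, "
    "Power Supply 1 failed, Power Supply 2 failed, "
    "system will power down under hardware control"
)


def parse_power_event(message):
    """Return (supplies_present, failed_supply_numbers) for a power event.

    Hypothetical helper: the patterns assume the wording seen in this one
    event and may not match other firmware or OS versions.
    """
    # evmshow -d wraps long fields, so collapse all whitespace first.
    text = " ".join(message.split())
    present = re.search(r"(\d+) power supplies present", text)
    failed = [int(n) for n in re.findall(r"Power Supply (\d+) failed", text)]
    return (int(present.group(1)) if present else None, failed)


present, failed = parse_power_event(MESSAGE)
print(f"{present} present, failed: {failed}")
```

A message that does not mention power supplies yields `(None, [])`, so the helper can be run over unrelated kernel messages without raising.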


I have replaced the 2 power supplies.

And the system booted again normally.

The power supply red LEDs were never ON.


I believe I have 4 questions:

1/ I suppose the red LEDs should have been on if the power supplies were really physically defective?

2/ How can I know that the supposedly bad power supplies that I removed are really defective?

3/ I could put them into another spare server and issue the "show power" SRM console command?

4/ Or are there other (firmware) tests?
4 REPLIES
Mark...
Honored Contributor
Solution

Re: DS20E Power supply failure

Hi,
10+ years is good, don't you think!
So you have a DS20E with 2 PSUs, which probably means that one of the PSUs failed a while ago, but because the system was N+1 you did not notice until the second PSU failed and took the system down.
To answer your questions:
1: I have never seen the PSU LEDs lit on my DS20E. Normally with AlphaServers, if the PSU had an LED it was green and on as long as the PSU was OK. I believe these ones can come on if the PSU has failed, but as I said, mine have never failed!
2: Well, the box failed, so something was wrong. Since you say it worked again after you replaced both of them, what more proof do you require? If you want, the DS20E has room for 3 PSUs, so you could put a suspect one in the 3rd slot and then check it from the console with the show power command, or from the RMC prompt with the env command.
3: See the answer to 2.
4: Yes. When you power the system on, it checks the PSUs. You stated that the system did not power up, so it was not able to test them.

Now that the system is working, I would suggest that you periodically check the status of your PSUs with show power or env, as I mentioned earlier. If one PSU fails again and you do not notice, you will when the second one fails, as you have found out!
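Since show power needs the SRM console, a periodic check on the running system could instead scan the EVM log with the pipeline from the first post. The Python sketch below is only an illustration. It assumes the kernel logs a similar "Power Supply N failed" message when a single supply fails (the thread does not confirm this) and that a Python 3.7+ interpreter is available on the host or wherever the log is collected. The function names are hypothetical.

```python
import re
import subprocess

# Pipeline taken from the original post; it only works on a Tru64 host with EVM.
EVM_PIPELINE = 'evmget -f "[age < 1d]" | evmsort | evmshow -t "@timestamp @@"'


def run_pipeline(command=EVM_PIPELINE):
    """Run the EVM pipeline through the shell and return its output lines."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout.splitlines()


def failed_supply_events(lines):
    """Return (line, failed_numbers) for every line reporting a failed supply."""
    hits = []
    for line in lines:
        failed = [int(n) for n in re.findall(r"Power Supply (\d+) failed", line)]
        if failed:
            hits.append((line.strip(), failed))
    return hits


# Demonstration on the event line quoted in the first post.
sample = [
    "10-Jun-2010 05:14:18 vmunix: Power event reported, 2 power supplies present, "
    "Power Supply 1 failed, Power Supply 2 failed, "
    "system will power down under hardware control",
]
for line, failed in failed_supply_events(sample):
    print(f"WARNING: supplies {failed} reported failed")
```

On the Tru64 host itself, `failed_supply_events(run_pipeline())` could be called from a daily cron job, with the warning mailed to the administrator instead of printed.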
Mark...
if you have nothing useful to say, say nothing...
Geert Van Pamel
Regular Advisor

Re: DS20E Power supply failure

Thanks, Mark, interesting answers.

However I do not have the "env" command on RCM.

RCM>env

*** ERROR - unknown command ***

Do I perhaps have an older version of the firmware?

So I cannot see the PSU status from RCM?

RCM>status
Firmware Rev: V2.0
Escape Sequence: ^]^]RCM
Remote Access: DISABLE
Alerts: DISABLE
Alert Pending: NO
Temp (C): 37.0
RCM Power Control: ON
RCM Halt: Deasserted
External Power: ON
Server Power: ON

RCM>help
alert_clr
halt
haltin
haltout
help or ?
poweroff
poweron
quit
reset
setesc
status

RCM>quit

Focus returned to COM port
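If the RCM status output is captured over the serial line, it is easy to turn into something a script can check, for example to log the temperature. This is a minimal Python sketch based only on the output shown above. The function name `parse_rcm_status` is my own, and I am assuming other firmware revisions print the same "Name: value" layout.

```python
# Status text copied from the RCM session above.
RCM_STATUS = """\
Firmware Rev: V2.0
Escape Sequence: ^]^]RCM
Remote Access: DISABLE
Alerts: DISABLE
Alert Pending: NO
Temp (C): 37.0
RCM Power Control: ON
RCM Halt: Deasserted
External Power: ON
Server Power: ON
"""


def parse_rcm_status(text):
    """Split each 'Name: value' line into a dictionary entry.

    Hypothetical helper: lines without a colon are skipped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


status = parse_rcm_status(RCM_STATUS)
temperature = float(status["Temp (C)"])
print(f"Firmware {status['Firmware Rev']}, {temperature} C, server power {status['Server Power']}")
```

Note that this status output has no per-PSU field, so it cannot answer the PSU question. It only covers temperature and overall power state.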

I tried the "old" PSUs in another DS20E, but there I get 6 beeps because it has no memory installed, so the SRM console cannot start...

I could of course shut down the working system, switch to the SRM console, put each of the suspect PSUs in the 3rd slot in turn, and issue the "show power" command...
Geert Van Pamel
Regular Advisor

Re: DS20E Power supply failure

Oh, yes, and anyway the old suspect PSUs powered the other DS20E (without memory) for at least 15 minutes... Could a bad electrical contact in the connectors have caused the original alert?
Mark...
Honored Contributor

Re: DS20E Power supply failure

Hi Geert,
Sorry - touch of brain fade...
"env" is available on the DS25 (ES40/45), not on the DS20E.
You can still run show power from the console, which should report ok or fail for a good or bad PSU.
Hmm, if the server had been off for a while, it would then warm up over 10 to 15 minutes, so it could be a temperature-related issue or, as you say, a poor contact.
Mark...
if you have nothing useful to say, say nothing...