HPE 9000 and HPE e3000 Servers

Re: L3000 randomly powering down

 
DS_5
Occasional Contributor

L3000 randomly powering down

Hello,

My L3000 (aka rp5470) powers down every couple of days, for no reason that I can tell.

I reinstalled it 10 days ago with HP-UX 11.11 (June 2004 release), but that didn't change anything.

The PDC version is 33.xx.
I am about to apply PHSS_30632 (PDC 44.12).

According to the system alerts I get on the console (see below), I first thought of an overheat problem.
But now, it is very well cooled and ventilated.

Do you have any input?

Thanks a lot,
Amedee


Here are the logs:

************* SYSTEM ALERT **************
SYSTEM NAME: uninitialized
DATE: 10/04/2004 TIME: 23:53:13
ALERT LEVEL: 14 = Fatal power or environmental problem prevents operation

REASON FOR ALERT
SOURCE: 6 = platform
SOURCE DETAIL: 1 = cabinet SOURCE ID: FF
PROBLEM DETAIL: 1 = inlet overtemp

LEDs:  RUN  ATTENTION  FAULT  REMOTE  POWER
       OFF  OFF        OFF    OFF     FLASH
System Power is Off.

0x002000E161FF406F 00000000 00000000 - type 0 = Data Field Unused
0x582008E161FF406F 00006809 0417350D - type 11 = Timestamp 10/04/2004 23:53:13
A: ack read of this entry - X: Disable all future alert messages
Anything else skip redisplay the log entry
->Choice:Timeout!
*****************************************

************* SYSTEM ALERT **************
SYSTEM NAME: uninitialized
DATE: 10/04/2004 TIME: 23:53:07
ALERT LEVEL: 13 = System hang detected via timer popping

REASON FOR ALERT
SOURCE: 1 = processor
SOURCE DETAIL: 1 = processor general SOURCE ID: 0
PROBLEM DETAIL: 4 = timeout

LEDs:  RUN  ATTENTION  FAULT  REMOTE  POWER
       OFF  OFF        OFF    OFF     FLASH
System Power is Off.

0x78E000D41100F000 00000003 00000000 - type 15 = Activity Level/Timeout
0x58E008D41100F000 00006809 04173507 - type 11 = Timestamp 10/04/2004 23:53:07
A: ack read of this entry - X: Disable all future alert messages
Anything else skip redisplay the log entry
->Choice:Timeout!
*****************************************
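
The two "type 11 = Timestamp" entries above can be decoded by hand, which helps when comparing raw log words against the DATE/TIME lines. The sketch below assumes a byte layout inferred only from the two entries in this post (year offset from 1900, zero-based month, then day, hour, minute, second); it is not taken from HP documentation, and the function name `decode_timestamp` is made up for illustration.

```python
from datetime import datetime


def decode_timestamp(word_hi: str, word_lo: str) -> datetime:
    """Decode the two 32-bit data words of a 'type 11 = Timestamp' entry.

    ASSUMPTION: layout inferred from the log entries in this thread only:
      word_hi = 00 00 YY MM   (YY = years since 1900, MM = zero-based month)
      word_lo = DD hh mm ss
    """
    hi = int(word_hi, 16)
    lo = int(word_lo, 16)
    year = 1900 + ((hi >> 8) & 0xFF)   # 0x68 = 104 -> 2004
    month = (hi & 0xFF) + 1            # 0x09 = 9   -> October
    day = (lo >> 24) & 0xFF
    hour = (lo >> 16) & 0xFF
    minute = (lo >> 8) & 0xFF
    second = lo & 0xFF
    return datetime(year, month, day, hour, minute, second)


# Data words copied from the two alerts above.
print(decode_timestamp("00006809", "0417350D"))  # inlet overtemp alert
print(decode_timestamp("00006809", "04173507"))  # timer-pop (hang) alert
```

Under that assumption both entries decode to 4 October 2004, six seconds apart, matching the DATE/TIME fields shown in the alerts: the hang is logged first (23:53:07) and the inlet overtemp power-off follows (23:53:13).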
5 REPLIES
Bharat Katkar
Honored Contributor

Re: L3000 randomly powering down

Hi,
Could you check whether the fans are rotating and maintaining proper airflow inside? Also check that the heatsinks are seated properly on the processors.
Check the processor installation thoroughly.
Hope that helps.
Regards,
You need to know a lot to actually know how little you know
Ravi_8
Honored Contributor

Re: L3000 randomly powering down

Hi

Get into the GSP menu (press Ctrl+B):
gsp> ps

Look for the fan status.
The log indicates an overtemperature condition.
never give up
HGN
Honored Contributor

Re: L3000 randomly powering down

Hi

It looks like the system is running hot; maybe a fan has an issue and is not providing the required ventilation. It is worth checking the fan status in the GSP menu, as Ravi mentioned.

Rgds

Gopi
Mauro Gatti
Valued Contributor

Re: L3000 randomly powering down

It looks like an overtemp problem.
If you want to be notified when there is a critical overtemp or fan-failure condition, edit the /etc/envd.conf file and add your own command on the line after the relevant CRIT entry.

For example:

... (omitted)
OVERTEMP_CRIT:y
/usr/bin/mailx -s "$(hostname) Overtemp CRITICAL WARNING!" you@yourdomain </dev/null >/dev/null 2>&1

OVERTEMP_EMERG:y
/usr/sbin/reboot -qh

FANFAIL_CRIT:y
/usr/bin/mailx -s "$(hostname) Fanfail CRITICAL WARNING!" you@yourdomain </dev/null >/dev/null 2>&1

FANFAIL_EMERG:y
/usr/sbin/reboot -qh
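
To double-check which events are enabled after editing the file, something like the following could be used. It is only a sketch: it assumes the layout shown in the example above (an `EVENT:y` or `EVENT:n` line followed by one action line), it assumes lines starting with `#` are comments, and the function name `parse_envd_conf` and the sample text are made up for illustration.

```python
def parse_envd_conf(text: str) -> dict:
    """Map each event name to (enabled, action) from envd.conf-style text.

    ASSUMPTIONS: an event line looks like NAME:y or NAME:n, the next
    non-empty line is its action, and '#' starts a comment line.
    """
    events = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        head, sep, flag = line.partition(":")
        if sep and head.isupper() and flag in ("y", "n"):
            # Event line, e.g. OVERTEMP_CRIT:y
            current = head
            events[current] = (flag == "y", "")
        elif current is not None and not events[current][1]:
            # First non-empty line after an event line is its action.
            events[current] = (events[current][0], line)
    return events


# Hypothetical sample in the same shape as the fragment above.
SAMPLE = """\
OVERTEMP_CRIT:y
/usr/bin/mailx -s "overtemp" you@yourdomain </dev/null
OVERTEMP_EMERG:y
/usr/sbin/reboot -qh
FANFAIL_CRIT:n
"""

for name, (enabled, action) in parse_envd_conf(SAMPLE).items():
    print(name, "enabled" if enabled else "disabled", action or "(no action)")
```

On a real system the text would come from reading /etc/envd.conf; an event reported as disabled or with no action means no notification would be sent for it.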
Ubi maior, minor cessat!
Ted Buis
Honored Contributor

Re: L3000 randomly powering down

I suppose a bad sensor is also possible. The L3000 can have two or three power supplies, but it needs two to run. Are two of them plugged into separate sources so that there is enough power?
Mom 6