HPE 9000 and HPE e3000 Servers

server showing fault error

 
joseph51
Regular Advisor

Hi,

I have a PA-RISC rp4440 server. The customer is getting errors like this:

>------------ Event Monitoring Service Event Notification ------------<

Notification Time: Sat Jul 18 14:03:48 2009

hestau02 sent Event Monitor notification information:

/system/events/ia64_corehw/core_hw is >= 3.
Its current value is CRITICAL(5).



Event data from monitor:

Event Time..........: Sat Jul 18 14:03:48 2009
Severity............: CRITICAL
Monitor.............: ia64_corehw
Event #.............: 104011
System..............: hestau02.hess.pri.com

Summary:
Power Unit : Redundancy lost or not present.


Description of Error:

The number of Power supplies has gone from N+1 (redundant) to N
(non-redundant) if a Power supply was removed, or the number of Power
supplies is < N+1.

Probable Cause / Recommended Action:

The minimum number of power supplies required to power the unit is
currently installed and operating. There are no redundant I/O power
supplies available in case of failure. If redundancy is desired another
Power supply should be added.

For information on the sensor that generated this event, refer to FRU ID
in Event Details section.

Additional Event Data:
System IP Address...: 172.30.33.21
Event Id............: 0x4a620e8400000000
Monitor Version.....: B.01.00
Event Class.........: System
Client Configuration File...........:
/var/stm/config/tools/monitor/default_ia64_corehw.clcfg
Client Configuration File Version...: A.01.00
Qualification criteria met.
Number of events..: 1
Associated OS error log entry id(s):
None
Additional System Data:
System Model Number.............: 9000/800/rp4440
EMS Version.....................: A.04.20
STM Version.....................: A.63.00
System Serial Number............: USE4726MXP
Latest information on this event:
http://docs.hp.com/hpux/content/hardware/ems/ia64_corehw.htm#104011

v-v-v-v-v-v-v-v-v-v-v-v-v D E T A I L S v-v-v-v-v-v-v-v-v-v-v-v-v


Event Details :

Event Date .............: Sun Apr 26 10:42:06 2009
Sensor Number ..........: 0xcf
Sensor Type ............: Power Unit
Sensor Class ...........: Sensor specific
Sensor Reading/Offset...: 0x02 (Sensor Reading)
Event Type.............: Not Applicable
Entity ID ..............: 21
Generic Message.........:
Power Unit : Power cycle
Entity FRU Id Info......:
power management / power distribution board (Sensor ID: Power Converter)



>---------- End Event Monitoring Service Event Notification ----------<

I have upgraded STM to A.63.00, but I am still getting the errors.


The power status is correct:


System Power state: On
Temperature : Status is not available

Power supplies State
-----------------------------------------------------------
Power Supply 0 Normal
Power Supply 1 Normal


Fans State
-----------------------------------------------------------
Fan 0 (System) Normal
Fan 1 (System) Normal
Fan 2 (Pwr) Normal


[hestau02] MP:CM>


The firmware versions are also OK:

[hestau02] MP:CM> sysrev


SYSREV

Current firmware revisions

MP FW : E.03.30
BMC FW : 04.05
System FW : 46.34



Can anyone please tell me why the server is showing fault messages?

4 REPLIES
Michal Kapalka (mikap)
Honored Contributor

Re: server showing fault error

hi,

This is a normal error message after a power outage. If the power status from the MP shows no errors, the system should be OK. Check the event log in the MP for any record of this incident.

mikap
Ganesan R
Honored Contributor

Re: server showing fault error

Hi,

Are you getting this event regularly? If you got it only once, it could be a power disturbance or a loose power cable.

If you are getting this EMS event regularly and all the power supplies are OK, it could be a sensor issue. Involve HP to analyze it.
Best wishes,

Ganesh.
Prashanth.D.S
Honored Contributor

Re: server showing fault error

Hi Binu,

This is a false alarm...

Event Time..........: Sat Jul 18 14:03:48 2009 <=== timestamp...
Severity............: CRITICAL
Monitor.............: ia64_corehw
Event #.............: 104011
System..............: hestau02.hess.pri.com


-v-v-v-v-v-v-v-v-v-v-v-v D E T A I L S v-v-v-v-v-v-v-v-v-v-v-v-v


Event Details :

Event Date .............: Sun Apr 26 10:42:06 2009 <======
Sensor Number ..........: 0xcf
Sensor Type ............: Power Unit
Sensor Class ...........: Sensor specific
Sensor Reading/Offset...: 0x02 (Sensor Reading)

You are seeing old events triggered by EMS. I am not sure why they are being reported, but my suggestion is to clear the MP logs (SL), both SEL and FPL, and monitor for further alerts.

Best Regards,
Prashanth
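
The timestamp comparison pointed out above (the notification's "Event Time" versus the "Event Date" in the details section) can be automated when there are many notifications to check. The following is only a rough sketch: the function names and the one-day threshold are made up for illustration, and it assumes the notification text uses the same labels and date layout as the mail quoted in this thread.

```python
import re
from datetime import datetime, timedelta

# Date layout seen in the notification, e.g. "Sat Jul 18 14:03:48 2009".
TIME_FORMAT = "%a %b %d %H:%M:%S %Y"
TIMESTAMP = r"[A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}"


def find_timestamp(label, text):
    """Return the first timestamp that follows `label` and its dotted leader."""
    pattern = re.escape(label) + r"\s*\.+\s*:\s*(" + TIMESTAMP + ")"
    match = re.search(pattern, text)
    if match is None:
        raise ValueError("no '%s' timestamp found" % label)
    return datetime.strptime(match.group(1), TIME_FORMAT)


def event_lag(notification):
    """Time between the logged sensor event and the EMS notification."""
    notified = find_timestamp("Event Time", notification)
    logged = find_timestamp("Event Date", notification)
    return notified - logged


def is_stale(notification, threshold=timedelta(days=1)):
    """True if the sensor event is older than `threshold` (arbitrary default)."""
    return event_lag(notification) > threshold


# Two lines copied from the notification in the original post.
SAMPLE = """\
Event Time..........: Sat Jul 18 14:03:48 2009
Event Date .............: Sun Apr 26 10:42:06 2009
"""

if __name__ == "__main__":
    print("lag:", event_lag(SAMPLE), "stale:", is_stale(SAMPLE))
```

A large lag like the one in this thread suggests EMS is re-reporting an old log entry rather than a new power event, which is why clearing the logs and watching for fresh alerts is a reasonable next step.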
cnb
Honored Contributor

Re: server showing fault error

Although these may be old messages, the output still indicates a problem that needs attention:

"System Power state: On
Temperature : Status is not available"


The BMC and iLO are out of date. I suggest you update to 46.34C to resolve several critical issues:

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&prodTypeId=15351&prodSeriesId=2512015&swItem=pf-57813-1&prodNameId=401769&swEnvOID=54&swLang=13&taskId=135&mode=4&idx=0


hth,