

 
Samshen
Frequent Advisor

Need to determine the failed FAN from the logs

Hi,
I have a failed FAN in my server. From the logs I see that:

Monitor.............: dm_chassis
Event #.............: 1296

Probable Cause / Recommended Action:

Cause:
IO Cooling Fan Failed

Action:
Replace IO Fan Module as soon as possible following the IO Fan Module Remove and Replace Procedures.

..............................

Alert Level : 0x4 (Unexpected configuration change detected.)
Source FRU : 0x6 (platform entity)
Source FRU Detail : 0x4 (card cage fan)
Source ID : 0x5 (platform dependent)
Event Detail : 0x4 (fan failure)
Caller Activity : 0x4 (monitor)
Caller Subactivity : 0x5 (fan)
Activity Status : 0xf (implementation dependent)
Reporting Entity Type : 0x2 (power monitor)
Reporting Entity ID : 0x0 Cabinet 0x0 Cell 0x0 CPU 0x0
Data Type : 0x4 (physical location)
Message ID : 0x3
Domain Name : Partition 0
FRU Source = 0x6 (platform entity)
Source Detail = 0x4 (card cage fan)
Cabinet Location = 0x0
Slot Number = 0x2

I see that the FANs are of two types: System FANs (rear/front) and PCI FANs (inside the server).
Please help me to determine which fan has failed and the slot number of the failed fan.

Thank you.
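For readers who want to pull the fields out of an event like the one above programmatically, here is a minimal Python sketch. It assumes the detail lines follow the "Name : 0xN (description)" or "Name = 0xN" shape seen in this post; the function name parse_event_fields and the sample string are illustrative only and are not part of EMS.

```python
import re

# Matches lines such as "Source FRU Detail : 0x4 (card cage fan)" or
# "Slot Number = 0x2". The parenthesised description is optional.
FIELD_RE = re.compile(
    r"^\s*(?P<name>[^:=]+?)\s*[:=]\s*(?P<value>0x[0-9a-fA-F]+)"
    r"\s*(?:\((?P<desc>[^)]*)\))?"
)

def parse_event_fields(text):
    """Return {field name: (integer value, description or None)}."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            value = int(match.group("value"), 16)
            fields[match.group("name")] = (value, match.group("desc"))
    return fields

# A few lines copied from the event in this post.
SAMPLE_EVENT = """\
Source FRU Detail : 0x4 (card cage fan)
Event Detail : 0x4 (fan failure)
Domain Name : Partition 0
Cabinet Location = 0x0
Slot Number = 0x2
"""

fields = parse_event_fields(SAMPLE_EVENT)
print(fields["Source FRU Detail"])
print(fields["Slot Number"])
```

Lines without a hexadecimal value, such as "Domain Name : Partition 0", are simply skipped.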
9 REPLIES
Luk Vandenbussche
Honored Contributor

Re: Need to determine the failed FAN from the logs

What is the hardware model of your server?
Mridul Shrivastava
Honored Contributor

Re: Need to determine the failed FAN from the logs

FRU Source = 0x6 (platform entity)
Source Detail = 0x4 (card cage fan)
Cabinet Location = 0x0
Slot Number = 0x2

Card cage fans are the ones inside; the rear/front ones are called I/O fans. Since it says "card cage fan" here, it is one of the fans inside. Normally there would be 6.
Time has a wonderful way of weeding out the trivial
Torsten.
Acclaimed Contributor

Re: Need to determine the failed FAN from the logs

Not possible to suggest the position without knowing the server model.

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!   
Samshen
Frequent Advisor

Re: Need to determine the failed FAN from the logs

Hi,
My server model is RP7410.
Hi Mridul,
I saw that the source detail says "card cage fan", but the main description at the top is for event 1296:

Summary:
IO Fan Failed.
Description of Error:
Chassis Code Keyword IOFAN_FAIL.
An IO Chassis cooling fan has failed. Depending on the number of fans still operating, the cabinet may or may not shut down. View Error Log entries to determine if the cabinet is operating. If many log entries call out entities powering off during the same time frame as this IOFAN_FAIL, the cabinet has probably shutdown. Carefully review the log for the first few events within the same time frame for the root cause of the problem.
The Guardian Service Processor command, PS, will show a detailed power status for a cabinet. The +48V LED on the Front Panel Board not lit, power is not enabled to the cabinet, indicating the cabinet IO Chassis fans have probably gone from N to N - 1 status requiring an immediate cabinet shutdown.
Probable Cause / Recommended Action:

Cause:
IO Cooling Fan Failed

Action:
Replace IO Fan Module as soon as possible following the IO Fan
Module Remove and Replace Procedures.

........................................

So here is what I want to know:
The logged event is #1296, which is related to I/O fans.
At the bottom it says:
Source Detail = 0x4 (card cage fan)

Which fan has failed, and in which location?

Thank you.
Ludovic Derlyn
Esteemed Contributor

Re: Need to determine the failed FAN from the logs

Hi,

Do you have a management card?
If yes, go to the GSP, select the Command Menu (CM), and run PS:

[mp0015608a9b01] MP:CM> ps


PS
System Power state: On
Temperature : Normal


Power supplies          State
-----------------------------------------------------------
Power Supply 1          Normal
Power Supply 2          Normal


Fans                    State
-----------------------------------------------------------
Fan1A (CPU)             Normal
Fan1B (CPU)             Normal
Fan2 (Memory)           Normal
Fan3 (I/O)              Normal
CPU0 Fan                Normal
CPU1 Fan                Normal

regards

L-DERLYN
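
If the PS output has been captured to text, a small script can list every entry whose state is not "Normal". This is only a sketch in Python: it assumes each table row ends with a single-word state after a dashed separator line, as in the output above. The function name abnormal_entries is made up for illustration, and the "Failed" state in the sample is hypothetical, since the thread does not show what a failed fan looks like in PS.

```python
def abnormal_entries(ps_output):
    """Return (name, state) pairs for table rows whose state is not 'Normal'.

    Assumption: rows follow a line of dashes, a blank line ends the table,
    and the state is the last whitespace-separated word on the row.
    """
    results = []
    in_table = False
    for line in ps_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("---"):
            in_table = True          # rows start after the separator
            continue
        if not stripped:
            in_table = False         # a blank line closes the table
            continue
        if in_table:
            parts = stripped.rsplit(None, 1)
            if len(parts) == 2 and parts[1] != "Normal":
                results.append((parts[0], parts[1]))
    return results

# Sample based on the output above; "Failed" is a hypothetical state string.
SAMPLE_PS = """\
Power supplies State
-----------------------------------------------------------
Power Supply 1 Normal
Power Supply 2 Normal

Fans State
-----------------------------------------------------------
Fan1A (CPU) Normal
Fan2 (Memory) Normal
Fan3 (I/O) Failed
CPU0 Fan Normal
"""

for name, state in abnormal_entries(SAMPLE_PS):
    print(name, "->", state)
```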
Torsten.
Acclaimed Contributor

Re: Need to determine the failed FAN from the logs

Run the "PS" command from the MP to check the status.

Slot 2 in the PCI area is the right-side rear position (viewed from the front). Check the LED too.

Hope this helps!
Regards
Torsten.

Mridul Shrivastava
Honored Contributor

Re: Need to determine the failed FAN from the logs

We need to look at the detail section and then the FRU source detail; that is the standard EMS notification. So I would still stand by my words.

To confirm this, you can check the PS output from the GSP.
tkc
Esteemed Contributor

Re: Need to determine the failed FAN from the logs

I/O fans 0, 1, and 2 are at the rear, while I/O fans 3, 4, and 5 are at the front:

rear
|=====|=====|=====|
|  0  |  1  |  2  |
|=====|=====|=====|
|  3  |  4  |  5  |
|=====|=====|=====|
front
tkc
Esteemed Contributor

Re: Need to determine the failed FAN from the logs

rear of rp7410
|=======|=======|=======|
| fan 0 | fan 1 | fan 2 |
|=======|=======|=======|
| fan 3 | fan 4 | fan 5 |
|=======|=======|=======|
front of rp7410
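
The layout above can be turned into a small lookup that takes the "Slot Number" line from the event and returns the row and column of the fan. This Python sketch assumes the numbering in the diagram (fans 0 to 2 in the rear row, fans 3 to 5 in the front row, counted left to right as drawn); the function names are illustrative, and the mapping should be checked against the rp7410 service documentation before pulling a fan.

```python
FAN_ROWS = ("rear", "front")   # row order as drawn in the diagram
FANS_PER_ROW = 3

def slot_from_event_line(line):
    """Extract the slot number from a line such as 'Slot Number = 0x2'."""
    _, _, value = line.partition("=")
    return int(value.strip(), 16)

def fan_position(slot):
    """Return (row name, column index as drawn) for an I/O fan slot."""
    if not 0 <= slot < len(FAN_ROWS) * FANS_PER_ROW:
        raise ValueError(f"slot {slot} is outside the 0-5 range in the diagram")
    row, column = divmod(slot, FANS_PER_ROW)
    return FAN_ROWS[row], column

slot = slot_from_event_line("Slot Number = 0x2")
print(slot, fan_position(slot))   # prints: 2 ('rear', 2)
```

For the event in this thread, slot 0x2 lands in the rear row, third position as drawn, which agrees with the "right side, rear position" answer given earlier.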