
short_bus
New Member

HPUX server receiving error. New to HPUX.

All,

I just inherited four HP-UX servers. Last night I received the following error:

Event data from monitor:

Event Time..........: Wed May 2 08:12:00 2007
Severity............: MAJORWARNING
Monitor.............: dm_ses_enclosure
Event #.............: 306
System..............: raptor.hsn.net

Summary:
Enclosure at hardware path 0/4/0/0.8.0.255.0.10.0 : Hardware failure


Description of Error:

One of the two temperature sensors on the enclosure services controller in
slot B is reporting inconsistent data.

Probable Cause / Recommended Action:

One of the two temperature sensors on this controller card is reporting a
status different enough from the other temperature sensors on the other
controller card in the enclosure to indicate the temperature sensor may be
failing. Inspect/verify the controller card.

User Defined Annotation: .

Additional Event Data:
System IP Address...: 10.16.6.167
Event Id............: 0x4638801100000000
Monitor Version.....: B.01.00
Event Class.........: I/O
Client Configuration File...........:
/var/stm/config/tools/monitor/default_dm_ses_enclosure.clcfg
Client Configuration File Version...: A.01.00
Qualification criteria met.
Number of events..: 1
Associated OS error log entry id(s):
None
Additional System Data:
System Model Number.............: 9000/800/A500-44
OS Version......................: B.11.11
System Serial Number............: USC41296PF
System Software ID..............: 505760578
EMS Version.....................: A.04.00
STM Version.....................: A.47.00
Latest information on this event:

I don't know how to view environmental information the way I can on Solaris, where I would run prtdiag -v. How can I determine whether this is a real fault?

Thank You
Drew Roberts
Valued Contributor

Re: HPUX server receiving error. New to HPUX.

It's pretty much what it says. Two temperature sensors are reporting very different temperatures. If you have a support contract with HP, give them a call. One of them likely needs to be replaced.
Alan Casey
Trusted Contributor

Re: HPUX server receiving error. New to HPUX.

Sounds like a hardware fault. I would log a hardware call with HP.
They will investigate the error with you before dispatching an engineer.
Drew Roberts
Valued Contributor

Re: HPUX server receiving error. New to HPUX.

Let me amend my statement by saying that if it only occurs once, you should be able to safely ignore it. If it occurs frequently, have it looked at.
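
One way to tell a one-off from a recurring event is to count how often the same hardware path shows up in the notifications you have collected. The following is only a minimal Python sketch, assuming the event text has been saved somewhere as plain text. The function name count_events_by_path and the REPEAT_THRESHOLD value are made up for illustration and are not part of EMS or STM.

```python
import re
from collections import Counter

# Matches the Summary line of a dm_ses_enclosure event, as in the post above.
PATH_RE = re.compile(r"^Enclosure at hardware path (\S+) : ", re.MULTILINE)

# Hypothetical cut-off for "occurs frequently"; pick whatever suits your site.
REPEAT_THRESHOLD = 3

# One event, trimmed from the notification quoted in the original post.
SAMPLE = """\
Event Time..........: Wed May 2 08:12:00 2007
Monitor.............: dm_ses_enclosure
Summary:
Enclosure at hardware path 0/4/0/0.8.0.255.0.10.0 : Hardware failure
"""


def count_events_by_path(text):
    """Count how many event summaries mention each hardware path."""
    return Counter(PATH_RE.findall(text))


if __name__ == "__main__":
    for path, count in count_events_by_path(SAMPLE).items():
        verdict = "have it looked at" if count >= REPEAT_THRESHOLD else "probably a one-off"
        print(f"{path}: {count} event(s), {verdict}")
```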
short_bus
New Member

Re: HPUX server receiving error. New to HPUX.

What command can I use to view this information? On Solaris I would use prtdiag -v.
Peter Leddy_1
Esteemed Contributor

Re: HPUX server receiving error. New to HPUX.

Hi,

You can access STM (Support Tools Manager) through several interfaces (cstm, xstm and mstm) to view your hardware and log files and to run some tests. It is a very useful tool for diagnosing hardware faults.

Have a look at this tutorial on how to use it.

http://docs.hp.com/en/diag/stm/stt_summ.htm

Regards,

Peter
MHudec
Frequent Advisor

Re: HPUX server receiving error. New to HPUX.

Check the status of the enclosure via STM (the command-line utility is cstm, as advised above).
cstm->map->sel dev->info->il

# cstm

cstm>map
- find your enclosure device (look for Disk Enclosure in the Product field and for its hardware path in the Path field)

cstm>sel dev
- select it according to its Dev Num field

cstm>info
- this will gather the information
- wait a few seconds

cstm>il
- this will show you the infolog, with information like the following:


-- Information Tool Log for Disk Enclosure on path 0/2/1/0.8.0.255.0.15.0 --

Log creation time: Thu May 3 04:38:43 2007

Hardware path: 0/2/1/0.8.0.255.0.15.0


Product ID: A6255A


Controller B
------------
FC Loop ID: 255 (decimal)
Enclosure ID: 0
Hardware Path: 0/2/1/0.8.0.255.0.15.0
Serial No: R249K1443006
WW Name (node): 50 06 0B 00 00 0C 6E 17
WW Name (port): 50 06 0B 00 00 19 21 51
Firmware Rev.: HP05

Annotation: AR_02_B


Enclosure Status
----------------

Disk Modules
------------

------------------------------------------------------------
SLOT | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |10 |11 |12 |13 |14 |
BUS ID | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |10 | 0 | 0 |13 | 0 |
STATUS |OK |OK |OK |OK |OK |OK |NIN|NIN|NIN|NIN|NIN|OK |OK |NIN|OK |
-------------------------------------------------------------

BP - The disk device is not on the FC link. The enclosure has
determined the disk is operating outside expected parameters
and bypassed the disk device. Check the connection, re-seat, or
replace the disk drive.

NIN - There is no disk device installed in this slot.


Controllers Status
----------- -------------
A: OK
B: OK (Reporting Controller)

Fan Modules Status
----------- -------------
A: OK
B: OK

Power Supplies Status
-------------- -------------
A: OK
B: OK

GBIC Status
-------------- -------------
Controller A
------------
Primary: OK
Expansion: OK

Controller B
------------
Primary: OK
Expansion: OK

Voltage Sensors Voltage Status
--------------- ------- -------------
Controller A
------------
3.3v: 3.22 OK
5.0v: 5.08 OK
12v: 12.16 OK

Controller B
------------
3.3v: 3.20 OK
5.0v: 5.08 OK
12v: 12.24 OK

Temp Sensors Temperature Status
------------ ----------- -------------
Sensor 1: 28 (Celsius) OK
Sensor 2: 32 (Celsius) OK
-- Information Tool Log for Disk Enclosure on path 0/2/1/0.8.0.255.0.15.0 --

The temperature sensor lines are the ones of interest to you.
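
If you save the infolog to a file, the sensor lines can also be pulled out and compared automatically. Below is a rough Python sketch that assumes the output format shown above. The names parse_temp_sensors and temp_spread and the MAX_SPREAD_C limit are assumptions for illustration only; the threshold the monitor itself uses is not given in the event text.

```python
import re

# Sensor lines copied from the infolog output above.
INFOLOG = """\
Temp Sensors Temperature Status
------------ ----------- -------------
Sensor 1: 28 (Celsius) OK
Sensor 2: 32 (Celsius) OK
"""

# Matches lines such as "Sensor 1: 28 (Celsius) OK".
SENSOR_LINE = re.compile(
    r"^\s*Sensor\s+(\d+):\s+(-?\d+)\s+\(Celsius\)\s+(\S+)", re.MULTILINE
)

# Hypothetical limit in degrees Celsius, not taken from any HP documentation.
MAX_SPREAD_C = 10


def parse_temp_sensors(text):
    """Return a list of (sensor_number, temperature_c, status) tuples."""
    return [(int(n), int(t), s) for n, t, s in SENSOR_LINE.findall(text)]


def temp_spread(sensors):
    """Return the difference between the hottest and coolest reading."""
    temps = [t for _, t, _ in sensors]
    return max(temps) - min(temps) if temps else 0


if __name__ == "__main__":
    sensors = parse_temp_sensors(INFOLOG)
    for number, temp, status in sensors:
        print(f"Sensor {number}: {temp} C, status {status}")
    spread = temp_spread(sensors)
    note = "worth a closer look" if spread > MAX_SPREAD_C else "within the assumed limit"
    print(f"Spread between sensors: {spread} C ({note})")
```

Run it against the infolog of the enclosure named in your event (0/4/0/0.8.0.255.0.10.0), not the example path shown here.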