Integrity Servers

Re: rx2660 halted by itself

 
Robert_Jewell
Honored Contributor

Re: rx2660 halted by itself

The following error means that a voltage rail in the server has gone out of acceptable range.

 

0x20511262EF020B20 FFFF030760020300  VOLTAGE_DEGRADES_TO_NON_RECOVERABLE

This does not necessarily indicate a power supply fault, as there are a number of voltage regulators within the server.

 

The IPMI Event Viewer decodes the error as follows:

 

Keyword
VOLTAGE_DEGRADES_TO_NON_RECOVERABLE
 
Summary
Voltage degraded to non-recoverable level - Check all boards with this voltage.
 
Description
Voltage degraded to non-recoverable level from less severe - Check all boards with this voltage
 
Generator
Baseboard Management Controller

Sensor Type
Voltage

Sensor Number
96

Cause
The voltage in the server has gone outside the factory set range. A bad component, blown fuse, poorly seated module, loose cable, or debris could be responsible for this failure.
 
Action
When this condition was detected the system should have been immediately shutdown to avoid damage. Contact your HP support representative as soon as possible to have the unit checked. Check all boards, power supplies, and modules that either supply or use this voltage rail.


 

The voltage sensor number is 96 (0x60). Whatever this sensor monitors is where the problem lies.
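
For anyone who wants to pull the sensor number out of the raw event themselves, here is a minimal Python sketch. It assumes the sensor number is the fifth byte of the Data Field, which is only inferred from the event quoted above (FFFF030760020300 contains 0x60), not taken from HP documentation. The function name is made up for illustration.

```python
def sensor_number_from_data_field(data_field: str) -> int:
    """Return the sensor number from a 16-hex-digit SEL Data Field.

    Assumption: the sensor number is the fifth byte (index 4). This is
    inferred from the event quoted in this post, not from HP documentation.
    """
    raw = bytes.fromhex(data_field)
    if len(raw) != 8:
        raise ValueError("expected exactly 16 hex digits")
    return raw[4]


sensor = sensor_number_from_data_field("FFFF030760020300")
print(sensor, hex(sensor))  # prints: 96 0x60
```

The decimal value 96 matches the "Sensor Number" line in the Event Viewer output, and 0x60 is the same number in hex.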

 

If you have HP support, they can confirm this, but I would think this has to do with the system board or processor voltages, since there isn't much else in the rx2660!

 

-Bob

Holger G
HPE Pro

Re: rx2660 halted by itself

Hi Sakthi,

 

I agree with Bob: the power issue is on sensor 0x60.

This sensor monitors a voltage on the Front Side Bus (processor bus). The voltage is generated on the system board and consumed by both CPU modules.

So the system board is the most likely suspect, but one of the CPUs could also cause this.

If the system is still under warranty or covered by an HP support contract, you can contact HP support to have the system board replaced.

You could also remove one CPU (if you have two installed) to see whether that alone resolves the issue.

 

Regarding your firmware question:

It is normal for only one firmware bank to be updated during a firmware update, so the inactive firmware bank may show an older version. That is OK.

 

 

Kind Regards

Holger

I am an HP employee.

- I work for Hewlett Packard Enterprise -
PrashanthShetty
Occasional Advisor

Re: rx2660 halted by itself

Hi,

 

This could very well be a CPU issue.

 

Check the status of the CPU with the ss command at the MP:CM> prompt.

 

Best Regards,

Prashanth

 

HP Employee..

unix_shiv
New Member

Re: rx2660 halted by itself

Hi Holger,

 

Our rx2660 has a single processor. Can I move it to the second socket and still boot the machine?

 

Holger G
HPE Pro

Re: rx2660 halted by itself

Hi Sakthi,

 

No, the system will not boot with a CPU only in socket 1. There has to be a CPU in socket 0.

So, if you have only one CPU installed in your rx2660, you cannot test the CPU by moving it.

 

 

Kind Regards

Holger

I am an HP employee.

- I work for Hewlett Packard Enterprise -
DougK
Visitor

Re: rx2660 halted by itself

I am running into the same error message on an HP rx2660.

 

Here is the output of the MP System Event Logs:

 

#  Location|Alert| Encoded Field    |  Data Field    |   Keyword / Timestamp
-------------------------------------------------------------------------------
0     BMC      2  0x20526EDDCA020010 FFFF027000120300 SOFT_RESET
                                                      28 Oct 2013 21:57:30
1     BMC      2  0x20526EDDCA020020 FFFF006F04140300 POWER_BUTTON_PRESSED
                                                      28 Oct 2013 21:57:30
2     BMC      2  0x20526EDDCA020030 0401A37004120300 CHASSIS_CONTROL_REQUEST
                                                      28 Oct 2013 21:57:30
3     BMC      2  0x20526EDDD1020040 FFFF027000120300 SOFT_RESET
                                                      28 Oct 2013 21:57:37
4     BMC      2  0x20526EDDD1020050 FFFF010943090300 POWER_UNIT_ENABLED
                                                      28 Oct 2013 21:57:37
5     BMC     *5  0x20526EDDD2020060 FFFF03076F020300 VOLTAGE_DEGRADES_TO_NON_RECOVERABLE
                                                      28 Oct 2013 21:57:38
6     BMC     *7  0x20526EDDD2020070 6F00A8706F120300 SHUTDOWN_OR_RESET_ON_SENSOR
                                                      28 Oct 2013 21:57:38
7     BMC      2  0x20526EDDD2020080 6F00A3706F120300 CHASSIS_CONTROL_REQUEST
                                                      28 Oct 2013 21:57:38
8     BMC      2  0x20526EDDD3020090 FFFF000943090300 POWER_UNIT_DISABLED
                                                      28 Oct 2013 21:57:39
9     BMC      2  0x20000000040200A0 2450A17000120300 BMC_INITIALIZING
                                                                  00:00:04
10    BMC      2  0x20000000050200B0 1F81A37000120300 CHASSIS_CONTROL_REQUEST
                                                                  00:00:05
11    BMC      2  0x20000000060200C0 FFFF006FFA220300 ACPI_ON
                                                                  00:00:06
12    MP   0   2  0x5E800A7A00E000D0 0000000000000000 MP_SELFTEST_RESULT
                                                                  00:00:07
13    BMC      2  0x20000000080200F0 FFFF027000120300 SOFT_RESET
                                                                  00:00:08
14    BMC     *5  0x2000000009020100 FFFF03076F020300 VOLTAGE_DEGRADES_TO_NON_RECOVERABLE
                                                                  00:00:09
15    BMC     *7  0x2000000009020110 6F00A8706F120300 SHUTDOWN_OR_RESET_ON_SENSOR
                                                                  00:00:09
16    BMC      2  0x2000000009020120 6F00A3706F120300 CHASSIS_CONTROL_REQUEST
                                                                  00:00:09
17    BMC      2  0x20526FBAD6020130 FFFF0103FCC00300 TIME_SET
                                                      29 Oct 2013 13:40:38
18    BMC      2  0x20526FBAD6020140 FFFF000943090300 POWER_UNIT_DISABLED
                                                      29 Oct 2013 13:40:38
19    BMC      2  0x20526FBAFB020150 FFFF027000120300 SOFT_RESET
                                                      29 Oct 2013 13:41:15
20    BMC      2  0x20526FBAFB020160 FFFF006F04140300 POWER_BUTTON_PRESSED
                                                      29 Oct 2013 13:41:15
21    BMC      2  0x20526FBAFB020170 0401A37004120300 CHASSIS_CONTROL_REQUEST
                                                      29 Oct 2013 13:41:15
22    BMC      2  0x20526FBB03020180 FFFF027000120300 SOFT_RESET
                                                      29 Oct 2013 13:41:23
23    BMC      2  0x20526FBB03020190 FFFF010943090300 POWER_UNIT_ENABLED
                                                      29 Oct 2013 13:41:23
24    BMC     *5  0x20526FBB030201A0 FFFF03076F020300 VOLTAGE_DEGRADES_TO_NON_RECOVERABLE
                                                      29 Oct 2013 13:41:23
25    BMC     *7  0x20526FBB030201B0 6F00A8706F120300 SHUTDOWN_OR_RESET_ON_SENSOR
                                                      29 Oct 2013 13:41:23
26    BMC      2  0x20526FBB030201C0 6F00A3706F120300 CHASSIS_CONTROL_REQUEST
                                                      29 Oct 2013 13:41:23
27    BMC      2  0x20526FBB050201D0 FFFF000943090300 POWER_UNIT_DISABLED
                                                      29 Oct 2013 13:41:25

   -> This is the last entry in the selected log.

 

And current firmware:

 

SYSREV

Current firmware revisions

 MP FW     : F.02.23
 BMC FW    : 05.24
 EFI FW    : ROM A 07.14, ROM B 07.14
 System FW : ROM A 04.11, ROM B 04.11, Boot ROM A
 PDH FW    : 50.07
 UCIO FW   : 03.0b
 PRS FW    : 00.08 UpSeqRev: 02, DownSeqRev: 01

 How would I go about deciphering the voltage degrades/shutdown events to determine what sensor is causing the issues?  I know from reading through this thread as well as searching online that there are a number of components that could be generating this error.  I'd like to cut down on replacing components that don't need to be replaced if at all possible.

 

Thanks,

Holger G
HPE Pro

Re: rx2660 halted by itself

Hi,

Your power issue is on sensor 0x6F, which monitors a voltage that is used on the front panel and generated on the system board.

You can also view the MP logs in text mode, which gives more information than hex mode.
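
If you want to pick the sensor out of a saved hex-mode log yourself, a rough Python sketch follows. It rests on two assumptions: the log lines have the same column layout as the output posted above, and the sensor number is the fifth byte of the Data Field (inferred from FFFF03076F020300 containing 0x6F, not from HP documentation). The names SEL_LINE, voltage_events and SAMPLE are made up for illustration.

```python
import re

# Matches one hex-mode SEL entry line as it appears in the log posted above.
# The optional "(?:\d+\s+)?" skips the extra column seen on the MP entry.
SEL_LINE = re.compile(
    r"^\s*(?P<index>\d+)\s+(?P<location>\w+)\s+(?:\d+\s+)?"
    r"(?P<flag>\*?)(?P<alert>\d+)\s+0x(?P<encoded>[0-9A-Fa-f]{16})\s+"
    r"(?P<data>[0-9A-Fa-f]{16})\s+(?P<keyword>\w+)\s*$"
)


def voltage_events(log_text):
    """Return (entry index, alert level, sensor number) for VOLTAGE_* events."""
    events = []
    for line in log_text.splitlines():
        m = SEL_LINE.match(line)
        if m and m["keyword"].startswith("VOLTAGE_"):
            # Assumption: the sensor number is byte index 4 of the Data Field.
            sensor = bytes.fromhex(m["data"])[4]
            events.append((int(m["index"]), int(m["alert"]), sensor))
    return events


# Three entries copied from the log earlier in this thread.
SAMPLE = """\
4     BMC      2  0x20526EDDD1020050 FFFF010943090300 POWER_UNIT_ENABLED
                                                      28 Oct 2013 21:57:37
5     BMC     *5  0x20526EDDD2020060 FFFF03076F020300 VOLTAGE_DEGRADES_TO_NON_RECOVERABLE
                                                      28 Oct 2013 21:57:38
6     BMC     *7  0x20526EDDD2020070 6F00A8706F120300 SHUTDOWN_OR_RESET_ON_SENSOR
                                                      28 Oct 2013 21:57:38
"""

for index, alert, sensor in voltage_events(SAMPLE):
    print(f"entry {index}: alert level {alert}, sensor 0x{sensor:02X}")
# prints: entry 5: alert level 5, sensor 0x6F
```

The script only extracts the number. What each sensor number actually measures still has to come from HP support or the service documentation.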

 

 

Kind Regards

Holger


- I work for Hewlett Packard Enterprise -
DougK
Visitor

Re: rx2660 halted by itself

Thanks!

 

I wasn't aware that I could get more information when connected over the serial port.

 

So on top of helping me, you've taught me something new today!