rx2620 POST Failure

 
SOLVED
Automatic IT
Occasional Visitor

A power problem damaged what seems to be an entire voltage rail on our rx2620. We replaced the voltage regulators, power supply, and system board, but we are still seeing a POST failure we don't understand. The firmware on our new system board, as reported by MP:CM > SYSREV, is:

Current firmware revisions

MP FW : E.03.15
BMC FW : 01.50
EFI FW : 01.22
System FW : 02.21

Current processor status as reported by MP:CM > SS is:

System Processor Status:

Monarch Processor: 0

Processor 0: Installed and Configured
Processor 1: Installed and Configured

Current power supply status as reported by MP:CM > PS is:

System Power state: On
Temperature : Normal


     Power supplies           | Fan
 #   State          Type      | States
-----------------------------------------------------------
 0   Normal         Type 0    | Normal
 1   Not Installed  -         | Normal
 2   -              -         | Normal
 3   -              -         | Normal
 4   -              -         | Normal
 5   -              -         | Not Installed
 6   -              -         | Normal
 7   -              -         | Normal

Starting from a soft boot, the Current Boot log shows no warnings or errors:

Log Entry 0: 15 Oct 2008 18:01:37
Alert Level 2: Informational
Keyword: Type-02 140801 1312769
Power button pressed - special function
Logged by: Baseboard Management Controller;
Sensor: Button - Power Button
Data1: Device Inserted/Device Present
0x2048F63001020010 FFFF01080D140300

Log Entry 1: 15 Oct 2008 18:01:38
Alert Level 2: Informational
Keyword: Type-02 140801 1312769
Power button pressed - special function
Logged by: Baseboard Management Controller;
Sensor: Button - Power Button
Data1: Device Inserted/Device Present
0x2048F63002020020 FFFF01080D140300

Log Entry 2: 15 Oct 2008 18:01:38
Alert Level 2: Informational
Keyword: Type-02 127003 1208323
Chassis Control request to BMC via IPMI or sensor
Logged by: Baseboard Management Controller;
Sensor: System Event
Data2: OEM Code2: 0x04
0x2048F63002020030 0401A37000120300

Log Entry 3: 15 Oct 2008 18:01:40
Alert Level 2: Informational
Keyword: Type-02 127002 1208322
Soft Reset
Logged by: Baseboard Management Controller;
Sensor: System Event
0x2048F63004020040 FFFF027000120300

Log Entry 4: 15 Oct 2008 18:01:41
Alert Level 2: Informational
Keyword: Type-02 080901 526593
Power supply turned on
Logged by: Baseboard Management Controller;
Sensor: Power Supply - Pwr Spply 1 Ctrl
Data1: Device Enabled
0x2048F63005020050 FFFF010944080300

Log Entry 5: 15 Oct 2008 18:01:41
Alert Level 2: Informational
Keyword: Type-02 226f00 2256640
ACPI state S0 (on)
Logged by: Baseboard Management Controller;
Sensor: System ACPI Power State - ACPI State
Data1: S0/G0 working
0x2048F63005020060 FFFF006FFA220300

However, the Forward Progress and System Event logs show the following error at the end:

Log Entry 6: 15 Oct 2008 18:03:40
Alert Level 3: Warning
Keyword: Type-02 076f03 487171
No events were received from system firmware
Logged by: Baseboard Management Controller;
Sensor: Processor
Data1: FRB2/Hang in POST failure
0x2048F6307C020070 FFFF036F00070300
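
For anyone reading the raw hex words under each entry: going only by the entries posted in this thread (not by any HP documentation), hex digits 3 to 10 of the first word match the logged time as a Unix timestamp, and the six-digit keyword looks like sensor type, event type, and offset, with the trailing decimal number being the same value in base 10. A minimal Python sketch under those assumptions; the function names are made up for illustration:

```python
from datetime import datetime, timezone


def decode_keyword(keyword_hex):
    """Split a keyword such as '076f03' into three bytes.

    Assumed layout (inferred from the entries in this thread):
    sensor type, event/reading type, event offset.
    """
    value = int(keyword_hex, 16)
    return {
        "decimal": value,
        "sensor_type": (value >> 16) & 0xFF,
        "event_type": (value >> 8) & 0xFF,
        "offset": value & 0xFF,
    }


def decode_timestamp(first_word_hex):
    """Read hex digits 3..10 of the first raw word as a Unix timestamp (UTC)."""
    digits = first_word_hex
    if digits.lower().startswith("0x"):
        digits = digits[2:]
    seconds = int(digits[2:10], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Entry 5 (ACPI S0) and entry 6 (FRB2/Hang in POST) from the logs above.
s0_time = decode_timestamp("0x2048F63005020060")
frb2_time = decode_timestamp("0x2048F6307C020070")

print(decode_keyword("076f03"))
print(frb2_time.isoformat())
print((frb2_time - s0_time).total_seconds(), "seconds between S0 and FRB2")
```

If that reading of the format is right, the FRB2 entry is logged roughly two minutes after the system reaches S0, which would fit a watchdog expiring because system firmware never reported in.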

A search on Google and here doesn't turn up anything we can point to, unless the firmware on the new system board somehow doesn't support our current processors. Using MP:CM > DF, we identified the current processors as follows:

FRU Entry # 1 :
FRU NAME: CPU 00 PIROM ID:20

PROCESSOR DATA
S-spec/QDF: SL8CW
Sample/Prod: 01

CORE DATA
Arch Revision : 00
Core Family : 1F
Core Model : 02
Core Stepping : 02
Max Core Freqency : 1600 MHZ
Max SysBus Freqency : 200 MHZ
Core Voltage : 1275 mV
Core Voltage Tolerance,High : 19 mV
Core Voltage Tolerance,Low : 19 mV

CACHE DATA
Cache Size : 3072 KB

PACKAGE DATA
Package Revision : INT3
Substrate Revision: 00

PROC PART NUMBER DATA
Part Number : 80543KC
Electronic Signature : 0000BC29D8F51374

THERMAL REF DATA
Upper Temp Ref : 107 C
Calibr Offset : 0 C

FEATURES DATA
IA-32 Proc Core Feature Flags: FFFB8743
IA-64 Proc Core Feature Flags: 1B81806300000000
Package Feature Flags : 0B000000
Devices on TAP Chain : 1

FRU Entry # 2 :
FRU NAME: CPU 01 PIROM ID:21

PROCESSOR DATA
S-spec/QDF: SL8CW
Sample/Prod: 01

CORE DATA
Arch Revision : 00
Core Family : 1F
Core Model : 02
Core Stepping : 02
Max Core Freqency : 1600 MHZ
Max SysBus Freqency : 200 MHZ
Core Voltage : 1275 mV
Core Voltage Tolerance,High : 19 mV
Core Voltage Tolerance,Low : 19 mV

CACHE DATA
Cache Size : 3072 KB

PACKAGE DATA
Package Revision : INT3
Substrate Revision: 00

PROC PART NUMBER DATA
Part Number : 80543KC
Electronic Signature : 000285256CACA5E9

THERMAL REF DATA
Upper Temp Ref : 107 C
Calibr Offset : 0 C

FEATURES DATA
IA-32 Proc Core Feature Flags: FFFB8743
IA-64 Proc Core Feature Flags: 1B81806300000000
Package Feature Flags : 0B000000
Devices on TAP Chain : 1
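
Since the two processors are supposed to be identical parts, one quick sanity check is to diff the two FRU entries field by field. A rough Python sketch, assuming the DF output has been pasted into strings and that every field is a `Name : Value` line; the helper names are hypothetical and only a few of the fields above are reproduced:

```python
def parse_fru(text):
    """Turn 'Name : Value' lines from a DF dump into a dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip():
            fields[name.strip()] = value.strip()
    return fields


def diff_fru(a, b):
    """Return {field: (value_a, value_b)} for every field that differs."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


CPU0 = """\
FRU NAME: CPU 00 PIROM ID:20
S-spec/QDF: SL8CW
Core Stepping : 02
Max Core Freqency : 1600 MHZ
Cache Size : 3072 KB
Part Number : 80543KC
Electronic Signature : 0000BC29D8F51374
"""

CPU1 = """\
FRU NAME: CPU 01 PIROM ID:21
S-spec/QDF: SL8CW
Core Stepping : 02
Max Core Freqency : 1600 MHZ
Cache Size : 3072 KB
Part Number : 80543KC
Electronic Signature : 000285256CACA5E9
"""

differences = diff_fru(parse_fru(CPU0), parse_fru(CPU1))
for field, (left, right) in differences.items():
    print(f"{field}: {left} != {right}")
```

For the two entries above, only the FRU name and the electronic signature differ, so the processors look like a matched pair.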

Some searching led us to this page in the HP User Service Guide for the rp4440:

http://docs.hp.com/en/A9950-96011/ch05s02.html

This indicated that it might be one of the processors. So we removed CPU1, and its associated voltage regulator, leaving us with just CPU0. We now end up with the following events in the System Events log:

Log Entry 0: 15 Oct 2008 18:37:24
Alert Level 2: Informational
Keyword: Type-02 140801 1312769
Power button pressed - special function
Logged by: Baseboard Management Controller;
Sensor: Button - Power Button
Data1: Device Inserted/Device Present
0x2048F63864020010 FFFF01080D140300

Log Entry 1: 15 Oct 2008 18:37:24
Alert Level 2: Informational
Keyword: Type-02 127003 1208323
Chassis Control request to BMC via IPMI or sensor
Logged by: Baseboard Management Controller;
Sensor: System Event
Data2: OEM Code2: 0x04
0x2048F63864020020 0401A37000120300

Log Entry 2: 15 Oct 2008 18:37:26
Alert Level 2: Informational
Keyword: Type-02 080901 526593
Power supply turned on
Logged by: Baseboard Management Controller;
Sensor: Power Supply - Pwr Spply 1 Ctrl
Data1: Device Enabled
0x2048F63866020030 FFFF010944080300

Log Entry 3: 15 Oct 2008 18:37:27
Alert Level 2: Informational
Keyword: Type-02 226f00 2256640
ACPI state S0 (on)
Logged by: Baseboard Management Controller;
Sensor: System ACPI Power State - ACPI State
Data1: S0/G0 working
0x2048F63867020040 FFFF006FFA220300


-> This is the last entry in the selected log.
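
To confirm at a glance that a captured log holds nothing above Informational, the entries can be tallied by alert level. A small Python sketch, assuming the log text was saved from the MP console in the format shown above; the regular expression and function names are illustrative only:

```python
import re
from collections import Counter

# Matches lines such as "Alert Level 2: Informational".
ALERT_RE = re.compile(r"^Alert Level (\d+): (.+)$", re.MULTILINE)


def alert_levels(log_text):
    """Count log entries per (level, name) pair."""
    return Counter(
        (int(level), name.strip()) for level, name in ALERT_RE.findall(log_text)
    )


def worst_level(log_text):
    """Return the highest alert level in the log, or None if no entries match."""
    levels = [int(match.group(1)) for match in ALERT_RE.finditer(log_text)]
    return max(levels, default=None)


SAMPLE = """\
Log Entry 2: 15 Oct 2008 18:37:26
Alert Level 2: Informational
Keyword: Type-02 080901 526593
Power supply turned on

Log Entry 3: 15 Oct 2008 18:37:27
Alert Level 2: Informational
Keyword: Type-02 226f00 2256640
ACPI state S0 (on)
"""

print(alert_levels(SAMPLE))
print("worst level:", worst_level(SAMPLE))
```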

Now we're truly stumped at this point, because as far as we can tell, there is no reason for the system to refuse to POST from this point, as there are no events other than Informational level events, and the front panel System LED is flashing green. The console via MP> CO doesn't show anything, nor does the monitor display anything. What are we missing?
Sameer_Nirmal
Honored Contributor

Re: rx2620 POST Failure

Are you sure the server is rx2620?

System firmware (SFW) version 02.21 is the one that goes into the rx2600. Those SFW/BMC versions are qualified to work with MP firmware revision E.02.23. Refer to this link:
http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&prodTypeId=15351&prodSeriesId=447335&swItem=ns-13729-2&prodNameId=88839&swEnvOID=54&swLang=13&taskId=135&mode=4&idx=0

Since it seems that the system is an rx2600, the current CPUs (1.6GHz/3MB) are not supported in it.
http://h18004.www1.hp.com/products/quickspecs/11716_div/11716_div.html




Automatic IT
Occasional Visitor

Re: rx2620 POST Failure

Sameer, the slide-out information card on the front panel says:
hp Integrity
rx2620

The model number on the card is listed as AB332A.

This led me to wonder if HP sent us an rx2600 system board, so I poked around in MP:CM > DF to find:

FRU Entry # 11 :
FRU NAME: Motherboard ID:00

CHASSIS INFO:
Type:Rack Mount Chassis

BOARD INFO:
Mfg Date/Time : 0
Manufacturer : hp
Product Name : workstation zx6000 system board
S/N : TH9825E00G
Part Number : A7231-66510
Fru File ID : 11
Custom Info : D
Custom Info : 4721
Custom Info : D
Custom Info : 1701

PRODUCT INFO:
Manufacturer : hp
Product Name : workstation zx6000
FRU File ID : 11
Custom Info : 101

This might not look good, but this page explains how the zx6000 workstation in a rack configuration is an rx2600 series system:

http://www.openpa.net/systems/hp_rx2600_rx2620.html

Unless an rx2620 should return an FRU Product Name of something similar, we have the correct system board. However, if it should return something like "server rx2620 system board", then I need to get a system board swapped out with a correct system board.
Sameer_Nirmal
Honored Contributor
Solution

Re: rx2620 POST Failure

With the mention of the model number AB332A and the CPUs, it's indeed an rx2620 system.

You got the wrong system board. A7231-66510 can only be used in zx2000, zx6000, and rx2600 systems, not in the rx2620.

Replace the system board with supported one as listed at http://partsurfer.hp.com/cgi-bin/spi/main?sel_flg=partlist&model=RX2620&HP_model=AB332A&modname=Integrity+rx2620-One+1.6GHz%2F3MB+200MHz+FSB+%28AB332A%29&template=main&plist_sval=ALL&plist_styp=flag&dealer_id=&callingsite=&strsrch=&keysel=%3F&catsel=%3F


The Product Name should be "server rx2620" and the Part/Model should be "AB332A".
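
The check described in this reply can be scripted against a saved DF dump. This is only a sketch in Python: the parsing assumes `Name : Value` lines as in the output earlier in the thread, `check_board` is a made-up helper, and the expected string "rx2620" comes from this reply rather than from an HP specification.

```python
def parse_fields(text):
    """Collect 'Name : Value' lines; the first occurrence of a name wins."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip():
            fields.setdefault(name.strip(), value.strip())
    return fields


def check_board(fru_text, expected_model="rx2620"):
    """Return (ok, product_name, part_number) for a motherboard FRU entry."""
    fields = parse_fields(fru_text)
    product = fields.get("Product Name", "")
    part = fields.get("Part Number", "")
    ok = expected_model.lower() in product.lower()
    return ok, product, part


# Board info as posted earlier in the thread.
BOARD = """\
FRU NAME: Motherboard ID:00
Manufacturer : hp
Product Name : workstation zx6000 system board
S/N : TH9825E00G
Part Number : A7231-66510
"""

ok, product, part = check_board(BOARD)
print("identifies as rx2620:", ok)
print("product name:", product)
print("part number:", part)
```

A mismatch here only says that the FRU data on the board does not identify it as an rx2620; it does not by itself say why.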


Automatic IT
Occasional Visitor

Re: rx2620 POST Failure

Thank you Sameer for the confirmation that we received the wrong board. Our HP support engineer confirms what you posted, and we opened a hardware ticket to try to get help to get this incorrect system board swapped out with the correct system board.
Torsten.
Acclaimed Contributor

Re: rx2620 POST Failure

Currently you don't even have an rx2600, but the "smaller" workstation:

BOARD INFO:
Mfg Date/Time : 0
Manufacturer : hp
Product Name : workstation zx6000 system board
S/N : TH9825E00G
Part Number : A7231-66510
Fru File ID : 11
Custom Info : D
Custom Info : 4721
Custom Info : D
Custom Info : 1701


The problem is that the system did not copy the direct values to the new board.

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!   
Automatic IT
Occasional Visitor

Re: rx2620 POST Failure

Thanks for the feedback Torsten. There seems to be some confusion at HP that they are trying to straighten out. The engineer who picked up the new ticket we filed says there is some switch that needs to be flipped by an on-site CE, after which the system board will turn into an rx2620. Perhaps this is the copying of direct values to the board that Torsten mentioned. I am a little rattled, as web pages like this:

http://64.223.189.234/node/9

seem to indicate that while the physical form factor of the zx6000 and rx2620 is the same, the system board is completely different. That referenced page says the zx2 chipset is on the rx2620, while the zx1 chipset is used on the zx6000. I found another page that says the rx2620 uses PCI-X, while the zx6000 uses the old PCI standard.

The CE we are working with doesn't know about the switch or about copying values around; he is researching that and will get back to us as soon as he finds the procedure. I know that if the ILO board is replaced there are some values to configure, but I haven't been able to locate any instructions that describe a system board-level configuration of values.

Hopefully, this is just a case of HP manufacturing exactly the same system board for a variety of models, and some changes on the board and some EEPROM-type memory configure the system board to behave as a specific model. I'll keep this thread updated on what I find out and what it took to finally replace the system board on our rx2620.
Torsten.
Acclaimed Contributor

Re: rx2620 POST Failure

The board for the zx6000 workstation and rx2600 server is the same (A7231-69512), but the part for the rx2620 is different (AB331-69101 or AB331-69201, non-RoHS or RoHS).

Hope this helps!
Regards
Torsten.
Automatic IT
Occasional Visitor

Re: rx2620 POST Failure

So there was a happy ending for this story. Torsten is correct about the part numbers, and I found out why. HP does indeed manufacture exactly the same board for different models, and then a proprietary procedure is performed in the field by an HP engineer to configure the board for the exact model it is installed into at the customer site.

It turns out that our replacement board was DOA, and that was why we had so much trouble with it. Our HP engineer sent for a replacement of the replacement, came back the next day, installed and configured it, and we were up and running.

Lesson for rx2620 owners: if you have to replace the system board, you can extract the old board to save your HP engineer some time (because it does take about an hour or so if you have never done it before), but when the new board arrives you must call in an HP engineer who has the magic CD that performs the configuration to set the board to the correct model system you own.