BladeSystem - General

Server(s) crash after mem upgrade, varying results

 
Vball23
Advisor

We have several blade chassis with BL860c servers running HP-UX. A month or so ago, I upgraded the memory in 2 of them, adding 16 GB, with no problems whatsoever. Three weekends ago, I upgraded 4 more BL860c's in the same chassis, again adding 16 GB of memory. I added the same amount to a blade in a separate chassis. One day later, one of the servers crashed and rebooted, and then did so again. There were errors in the server log:

SYSTEM_FIRMWARE_ERROR

MEM_ECC_ERROR_UNCORRECTABLE

Suspecting a bad DIMM, I removed the new memory and restarted the server. No more problems with that server. The next day, a 2nd blade behaved the same way, so I removed the new memory from that server as well. The other 2 blades in that chassis with new memory were also displaying memory errors but hadn't crashed, so I shut them down and removed their new memory too. The 5th server, in another chassis, was showing only the common single-bit errors, but I removed its new memory as well. I opened a case with HP and was told that either ALL the new memory was bad or the current firmware on the blades had a known power distribution issue with memory, and that either or both could be the culprit.

I was given the 5th server, the one in a different chassis, to use as a test. It had older firmware than the other 4. I also noticed that the 2 I had upgraded successfully earlier had NEWER firmware than these. I fully populated the test server with 48 GB of memory and exercised it for days with STM: no problems, and no errors other than the single-bit correctable errors I commonly see. I then upgraded this server to the same firmware as the problem servers and exercised it for days, still with no issues. So now I'm perplexed and don't know how to proceed.

I will definitely upgrade all blades to the current firmware and I will replace all the new memory from the vendor, but since I could not duplicate the problem with the test server, I have an uneasy feeling about giving this my stamp of approval, especially since these are production servers.

I will add a reply with the firmware versions of each below.

Any ideas?
4 REPLIES
Vball23
Advisor

Re: Server(s) crash after mem upgrade, varying results

1st two blades, no issues

MP FW : T.02.17
BMC FW : 05.20
EFI FW : ROM A 06.20, ROM B 06.20
System FW : ROM A 03.02, ROM B 03.02, Boot ROM A
PDH FW : 50.07
UCIO FW : 03.0b
PRS FW : 00.08 UpSeqRev:02,DownSeqRev: 05
PIC FW : 00.05

4 blades in same chassis as servers above, 2 crashed, 2 were showing memory errors but had not crashed yet

MP FW : T.02.17
BMC FW : 05.20
EFI FW : ROM A 06.20, ROM B 06.20
System FW : ROM A 03.01, ROM B 03.01, Boot ROM A
PDH FW : 50.07
UCIO FW : 03.0b
PRS FW : 00.08 UpSeqRev:02,DownSeqRev: 05
PIC FW : 00.05

1 blade in separate chassis, used to test with no issues, older firmware

MP FW : T.02.17
BMC FW : 05.20
EFI FW : ROM A 05.67, ROM B 06.20
System FW : ROM A 01.01, ROM B 03.01, Boot ROM B
PDH FW : 50.07
UCIO FW : 03.0b
PRS FW : 00.08 UpSeqRev:02,DownSeqRev: 05
PIC FW : 00.05
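
To make the differences between the three listings easier to spot, here is a minimal Python sketch that parses the listings above and reports only the components whose versions differ between groups. The group labels (`good`, `crashed`, `test`) and the function names `parse_fw` and `diff_fw` are made up for illustration; the version strings are copied from the listings above. It assumes every line has the form "NAME FW : value".

```python
# Firmware listings copied from the post above; the group labels are illustrative.
LISTINGS = {
    "good": """
MP FW : T.02.17
BMC FW : 05.20
EFI FW : ROM A 06.20, ROM B 06.20
System FW : ROM A 03.02, ROM B 03.02, Boot ROM A
PDH FW : 50.07
UCIO FW : 03.0b
PRS FW : 00.08 UpSeqRev:02,DownSeqRev: 05
PIC FW : 00.05
""",
    "crashed": """
MP FW : T.02.17
BMC FW : 05.20
EFI FW : ROM A 06.20, ROM B 06.20
System FW : ROM A 03.01, ROM B 03.01, Boot ROM A
PDH FW : 50.07
UCIO FW : 03.0b
PRS FW : 00.08 UpSeqRev:02,DownSeqRev: 05
PIC FW : 00.05
""",
    "test": """
MP FW : T.02.17
BMC FW : 05.20
EFI FW : ROM A 05.67, ROM B 06.20
System FW : ROM A 01.01, ROM B 03.01, Boot ROM B
PDH FW : 50.07
UCIO FW : 03.0b
PRS FW : 00.08 UpSeqRev:02,DownSeqRev: 05
PIC FW : 00.05
""",
}


def parse_fw(listing: str) -> dict:
    """Turn 'NAME FW : value' lines into a {component: value} dict."""
    fw = {}
    for line in listing.strip().splitlines():
        # Split on the first colon only; values such as the PRS line contain more colons.
        name, sep, value = line.partition(":")
        if sep:
            fw[name.strip()] = value.strip()
    return fw


def diff_fw(groups: dict) -> dict:
    """Return {component: {group: value}} for components that are not identical everywhere."""
    components = []
    for fw in groups.values():
        for name in fw:
            if name not in components:
                components.append(name)
    result = {}
    for name in components:
        values = {group: fw.get(name) for group, fw in groups.items()}
        if len(set(values.values())) > 1:
            result[name] = values
    return result


parsed = {group: parse_fw(text) for group, text in LISTINGS.items()}
differences = diff_fw(parsed)

for component, values in differences.items():
    print(component)
    for group, value in values.items():
        print(f"  {group:8} {value}")
```

Run against these three listings, it flags only EFI FW and System FW; every other component is identical across all blades.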
Torsten.
Acclaimed Contributor

Re: Server(s) crash after mem upgrade, varying results

The release notes of a later firmware version say:

Added support for "B" memory DIMMs.

Maybe this is the reason.
Upgrade and test.
What memory was installed (product number)?

Hope this helps!
Regards
Torsten.

Vball23
Advisor

Re: Server(s) crash after mem upgrade, varying results

HP 8GB REG PC2-4200 2x4GB Kit
Part NO. AB566AX
Emy_1
Occasional Contributor

Re: Server(s) crash after mem upgrade, varying results

Brett,
Did the firmware upgrade resolve the problem? I'm having the same problem with new blades we purchased.