BladeSystem - General

Vball23
Advisor

Server(s) crash after mem upgrade, varying results

We have several blade chassis with BL860c servers running HP-UX. A month or so ago, I upgraded the memory in 2 of them, adding 16 GB, with no problems whatsoever. Three weekends ago, I upgraded 4 more BL860c's in the same chassis, again adding 16 GB of memory. I added the same amount to a blade in a separate chassis. One day later, one of the servers crashed and rebooted, and then did it again. There were errors in the server log:

SYSTEM_FIRMWARE_ERROR

MEM_ECC_ERROR_UNCORRECTABLE

Suspecting a bad DIMM, I removed the new memory and restarted the server. No more problems with that server. The next day, a 2nd blade behaved the same way, so I removed the new memory from that server too. The other 2 blades in that chassis with new memory were also displaying memory errors but hadn't crashed yet, so I shut them down and removed the new memory as well. The 5th server, in another chassis, was showing only the common single-bit errors, but I removed its new memory anyway. I opened a case with HP and was told that either ALL the new memory was bad or the current firmware on the blades had a known power distribution issue with memory, and that either or both could be the culprit.

I was given the 5th server, the one in a different chassis, to use as a test. It had older firmware than the other 4. I also noticed that the 2 I had upgraded successfully earlier had NEWER firmware than these. I fully populated the test server with 48 GB of memory and exercised it for days with STM: no problems, and no errors other than the single-bit correctable errors I commonly see. I then upgraded this server to the same firmware as the problem servers and exercised it for days again, still with no issues. So now I'm perplexed and don't know how to proceed.

I will definitely upgrade all blades to the current firmware and I will replace all the new memory from the vendor, but since I could not duplicate the problem with the test server, I have an uneasy feeling about giving this my stamp of approval, especially since these are production servers.

I will add a reply with the firmware versions of each below.

Any ideas?
Vball23
Advisor

Re: Server(s) crash after mem upgrade, varying results

1st two blades, no issues

MP FW : T.02.17
BMC FW : 05.20
EFI FW : ROM A 06.20, ROM B 06.20
System FW : ROM A 03.02, ROM B 03.02, Boot ROM A
PDH FW : 50.07
UCIO FW : 03.0b
PRS FW : 00.08 UpSeqRev:02,DownSeqRev: 05
PIC FW : 00.05

4 blades in same chassis as servers above, 2 crashed, 2 were about to crash

MP FW : T.02.17
BMC FW : 05.20
EFI FW : ROM A 06.20, ROM B 06.20
System FW : ROM A 03.01, ROM B 03.01, Boot ROM A
PDH FW : 50.07
UCIO FW : 03.0b
PRS FW : 00.08 UpSeqRev:02,DownSeqRev: 05
PIC FW : 00.05

1 blade in separate chassis, used to test with no issues, older firmware

MP FW : T.02.17
BMC FW : 05.20
EFI FW : ROM A 05.67, ROM B 06.20
System FW : ROM A 01.01, ROM B 03.01, Boot ROM B
PDH FW : 50.07
UCIO FW : 03.0b
PRS FW : 00.08 UpSeqRev:02,DownSeqRev: 05
PIC FW : 00.05
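
To make the three listings easier to compare, here is a small Python sketch that parses them and prints only the fields that differ. The firmware strings are copied from the listings above; the group labels (`ok_first_two`, `crashing_four`, `test_blade`) and the function names are just names I made up for illustration, not anything from an HP tool.

```python
# Firmware listings copied from the post; the group labels are hypothetical.
LISTINGS = {
    "ok_first_two": """
MP FW : T.02.17
BMC FW : 05.20
EFI FW : ROM A 06.20, ROM B 06.20
System FW : ROM A 03.02, ROM B 03.02, Boot ROM A
PDH FW : 50.07
UCIO FW : 03.0b
PRS FW : 00.08 UpSeqRev:02,DownSeqRev: 05
PIC FW : 00.05
""",
    "crashing_four": """
MP FW : T.02.17
BMC FW : 05.20
EFI FW : ROM A 06.20, ROM B 06.20
System FW : ROM A 03.01, ROM B 03.01, Boot ROM A
PDH FW : 50.07
UCIO FW : 03.0b
PRS FW : 00.08 UpSeqRev:02,DownSeqRev: 05
PIC FW : 00.05
""",
    "test_blade": """
MP FW : T.02.17
BMC FW : 05.20
EFI FW : ROM A 05.67, ROM B 06.20
System FW : ROM A 01.01, ROM B 03.01, Boot ROM B
PDH FW : 50.07
UCIO FW : 03.0b
PRS FW : 00.08 UpSeqRev:02,DownSeqRev: 05
PIC FW : 00.05
""",
}


def parse(listing):
    """Turn 'NAME : VALUE' lines into a dict, splitting on the first ' : '."""
    fields = {}
    for line in listing.strip().splitlines():
        name, _, value = line.partition(" : ")
        fields[name.strip()] = value.strip()
    return fields


def differing_fields(groups):
    """Return {field: {group: value}} for fields that are not identical everywhere."""
    parsed = {label: parse(text) for label, text in groups.items()}
    names = list(next(iter(parsed.values())))
    diffs = {}
    for name in names:
        values = {label: fields.get(name) for label, fields in parsed.items()}
        if len(set(values.values())) > 1:
            diffs[name] = values
    return diffs


DIFFS = differing_fields(LISTINGS)
for name, values in DIFFS.items():
    print(name)
    for label, value in values.items():
        print(f"  {label}: {value}")
```

Only the EFI FW and System FW lines differ. The two blades with no issues are at System FW 03.02, the four problem blades are at 03.01, and the test blade had booted from ROM B at 03.01.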
Torsten.
Acclaimed Contributor

Re: Server(s) crash after mem upgrade, varying results

The release notes of a later firmware version say:

Added support for "B" memory DIMMs.

Maybe this is the reason.
Upgrade and test.
What memory was installed (product number)?

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

Vball23
Advisor

Re: Server(s) crash after mem upgrade, varying results

HP 8GB REG PC2-4200 2x4GB Kit
Part NO. AB566AX
Emy_1
Occasional Contributor

Re: Server(s) crash after mem upgrade, varying results

Brett,
Did the firmware upgrade resolve the problem? I'm having the same problem with new blades we purchased.