

 
Al Bracco
Occasional Advisor

Proliant ML350 G6 reboots randomly

This has been happening for several months. We have updated all drivers, agents and firmware to the latest versions using the HP Service Pack for ProLiant (SPP). There are no errors in the event log preceding a reboot, just an error afterwards stating that the server had shut down unexpectedly. We had HP replace the motherboard, CPU, RAM (not sure if they replaced all of it) and power supply. Still happening... We disconnected it from the APC 1500 UPS and it has done it again. Note that it doesn't power off; rather, it reboots back into the O/S (Windows 2008 R2). It can happen anywhere from once to four times a week, and it can happen when users are working or at night or on weekends when the office is empty.

 

Not sure where to go next with this, but it's becoming a serious problem. Anyone have any suggestions?

 

Thanks,

 

Al
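
One way to check whether the reboots really are independent of office activity is to tally them by time of day and look at the gaps between them. Below is a minimal sketch, assuming the timestamps of the "shut down unexpectedly" entries have been copied by hand from the Windows System event log. The timestamps in `REBOOTS`, the office-hours window and the function names are all made up for illustration and are not taken from this server.

```python
from collections import Counter
from datetime import datetime

# Hypothetical reboot times (illustrative only, not from the real server).
REBOOTS = [
    "2013-03-04 10:15",
    "2013-03-06 23:40",
    "2013-03-09 14:05",
    "2013-03-12 02:30",
]


def classify(ts: datetime) -> str:
    """Label a timestamp as office hours (Mon-Fri, 08:00-17:59) or off hours.

    The office-hours window is an assumption; adjust it to the real site.
    """
    is_weekday = ts.weekday() < 5
    is_daytime = 8 <= ts.hour < 18
    return "office hours" if is_weekday and is_daytime else "off hours"


def summarize(stamps):
    """Return (counts per bucket, gaps in hours between consecutive reboots)."""
    parsed = sorted(datetime.strptime(s, "%Y-%m-%d %H:%M") for s in stamps)
    buckets = Counter(classify(t) for t in parsed)
    gaps = [(b - a).total_seconds() / 3600 for a, b in zip(parsed, parsed[1:])]
    return buckets, gaps


if __name__ == "__main__":
    buckets, gaps = summarize(REBOOTS)
    for label, count in buckets.items():
        print(f"{label}: {count}")
    print("hours between reboots:", [round(g, 1) for g in gaps])
```

If the reboots cluster in one bucket or recur at a regular interval, that points towards load or a scheduled task. An even spread is more consistent with a hardware or power fault.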

8 REPLIES
Johan Guldmyr
Honored Contributor

Re: Proliant ML350 G6 reboots randomly

Hello, 

 

I've seen this caused by a gash in a power cable and by the power button not flexing back out after it's pushed in.

 

Run memtest and/or the HP diagnostic test. Does it crash while the tests are running?

 

Do you have two PSUs installed? Does it do the same if you use only one of the slots (preferably not the one you're using now)?

JohnWRuffo
Honored Contributor

Re: Proliant ML350 G6 reboots randomly

Are there any errors/warnings in the System Management Homepage / IML viewer? Does the iLO show any errors?

 

Have you updated the HDD firmware?

 

What is the normal load on the server (RAM and processor)?

Enjoy!
__________________________________________
Was the post useful? Click on the white KUDOS! Star.

Do you need help with your HP product?
Try this: http://www.hp.com/support/hpgt
Al Bracco
Occasional Advisor

Re: Proliant ML350 G6 reboots randomly

More details: up until HP replaced the parts mentioned, there was never an entry in the IML logs, despite the reboots. The server has rebooted twice since they replaced the parts, but now it is reporting an unrecoverable error on one of the DIMMs as well as "ASR Detected by System ROM" errors. It's possible HP has introduced another problem with a faulty replacement DIMM.

 

EVERYTHING (all firmware, drivers, agents) has been updated via the HP SPP. RAM and processor load are normal.

 

My original problem either still exists or perhaps replacing the power supply helped, but with this new error, I won't know until I fix that, I guess...

Al Bracco
Occasional Advisor

Re: Proliant ML350 G6 reboots randomly

@Johan: More details: up until HP replaced the parts mentioned, there was never an entry in the IML logs, despite the reboots. The server has rebooted twice since they replaced the parts, but now it is reporting an unrecoverable error on one of the DIMMs followed by an "ASR Detected by System ROM" error. It's possible HP has introduced another problem with a faulty replacement DIMM.

 

I will check the cable and the power button. HP Insight Diagnostics: everything passes except for the DIMM that now has the uncorrectable errors.

 

Because we have introduced a new problem, I guess we don't know if replacing any of the other parts actually resolved the original problem. My primary suspect was always the power supply. (BTW, it only has one P/S - I've thought about adding a second...)

 

JohnWRuffo
Honored Contributor

Re: Proliant ML350 G6 reboots randomly

An unrecoverable error is a sure sign of a bad DIMM. Does the DIMM match its pair? (I hope HP sent you RAM with the right part number.)

 

You should still be able to use that hardware case # to open up another replacement on their dime by stating it was "unresolved" and ongoing...

Al Bracco
Occasional Advisor

Re: Proliant ML350 G6 reboots randomly

Yes, I know we can get the DIMM replaced, I just don't know if the original problem is still there or not...

Johan Guldmyr
Honored Contributor

Re: Proliant ML350 G6 reboots randomly

And unless you get rid of the DIMM errors too, we may never know :)

 

Especially with such a diffuse problem, I'd say it's a good idea to fix the obvious problems first.

Al Bracco
Occasional Advisor

Re: Proliant ML350 G6 reboots randomly

Exactly what I am doing. HP is sending a replacement part... Then we can get back to the original problem...

 

Thanks for weighing in, everyone...