DL145 G2 Bad fan/boot problems

 

Hi,

I have a DL145 G2 that has always had a problem with CPU FAN 6: the "Monitoring Sensors" page of the Lights Out web interface has always reported the fan as malfunctioning, with a speed of 0 RPM. If you open the case, the fan is spinning normally. If you swap the connectors for FAN 5 and FAN 6, it still reports FAN 6 in error. HP sent me a replacement motherboard about a year ago, and it too reported the same issue with FAN 6. HP pretty much dropped my case after that point, but I wasn't too concerned, as it was obviously just the sensor in error (and with 10 or so fans, losing one shouldn't be a big deal if it were to actually die).

When HP sent the replacement board out, it wasn't just because of the FAN 6 sensor issue - the server had shut down and would not power on. Eventually it booted again, but after several hours or days (seemingly at random) it would shut off again. During this time no load was placed on the server; it wasn't even attached to the network. The HP case number for that issue is 3215011979, in case anyone from HP happens to patrol the forum and has access to look it up.

Fast-forward to today: the server shut down with the following messages (the OS is FreeBSD, though I think the messages are generic enough that someone should be able to assist, or at least decipher their meaning):

Jun 7 13:43:56 orion kernel: acpi: suspend request ignored (not ready yet)
Jun 7 13:44:23 orion last message repeated 9 times
Jun 7 13:44:23 orion rc.shutdown: 30 second watchdog timeout expired. Shutdown terminated.
Jun 7 13:44:23 orion init: /bin/sh on /etc/rc.shutdown terminated abnormally, going to single user mode
Jun 7 13:44:23 orion syslogd: exiting on signal 15
Jun 7 15:46:28 orion syslogd: kernel boot file is /boot/kernel/kernel

The two-hour gap between the shutdown and the next boot is because I couldn't get the server to come back up - it would shut off almost immediately after being powered on. After I let it sit for a few minutes, the "Last restart cause" on the Virtual Power panel in the Lights Out admin page read "Watchdog". Eventually I was able to get it to boot, but it kept printing the following error on the console:

acpi: suspend request ignored (not ready yet)

I attempted to reboot the server again, but it shut down, cut the power, and again refused to boot. At that point I found that if I removed power from the server completely, I could get it to boot. So as of now it's back up, but the "suspend request ignored" errors continue.

The only error I see in the Lights Out interface is the FAN 6 error (the same FAN 6 error I've had for a year, which has never caused the server to go down or fail to boot). I've also pasted the System Event Log at the bottom of this message, in case it helps. Any assistance would be greatly appreciated. Unfortunately, my warranty ended in January, and the cost of a new system board is almost what I paid for the whole server (and replacing the system board last time didn't help much). If it comes down to needing a new system board, I might as well just buy another server - and if I do, you can bet it won't be from HP.

Event Type  Date        Time      Source      Description                         Direction
Generic     06/07/2007  16:16:01  Watchdog    Hard Reset                          Assertion
Generic     06/07/2007  16:16:15  ACPI STATE  S0 Power State                      Assertion
Generic     06/07/2007  16:16:15  ACPI STATE  S5 Power State                      Deassertion
Generic     06/07/2007  16:16:24  ACPI STATE  S0 Power State                      Deassertion
Generic     06/07/2007  16:16:24  ACPI STATE  S5 Power State                      Assertion
Generic     06/07/2007  16:16:43  ACPI STATE  S0 Power State                      Assertion
Generic     06/07/2007  16:16:43  ACPI STATE  S5 Power State                      Deassertion
Generic     06/07/2007  16:16:52  ACPI STATE  S0 Power State                      Deassertion
Generic     06/07/2007  16:16:52  ACPI STATE  S5 Power State                      Assertion
Generic     06/07/2007  16:18:38  POST ERROR  System Firmware Error (POST Error)  Assertion
Generic     06/07/2007  16:18:40  CPU FAN6    Lower Non-critical-going low        Assertion
Generic     06/07/2007  16:18:40  CPU FAN6    Lower Critical-going low            Assertion
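
In case it helps anyone reading the event log above, here is a rough Python sketch that measures how long the box stayed powered on (S0 asserted) before dropping back to soft-off (S5 asserted). It assumes the dates are MM/DD/YYYY and that each entry ends in "Assertion" or "Deassertion"; the function names and the regex are my own and have nothing to do with any HP tool.

```python
import re
from datetime import datetime

# The System Event Log entries, copied from the log above.
SEL_TEXT = """\
Generic 06/07/2007 16:16:01 Watchdog Hard Reset Assertion
Generic 06/07/2007 16:16:15 ACPI STATE S0 Power State Assertion
Generic 06/07/2007 16:16:15 ACPI STATE S5 Power State Deassertion
Generic 06/07/2007 16:16:24 ACPI STATE S0 Power State Deassertion
Generic 06/07/2007 16:16:24 ACPI STATE S5 Power State Assertion
Generic 06/07/2007 16:16:43 ACPI STATE S0 Power State Assertion
Generic 06/07/2007 16:16:43 ACPI STATE S5 Power State Deassertion
Generic 06/07/2007 16:16:52 ACPI STATE S0 Power State Deassertion
Generic 06/07/2007 16:16:52 ACPI STATE S5 Power State Assertion
Generic 06/07/2007 16:18:38 POST ERROR System Firmware Error (POST Error) Assertion
Generic 06/07/2007 16:18:40 CPU FAN6 Lower Non-critical-going low Assertion
Generic 06/07/2007 16:18:40 CPU FAN6 Lower Critical-going low Assertion
"""

# Assumed layout: type, date, time, free-form source/description, direction.
LINE_RE = re.compile(
    r"^(?P<type>\S+) (?P<date>\d{2}/\d{2}/\d{4}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<rest>.+) (?P<direction>Assertion|Deassertion)$"
)


def parse_sel(text):
    """Return a list of (timestamp, source_and_description, direction) tuples."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip headers or anything that does not fit the layout
        when = datetime.strptime(
            m["date"] + " " + m["time"], "%m/%d/%Y %H:%M:%S"
        )
        events.append((when, m["rest"], m["direction"]))
    return events


def power_on_durations(events):
    """Seconds between each S0 assertion and the S5 assertion that follows it."""
    durations = []
    on_since = None
    for when, rest, direction in events:
        if direction != "Assertion":
            continue
        if rest == "ACPI STATE S0 Power State":
            on_since = when
        elif rest == "ACPI STATE S5 Power State" and on_since is not None:
            durations.append((when - on_since).total_seconds())
            on_since = None
    return durations


events = parse_sel(SEL_TEXT)
print(power_on_durations(events))  # prints [9.0, 9.0]
```

Going by the timestamps, both power-on attempts in this log lasted 9 seconds before the server went back to S5, which matches the "shuts off almost immediately" behaviour I described.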