DL360 spontaneously reboots

Hubert_16


After some hassle* installing SuSE Linux Enterprise Server 8 onto a DL360 G4 and updating it to service pack 4, all seemed well until it was left on overnight for the first time.

The server spontaneously restarted the next morning. The "/var/log/messages" log merely notes "syslogd 1.4.1: restart" at the time of the restart (6:48).

I repeated the installation on a second, identical server, and left them both running over the weekend, with SSH and VNC sessions connected to both. "top" was running on both SSH sessions so I could see the final update of what was running if and when they went down.

Both servers spontaneously rebooted on Monday morning (July 25) at the same moment, even though their clocks differ by several minutes: one logged the reboot at 7:13 and the other at 7:04, but once adjusted to the correct time, both reboots occurred almost simultaneously at about 7:12. The logs also indicate that both servers restarted around 12:15 a.m. on Saturday the 23rd.
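For anyone who wants to check the clock adjustment, here is a rough Python sketch of the arithmetic. The host names are placeholders, the year 2005 is an assumption (only the day and month are known), and the offsets are simply inferred from the figures above (clock reading minus correct time), not measured independently.

```python
from datetime import datetime, timedelta

# Assumed offsets (clock reading minus correct time), inferred from the
# figures in this post; "server-a" and "server-b" are placeholder names.
CLOCK_OFFSET = {
    "server-a": timedelta(minutes=1),    # logged 7:13, correct time 7:12
    "server-b": timedelta(minutes=-8),   # logged 7:04, correct time 7:12
}

# Reboot times as each server logged them (year 2005 is an assumption).
LOGGED_REBOOT = {
    "server-a": datetime(2005, 7, 25, 7, 13),
    "server-b": datetime(2005, 7, 25, 7, 4),
}


def corrected(host, logged):
    """Convert a server's logged time to the correct time."""
    return logged - CLOCK_OFFSET[host]


corrected_times = {h: corrected(h, t) for h, t in LOGGED_REBOOT.items()}

# How far the two clocks disagree, and how far apart the reboots really were.
clock_difference = abs(CLOCK_OFFSET["server-a"] - CLOCK_OFFSET["server-b"])
skew = abs(corrected_times["server-a"] - corrected_times["server-b"])

for host, when in corrected_times.items():
    print(host, when.strftime("%H:%M"))
print("clock difference:", clock_difference)
print("reboot skew after correction:", skew)
```

With these assumed offsets, the clocks disagree by 9 minutes while the corrected reboot times coincide, which is why a time-triggered job running locally on each server looks unlikely.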

On both servers, the last "/var/log/messages" entry before the Monday "syslogd 1.4.1: restart" line was a CRON job at 17:30 on July 23, implying that both servers had been down for over 36 hours before they somehow rebooted on Monday morning.
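The gap can be measured with a small script along these lines. This is only a sketch: the two sample lines are made-up stand-ins in the traditional syslog layout (not copied from my logs, and the seconds are invented), and the year is assumed because syslog timestamps do not carry one.

```python
from datetime import datetime, timedelta

RESTART_MARKER = "syslogd 1.4.1: restart"
ASSUMED_YEAR = 2005  # assumption: traditional syslog lines carry no year


def parse_stamp(line, year=ASSUMED_YEAR):
    """Parse the leading 'Mon DD HH:MM:SS' timestamp of a syslog line."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def restart_gaps(lines, min_gap=timedelta(hours=1)):
    """Yield (last_entry, restart, gap) for each restart line that follows
    a silence of at least min_gap in the log."""
    previous = None
    for line in lines:
        if not line.strip():
            continue
        stamp = parse_stamp(line)
        if RESTART_MARKER in line and previous is not None:
            gap = stamp - previous
            if gap >= min_gap:
                yield previous, stamp, gap
        previous = stamp


# Hypothetical sample lines; real input would be read from the log file.
sample = [
    "Jul 23 17:30:00 server-a /USR/SBIN/CRON[1234]: (root) CMD (placeholder)",
    "Jul 25 07:13:00 server-a syslogd 1.4.1: restart.",
]

gaps = list(restart_gaps(sample))
for last_entry, restart, gap in gaps:
    print(f"silent from {last_entry} to {restart} ({gap})")
```

To run it against a real log, pass the open file instead of the sample list, e.g. `restart_gaps(open("/var/log/messages"))`.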

Can anyone shed light on this issue? SuSE SLES 8 is "vendor certified and HP supported" for these servers (reference: http://h18004.www1.hp.com/products/servers/linux/hpLinuxcert-novell.html ), yet the installation was a hassle*, and then there's this spontaneous reboot issue.

Since BOTH servers went down and rebooted at almost exactly the same times, it doesn't seem like a hardware or overheating issue. It doesn't seem like an errant CRON job either, since the events occurred simultaneously despite the clocks being off by almost 9 minutes. The servers are being set up on the internal network, so they're not being hacked into from outside (we have no VPN). I also can't think of anything on our internal network that could bring down a Linux server at those times.

Should ACPI or APIC support be disabled during installation (or during normal operation)? There were no apparent issues during installation off the SP4 CD. The updated kernel on both servers is 2.4.21-286-smp.

Thanks in advance for any help,
Hubert



* The diskette drivers for the Smart Array 6i RAID controller seemed to load but failed to find any hard drives to install onto when booting from the SP3 CD, so I couldn't follow the steps HP provided at http://h18004.www1.hp.com/products/servers/linux/installnotes-dl.html#dl360g4_sles8 . I had to boot and install from the SP4 install CD first, reboot, then apply the SP3 updates and re-apply the SP4 updates without rebooting in between.