
Proliant DL630, Redhat 7.3 continuous crashes

Conrad_4
Occasional Visitor


I've got a DL630 G3, dual Xeon 3.06, 4 GB HP RAM, that keeps crashing and rebooting at no consistent point.

While troubleshooting, I've upgraded the BIOS ROM to the latest (1/28/04) and performed a fresh Redhat 7.3 (valhalla) install with HP's latest Linux driver suite (I forget the name of the package offhand). I've swapped the RAM, and I'm using a new HP-labeled 36GB hard drive on the internal SmartArray 5i controller.

On HP's suggestion, I've turned on "Full Table APIC" and disabled hyperthreading.

/var/log/messages shows the following "errors":

1) qlax200: wrong product code
2) kernel: no more mtrrs available
3) WARNING: No sibling found for CPU0 (and CPU1)
4) Can't locate block-major-(105 through ...)
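For anyone wanting to see how often each of these shows up between reboots, here is a minimal Python sketch that counts them in syslog-style lines. The pattern labels and the function name `count_known_messages` are made up for illustration, and the regexes are only loose guesses at how the messages quoted above appear in the log, so adjust them to match the real lines.

```python
import re
from collections import Counter

# Patterns based on the messages quoted in this post.
# The dictionary keys are hypothetical labels, not official names.
PATTERNS = {
    "qla_product_code": re.compile(r"wrong product code"),
    "mtrr_exhausted": re.compile(r"no more mtrrs available", re.IGNORECASE),
    "no_sibling": re.compile(r"No sibling found for CPU\d+"),
    "block_major": re.compile(r"Can't locate .*block-major-\d+"),
}


def count_known_messages(lines):
    """Count how many lines match each known pattern."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts
```

To run it against the real log, pass an open file handle, for example `count_known_messages(open("/var/log/messages", errors="replace"))`. A count that jumps right before each reboot would be worth a closer look; messages that only appear once at boot are less likely to be related to crashes under load.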

The system runs fine until any load is put on it, suggesting (to me) hardware, but I can't pinpoint it to any particular process, user or application, or piece of hardware. I've gotten around the CD-ROM hanging (hdparm -d0 /dev/cdrom works...).

HP's agents only report *some* error and then reboot the system.

So I turn to the forums. Any insight or advice on where to go from here would be greatly appreciated. Hours on hold and googling have worn me out...

Thanks!

-Conrad
5 REPLIES
Steven E. Protter
Exalted Contributor

Re: Proliant DL630, Redhat 7.3 continuous crashes

Your box is certified on that release of Linux.

Based on your results, I'd say there is a major hardware issue.

Probably CPU or motherboard, maybe BIOS.

I'd open a hardware call with HP and make sure you have the server support software for that version of Red Hat.

As a side note, the sendmail configuration in that version of Red Hat is pretty easily defeated by spammers using smtp scripting, so pay attention to sendmail security if this box is exposed to the public Internet.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Mark Grant
Honored Contributor

Re: Proliant DL630, Redhat 7.3 continuous crashes

The mtrrs error looks video-related.

There's an interesting document in the kernel sources, Documentation/mtrr.txt, which might help, or you could have a quick look at this:

http://www.ussg.iu.edu/hypermail/linux/kernel/0301.3/1033.html

Never precede any demonstration with anything more predictive than "watch this"
Olivier Drouin
Trusted Contributor

Re: Proliant DL630, Redhat 7.3 continuous crashes

Yeah... the mtrrs stuff shouldn't be causing the crash, though.
Conrad_4
Occasional Visitor

Re: Proliant DL630, Redhat 7.3 continuous crashes

Thanks for the info. I'm on the phone with HP now, trying to get them to ship me a couple of new procs... ;)
Don_89
Trusted Contributor

Re: Proliant DL630, Redhat 7.3 continuous crashes

Unload the HP management agents if you're running version 7 and see what happens. I've had the exact same problem with servers rebooting for no reason. If you go to the log section of the hpweb agents, you'll probably see some ASR (Automatic System Reboot) events.