RHEL 4.6 Server Reboots intermittently 5 to 8 times a day.

SOLVED
Kennedy G. Doss
Regular Advisor

Fellow Linux SAs:

Has anybody had a similar issue with a DL585 G5 server? We have RHEL 4.6 running on this server with 4-5 Oracle databases on it. The server panics intermittently 5 to 6 times a day. We configured diskdump onto one of the local disks. We have 1 terabyte of SAN storage (CLARiiON disks - ATA - 5400 RPM - 300 GB drives). We have tried several versions of the lpfc driver (the server has Emulex HBAs attached). Any clues would be most appreciated. We are also working with Red Hat Support on this, but so far we haven't had a breakthrough.
10 REPLIES
Steven E. Protter
Exalted Contributor

Re: RHEL 4.6 Server Reboots intermittently 5 to 8 times a day.

Shalom,

While the server is up (not in the middle of a reboot), check:
dmesg

/var/log/messages

There are probably clues. This behavior can be caused by software or hardware.
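
One way to look for those clues is to print what was logged just before each boot. This is a minimal sketch, not a definitive tool. It assumes each boot shows up in /var/log/messages as a line matching "syslogd ... restart" (which the stock syslogd normally writes at start-up); the function name and the default of 20 lines are made up for illustration.

```shell
#!/bin/sh
# Print the N lines logged just before each syslogd start-up message,
# i.e. the last things the system managed to write before it went down.
# Assumption: the boot marker matches 'syslogd .*restart'; adjust the
# pattern if your syslog daemon announces itself differently.
context_before_boot() {
    logfile=${1:-/var/log/messages}   # log file to scan
    lines=${2:-20}                    # lines of context before each boot
    grep -B "$lines" 'syslogd .*restart' "$logfile"
}
```

Calling context_before_boot with no arguments scans /var/log/messages. If the lines before every boot marker look like normal activity, with no shutdown messages, that points to a hard reset rather than a clean reboot.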

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Kennedy G. Doss
Regular Advisor

Re: RHEL 4.6 Server Reboots intermittently 5 to 8 times a day.

There is absolutely nothing suspicious reported in dmesg / messages file.
Rob Leadbeater
Honored Contributor

Re: RHEL 4.6 Server Reboots intermittently 5 to 8 times a day.

Hi,

Do you have ASR (Automatic System Recovery) enabled on the server ?

If so, try turning it off.

The iLO and IML logs may also show some useful information as to the cause of the reboots. If they are due to ASR, then turning it off will hopefully allow you to diagnose what is happening at the OS level before the reboot...

Hope this helps,

Regards,

Rob
Kennedy G. Doss
Regular Advisor

Re: RHEL 4.6 Server Reboots intermittently 5 to 8 times a day.

Rob:

Thanks for the useful info. How do I check if it is enabled? Is it some kind of service like "diskdump" or "netdump" ?? Any input related to the files which can give me this information, would be most appreciated.

Regards,
-Kennedy
Rob Leadbeater
Honored Contributor

Re: RHEL 4.6 Server Reboots intermittently 5 to 8 times a day.

Hi,

I think you can do it either via the iLO interface, or via the System Management Homepage.

Cheers,

Rob
Colin Topliss
Esteemed Contributor

Re: RHEL 4.6 Server Reboots intermittently 5 to 8 times a day.

Install the PSP (ProLiant Support Pack). I think 8.25 is the latest (earlier ones are buggy at best).

Once installed (it will require a reboot), try connecting to HPSMH:

https://:2381

Log in with the root username/password.

If you've had an ASR, you should see the 'System' box highlighted (yellow, I think).

To see if ASRs are enabled, click on the 'Autorecovery' link in the 'Recovery' box.

You can also see the iLO and IML logs if you wish.

Regards

Colin
Greg Adair
Occasional Visitor
Solution

Re: RHEL 4.6 Server Reboots intermittently 5 to 8 times a day.

You don't have to install the entire PSP. In fact, I wouldn't recommend it. Some of its packages install drivers, and you'd be better off using the ones that come with RHEL. In the past some of them tainted the kernel as well. The key rpm you need in this case is hp-health. Once it is installed you can execute:

hpasmcli -s 'show server'

You can also run hpasmcli in interactive mode. To get a list of all the options just type "help".
Rick Beldin
Esteemed Contributor

Re: RHEL 4.6 Server Reboots intermittently 5 to 8 times a day.

If you are collecting dumps from this system, then you could run crash on them, provided you have the debuginfo rpm for the kernel you are running. You would then get the stack trace and other information, which would be useful for searching the Red Hat Bugzilla. Oracle RAC systems are more difficult because the RAC software will reboot a node if cluster communications are interrupted for some reason.
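
A rough sketch of opening a dump in crash follows. It assumes the kernel-debuginfo rpm puts its vmlinux under /usr/lib/debug/lib/modules/<release>/, which is the usual layout. The function names are invented for illustration, and the vmcore path is left as an argument because where diskdump saves it depends on your configuration.

```shell
#!/bin/sh
# Build the path to the debug vmlinux installed by kernel-debuginfo.
# Assumption: the rpm uses /usr/lib/debug/lib/modules/<release>/vmlinux.
debug_vmlinux() {
    release=${1:-$(uname -r)}         # default to the running kernel
    echo "/usr/lib/debug/lib/modules/${release}/vmlinux"
}

# Open a saved dump in crash. $1 is the vmcore path, $2 an optional
# kernel release (it must match the kernel that produced the dump).
open_dump() {
    vmcore=$1
    vmlinux=$(debug_vmlinux "$2")
    if [ ! -r "$vmlinux" ]; then
        echo "missing $vmlinux: install the matching kernel-debuginfo rpm" >&2
        return 1
    fi
    crash "$vmlinux" "$vmcore"
}
```

Once inside crash, "bt" prints the backtrace of the task that panicked, "log" dumps the kernel message buffer, and "sys" gives a summary that includes the panic string. Those are the pieces worth quoting when searching Bugzilla.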

With an HP Linux support contract, you could log a call and HP would work with Red Hat to determine root cause.
Necessary questions: Why? What? How? When?
Aryan
Advisor

Re: RHEL 4.6 Server Reboots intermittently 5 to 8 times a day.

hi,

It may be due to a power problem.
Please check /var/adm/ptydaemonlog

Regards,
Aaryan
loco_vikide
Frequent Advisor

Re: RHEL 4.6 Server Reboots intermittently 5 to 8 times a day.

As Greg Adair suggested, disable ASR. The ASR feature resets the server after a critical hardware or software error or fault occurs.

To get current ASR setting, use this command:
/sbin/hpasmcli -s "SHOW ASR"

To disable ASR, use this command:
/sbin/hpasmcli -s "DISABLE ASR"

You have to rely on the HPSIM critical error log, the system messages log and the diskdump core file to determine the root cause.

With ASR disabled (by default it triggers a system reset after 10 minutes), the logs and core dumps have a better chance of capturing the events that lead up to the reset.
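
The two hpasmcli commands above can be wrapped so the setting is shown before and after the change. This is only a sketch: the function names and the HPASMCLI override variable are made up for illustration, and it assumes hpasmcli (from the hp-health rpm) lives at /sbin/hpasmcli as in the commands above.

```shell
#!/bin/sh
# Path to hpasmcli; can be overridden through the environment.
HPASMCLI=${HPASMCLI:-/sbin/hpasmcli}

# Print the current ASR setting.
asr_status() {
    "$HPASMCLI" -s "SHOW ASR"
}

# Disable ASR, printing the setting before and after so the change
# can be confirmed (and noted down for when you turn it back on).
asr_disable() {
    if [ ! -x "$HPASMCLI" ]; then
        echo "$HPASMCLI not found: is hp-health installed?" >&2
        return 1
    fi
    echo "ASR before:"; asr_status
    "$HPASMCLI" -s "DISABLE ASR"
    echo "ASR after:"; asr_status
}
```

Run asr_disable as root. Keep in mind that with ASR off a hung server stays hung until somebody resets it, so this is a diagnostic setting rather than a permanent one.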

Good luck