Problems with Linux on Opteron ProLiant-Server

Jürgen Haug_1
Occasional Visitor

Problems with Linux on Opteron ProLiant-Server

We have problems with ProLiant-Servers rebooting every 4 to 6 weeks. There are several DL585-G5 installed and ALL of them do that (but not all at exactly the same time, just every few weeks they decide to reboot). They're running Linux.

HP hasn't been able to help; the case is with second-level support and they have no clue.

I just wonder if anyone out there has heard of this odd behaviour?
7 REPLIES
Steven E. Protter
Exalted Contributor

Re: Problems with Linux on Opteron ProLiant-Server

Shalom,

This is not normal behavior, nor is it acceptable.

All G5 servers have a built-in, ROM-based utility for testing hardware. I would boot the system into that mode and run a complete and thorough hardware check.

Taking a look at the log /var/log/messages after a failure may give you a clue to what is going on.

What I suspect based on experience:
* Power supply problem
* Problem with the memory

The latter problem will definitely show up in the messages file prior to a failure.
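
A rough sketch of how one might skim the messages file for hardware hints after a failure. The keyword list is my own assumption (not an official HP or kernel list), and the function names are made up for illustration:

```python
import re

# Assumed keywords that often accompany memory or hardware trouble in a
# syslog-style file. Illustrative only; adjust to what your kernel logs.
HARDWARE_PATTERNS = re.compile(
    r"\b(machine check|mce|edac|ecc|hardware error|nmi)\b",
    re.IGNORECASE,
)


def find_hardware_hints(lines):
    """Return (line_number, text) pairs for lines mentioning a suspect keyword."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if HARDWARE_PATTERNS.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log(path="/var/log/messages"):
    """Scan a log file; the default path is the one mentioned above."""
    with open(path, errors="replace") as handle:
        return find_hardware_hints(handle)
```

Calling `scan_log()` as root shortly after a reboot and reading the last few hits before the restart is probably the most useful way to use it. An empty result does not rule out a hardware fault.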

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Ivan Krastev
Honored Contributor

Re: Problems with Linux on Opteron ProLiant-Server

Which Linux version is it? Check for patches and updates. Also check whether you have the latest version of the PSP installed.

regards,
ivan
Ragu_3
Trusted Contributor

Re: Problems with Linux on Opteron ProLiant-Server

>> They're running Linux

Which GNU/Linux? Redhat or CentOS or Debian?
Which port of GNU/Linux? Install and make sure that you run the x86_64 port of Redhat/CentOS or the amd64 port of Debian.
Linux is the kernel, which version?
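
If it helps to collect those answers in one go, here is a minimal sketch. The list of release files is an assumption (which one exists depends on the distribution), and the function names are hypothetical:

```python
import platform

# Candidate release files; which one is present depends on the distribution.
RELEASE_FILES = [
    "/etc/os-release",
    "/etc/SuSE-release",
    "/etc/redhat-release",
    "/etc/debian_version",
]


def read_first_release_file(paths=RELEASE_FILES):
    """Return (path, contents) of the first readable release file, or (None, '')."""
    for path in paths:
        try:
            with open(path, errors="replace") as handle:
                return path, handle.read().strip()
        except OSError:
            continue
    return None, ""


def describe_system():
    """Collect the kernel version, the machine type, and the distribution text."""
    path, text = read_first_release_file()
    machine = platform.machine()
    return {
        "kernel": platform.release(),
        "machine": machine,
        # True when the running kernel reports a 64-bit x86 machine type.
        "is_64bit_port": machine in ("x86_64", "amd64"),
        "release_file": path,
        "release_text": text,
    }
```

Printing `describe_system()` on each affected server gives the distribution, port and kernel version in a form that is easy to paste into a reply.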

Debian GNU/Linux for the Enterprise! Ask HP ...
Brendan Peter Murphy
Occasional Advisor

Re: Problems with Linux on Opteron ProLiant-Server

Hi Jurgen,
It would help to have more information!
What Linux version/patch level?
What software is running on these systems?
What ProLiant Support Pack version is installed?
etc.

Anyway, first off, I would recommend making sure that you have the latest firmware installed on your systems, then upgrading to the latest version of the PSP.

If the systems are just rebooting without generating crash dumps then I suspect your problem is with the built-in ASR facility. ASR (Automatic Server Recovery) is a feature that causes the system to restart when a catastrophic operating system error occurs, such as a blue screen, ABEND, or panic. A system fail-safe timer, the ASR timer, starts when the System Management driver, also known as the Health Driver, is loaded. When the operating system is functioning properly, the system periodically resets the timer. However, when the operating system fails, the timer expires and restarts the server. The objective of using ASR is to increase server availability by restarting the server within a specified time after a system hang or failure.
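
To make the timer behaviour concrete, here is a toy model of a watchdog like the one described above. It is only an illustration of the idea, not HP's implementation; the class name, function name and the times used are all made up:

```python
class WatchdogTimer:
    """Toy fail-safe timer: it expires unless it is reset often enough."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_reset = 0.0  # the timer starts when the driver loads (time 0)

    def reset(self, now):
        """Called periodically while the operating system is healthy."""
        self.last_reset = now

    def expired(self, now):
        return now - self.last_reset >= self.timeout_seconds


def simulate(timeout_seconds, reset_times, end_time):
    """Return the time at which the server would be restarted, or None."""
    timer = WatchdogTimer(timeout_seconds)
    for moment in sorted(reset_times):
        if timer.expired(moment):
            # The reset came too late; the restart already happened.
            return timer.last_reset + timeout_seconds
        timer.reset(moment)
    if timer.expired(end_time):
        return timer.last_reset + timeout_seconds
    return None
```

With a 600-second timeout, resets at 0 and 300 followed by silence (an OS hang) lead to a restart at 900, while regular resets every 300 seconds never trigger one. A very long timeout behaves much like disabling the timer, which is why raising it is a quick way to test the theory.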

The quickest way to find out whether the issue is with ASR is either to set the ASR timeout very high or to disable ASR altogether. This can be done through the RBSU. While the server is booting, press F9 to enter RBSU, select "Server Availability", then select ASR Status to toggle ASR between enabled and disabled. Alternatively, select ASR Timeout and increase the timeout value to 30 minutes.

If the problem is with ASR, then the system will continue to function as expected.

If not, then the system will either hang or crash, in which case you'll need to gather crash dump info & forward it through your support channels.

Regards,

Brendan
Jürgen Haug_1
Occasional Visitor

Re: Problems with Linux on Opteron ProLiant-Server

Wow, I didn't expect such quick response :-)

I will try to get the people actually involved in the case to post some more information here.

Thanks!

regards,
Jürgen
Ragu_3
Trusted Contributor

Re: Problems with Linux on Opteron ProLiant-Server

>> Wow, I didn't expect such quick response

And have you marked up points for the posts that you got and found useful?
Debian GNU/Linux for the Enterprise! Ask HP ...
Nicolas Petznick
Occasional Visitor

Re: Problems with Linux on Opteron ProLiant-Server

Hello,

We installed the latest firmware and the latest PSP that were available at the time we opened the call.

We don't think the problem comes from ASR!
We disabled ASR in the Server-Homepage and the server still freezes from time to time.
The only change is that the machine no longer reboots.

We are currently running SLES 9 SP3. Do you think that upgrading to SLES 10 SP2 would solve the freezing problem?

Regards,
Nicolas Petznick
