ProLiant Servers (ML,DL,SL)

ML370G3 Hanging Problem

Caster Troy
Regular Advisor

ML370G3 Hanging Problem

hi all
I am using a ProLiant ML370 G3, and it has been misbehaving for almost a week. The server hangs about once a day, even with no extra load and nothing new installed. Is there a diagnostic tool I can run to find a possible cause? There is no error message in the Windows 2000 Advanced Server event log.
kindly help
regards
javed khalid
Evil Has Its Winning Ways
18 REPLIES
Rob Pursley
Occasional Visitor

Re: ML370G3 Hanging Problem

I'm having similar issues, unfortunately I don't know the cause of it either. I have a new ML370 G3 with Windows 2003 server installed and periodically it just blue-screens on me for no apparent reason; nothing in the log files nor anything relevant on the error message, just: STOP: c000021a Unknown Hard Error.
Toby Curnew
Occasional Visitor

Re: ML370G3 Hanging Problem

I am using an ML370 G1 with the same issue. It started two days ago: everything stops responding. The only way to get the server to boot is to clear the NVRAM. Any ideas?
Terry Hutchings
Honored Contributor

Re: ML370G3 Hanging Problem

The SmartStart CD that shipped with the server includes the server diagnostics. If you boot from the CD and then select server diagnostics on the maintenance tab, it should be able to test the hardware.
The truth is out there, but I forgot the URL..
Simon Black
Occasional Advisor

Re: ML370G3 Hanging Problem

Hi
We are also using Proliant ML370G3, with Red Hat Enterprise Linux 3 (x86).
(The Linux Kernel is 2.4.21-15.0.3.ELsmp on an i686)
The server has been hanging repeatedly, sometimes after 24 hours, other times after a few days.
As Javed and Rob have found, no error messages or log entries relevant to the hang are recorded.
We have upgraded the ProLiant Support Pack and the Online Flash Component for Linux, with no joy.
I've now seen Terry's message and have run the Server Diagnostic tests which all passed.
Any further suggestions would be appreciated.
Regards
Simon Black
Caster Troy
Regular Advisor

Re: ML370G3 Hanging Problem

dear all
The ML370 G3 I am using is connected to 1230GB of storage, configured as a single partition with a single drive letter. Could the server be hanging because of the large logical volume, and should I split it into 3 or 4 partitions? I have also run the latest diagnostic tools, but no errors were found.
kindly help
javed khalid
Evil Has Its Winning Ways
Antonio Luiz de Oliveir
Frequent Advisor

Re: ML370G3 Hanging Problem

Do any of you use Insight Manager?
Simon Black
Occasional Advisor

Re: ML370G3 Hanging Problem

We have used Insight Management and Integrated Lights Out on the servers.
However once the machine hangs we are unable to connect with these and have to physically reboot them.
Once rebooted, the Integrated Management Log has not recorded anything relevant to the hang.

Rob Pursley
Occasional Visitor

Re: ML370G3 Hanging Problem

Yep, mine as well. IM doesn't say a thing about what's been going on.
JohnWRuffo
Honored Contributor

Re: ML370G3 Hanging Problem

Have you all installed the latest BIOS on these servers? There was a processor and thermal fix to improve the reliability of the ML370G3 a while back:
http://h18000.www1.hp.com/support/files/server/us/download/21517.html

Let us know what you find. -john
Rob Pursley
Occasional Visitor

Re: ML370G3 Hanging Problem

Yep, got all the latest updates including the bios and drivers for all the bits.
Simon Black
Occasional Advisor

Re: ML370G3 Hanging Problem

I have updated the BIOS.
Server has now been running for 24 hours.
I need to wait until it next hangs.
Simon Black
Occasional Advisor

Re: ML370G3 Hanging Problem

Our server has been up for 7 days without 'hanging'.

However, in response to a support request, HP have suggested clearing the CMOS on the server:
"Let clear cmos on the server.
Clearing CMOS
Down the server to clear nvram.
Access the cover when the server is powered down. See the cover for the location of the maintenance switch. Switch #6 on the system maintenance switch from a default position of off to on.
Then apply power to the server.
When you see video, then turn the server off.
Place the switch # 6 back to the default position and apply power. Select F9 at post and then save to reconfigure with the correct controller order and exit. "

We have done this today and rebooted.
If this has any impact, I'll update this thread.
Rob Pursley
Occasional Visitor

Re: ML370G3 Hanging Problem

Good news, at least for my machine. I called HP support last week and they said the problem was the backplane for the power supply. According to the tech, the old ones didn't have enough juice to handle the redundant fans that were installed. I dropped in the new one and fired up the server, and it hasn't had a single issue for almost a week now.

Good call HP; thanks!
Simon Black
Occasional Advisor

Re: ML370G3 Hanging Problem

Hi
One of our ML370s hung last Monday having been up for 32 days.
We updated both servers with RedHat up2date, latest version of PSP and BIOS last Thursday.
They then both hung during the weekend.
We have now been advised to disable Hyperthreading on the processors, so this has been done and the machines rebooted.

Edmund White
Frequent Advisor

Re: ML370G3 Hanging Problem

Which RAID controllers are you running on? The Smart Array 641 had a similar problem for a while until a firmware upgrade came out.
Simon Black
Occasional Advisor

Re: ML370G3 Hanging Problem

Hi Edmund
Yes our ML370s are running on Smart Array 641 Raid Controllers.
How do we tell which firmware version is running?
What version should we upgrade to and where can it be downloaded?
Incidentally, our two ML530s installed at the same time, with the same versions of Red Hat Linux, PSP etc have Smart Array 6402 Raid Controllers and these two servers have never exhibited the 'hanging' problem.
Edmund White
Frequent Advisor

Re: ML370G3 Hanging Problem

Right. The problem was related to the SA640, not the 6400 series. What version of Linux are you running on? My reply to a similar issue can be found at:

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=645078

->snip<-
I bet it's the firmware on your Smart Array 641 controller....

I recently experienced a problem with the SA641 controller on ML350 and ML370 servers that caused the system load to rise very rapidly (> 40), halting most network services. It appeared as though the controller would shut down and that processes that depended upon disk I/O would go into STAT D (uninterruptible sleep), forcing the load up by one unit per process. Programs loaded into memory (the kernel, top, etc.) were unaffected. This always occurred after 3-7 days of uptime (usually when physical memory was cached and swapping occurred).

This problem was fixed by replacing the 641 with a 6400 or 5300 series controller... OR downgrading the firmware (to the last revision from 2003) on the 641. The new firmware on the 641 was just released last week, and seems to have corrected the issue.

I spoke with several HP techs, as I have about 100 systems around the country to support. They told me to simply stop selling that RAID controller until they released a new firmware. Messy. All of my systems are RedHat 8.0, run the 6.40 hpasm and cmastor drivers and use 5300, 6400 or 64x series RAID controllers with custom vanilla 2.4.21 or 2.4.26 kernels. I experienced this in a repeatable fashion on a new ML350, but a coworker had the same issues with RHEL 3.0 on a 641-equipped ML370 and the 7.0 agents.

The bad firmware is the March 2004 Smart Array 1.92A. The good ones seem to be 2.26B or 1.30.
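If you have a lot of machines to check, the version list above is easy to turn into a lookup. This is only a rough sketch: the function names are made up for illustration, and it assumes the version may be reported with or without the trailing letter (1.92 vs 1.92A), so the letter is stripped before comparing. It does not read the version from the controller; you still have to get that string yourself.

```python
# Versions taken from the post above; everything else here is hypothetical.
KNOWN_BAD = {"1.92"}              # "1.92A", March 2004
REPORTED_GOOD = {"2.26", "1.30"}  # "2.26B" and "1.30"


def normalize(version):
    """Strip whitespace and any trailing letter suffix: '1.92A' -> '1.92'."""
    cleaned = version.strip().upper()
    return cleaned.rstrip("ABCDEFGHIJKLMNOPQRSTUVWXYZ")


def classify(version):
    """Return 'known-bad', 'reported-good' or 'unknown' for a firmware string."""
    base = normalize(version)
    if base in KNOWN_BAD:
        return "known-bad"
    if base in REPORTED_GOOD:
        return "reported-good"
    return "unknown"
```

Anything that comes back as "unknown" just means it is not one of the three versions mentioned here, not that it is safe.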

http://h18000.www1.hp.com/support/files/server/us/download/21214.html

So in this case, try downloading the new firmware for the 640 and see if it stops the crashes. To test, you may want to leave a console running top open on the server and watch the load rise. Most services will stop responding after the load hits 40+.
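If you would rather log this than sit watching top, something along these lines could do it. It is a minimal sketch, assuming a Linux box with /proc mounted and a Python 3 interpreter available; the function names and the 60-second interval are my own choices, and the threshold of 40 is just the load figure mentioned above. It prints the 1-minute load average and the number of processes in state D (uninterruptible sleep).

```python
import os
import time

LOAD_THRESHOLD = 40.0  # load at which services reportedly stopped responding


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_state(stat_text):
    """Return the one-letter state from /proc/<pid>/stat text.

    The command name sits in parentheses and may contain spaces or
    parentheses itself, so split after the last ')'.
    """
    return stat_text.rsplit(")", 1)[1].split()[0]


def count_d_state(proc_root="/proc"):
    """Count processes currently in uninterruptible sleep (state D)."""
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                if parse_state(handle.read()) == "D":
                    count += 1
        except (OSError, IndexError):
            continue  # process exited while we were reading it
    return count


def snapshot(proc_root="/proc"):
    """Return (1-minute load average, number of D-state processes)."""
    with open(os.path.join(proc_root, "loadavg")) as handle:
        one, _, _ = parse_loadavg(handle.read())
    return one, count_d_state(proc_root)


def watch(interval=60, iterations=None, proc_root="/proc"):
    """Print one line per interval; runs forever when iterations is None."""
    done = 0
    while iterations is None or done < iterations:
        load, blocked = snapshot(proc_root)
        flag = "  <-- load at or above threshold" if load >= LOAD_THRESHOLD else ""
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print("%s load=%.2f d_state=%d%s" % (stamp, load, blocked, flag))
        done += 1
        if iterations is None or done < iterations:
            time.sleep(interval)
```

Call watch() from a console session and leave it running. Bear in mind that if the output is redirected to a file on the affected array, the logger itself may block once disk I/O stalls, so a console or a remote terminal is probably the safer place for the output.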
Simon Black
Occasional Advisor

Re: ML370G3 Hanging Problem

Hi Edmund
This is looking promising.
The Smart Array is currently running firmware 1.92.
I'll install 2.26B and see if it sorts it out.
Thanks for your help.
Simon