Intermittent Server Hangs - DL360

Robert W. Kramer III
Occasional Visitor

I have many DL360 servers. Some run Linux, some run W2K. Recently I had a DL360 running Linux start giving me kernel panics. I suspected a drive problem, but wasn't sure. Rather than spend much time on it, I recreated the Linux server on a different DL360.

Now I am trying to determine what is wrong with this bad DL360 that randomly fails. Sometimes it runs for days - sometimes for only minutes and other times it will not boot at all.

Most of the time when it fails I get a W2K blue screen with stop error 0x0000001E (0xC0000006...), although on rare occasions the stop error has been different.

I've followed some threads regarding the memory.dmp file, but the disk drive usually has an amber drive array light - so no memory.dmp is written.

So far, I have:
1) Installed NEW RAM.
2) Installed a NEW Integrated Smart Array Controller.
3) Tried other hard drives, but even this one will work fine in another of my DL360s.
4) Swapped the drive to the 2nd bay.
5) Upgraded the P21 System BIOS to the 11/15/2002 release.
6) Upgraded the Smart Array Controller firmware from 1.42 to 1.50.
7) Replaced the CD/Floppy module.
8) Removed the 866MHz CPUs one at a time.

None of these made any difference. Temperature is not an issue.

I've run Compaq Diagnostics. All tests pass with flying colors. If I run it continuously the problem will eventually happen again, but the server generally reboots without providing any kind of error information.

Sometimes right after W2K startup I will get "Unknown Hard Error" dialog boxes.

I've tried 3 different drives in this box. All do the same thing, so I'm just not convinced it is a drive problem.

What can I do to test/troubleshoot this further?

Bob Kramer
22 REPLIES
Steven Clementi
Honored Contributor

Re: Intermittent Server Hangs - DL360

The system board?

The VRMs?


Have you checked the Integrated Management Log (IML) for errors?



Steven
Steven Clementi
HP Master ASE, Storage and Clustering
MCSE (NT 4.0, W2K, W2K3)
VCP (ESX2, Vi3, vSphere4, vSphere5)
RHCE
NPP3 (Nutanix Platform Professional)
Robert W. Kramer III
Occasional Visitor

Re: Intermittent Server Hangs - DL360

Blue Screen Trap (BugCheck, STOP: 0x0000001E (0xC0000006, 0x5FFC0EBC, 0x00000000, 0x5FFC0EBC)) - Operating System

ASR Detected by System ROM

POST Error: 1779-Drive Array Controller Detects Replacement Drives


There are other variations of these STOP errors. 3 or 4 maybe. Unfortunately I've been clearing IML frequently after replacing a piece of hardware. Most of the STOP errors are 0x0000001E/0xC0000006. Some of them say Drive Array Device Failure.
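
For anyone reading the IML entry above, here is a minimal sketch of pulling the STOP code and its parameters out of that line. It assumes the line format is exactly as pasted above; the function names `decode_stop` and `describe` are made up for illustration, and the lookup tables only hold the two values seen in this thread. On Windows, bug check 0x1E is KMODE_EXCEPTION_NOT_HANDLED and its first parameter is the exception code; 0xC0000006 is STATUS_IN_PAGE_ERROR, which generally means a page could not be read in from disk. That would fit the drive array failures, though it does not say which component is at fault.

```python
import re

# Well-known Windows values; only the two that appear in this thread are listed.
BUGCHECK_NAMES = {0x0000001E: "KMODE_EXCEPTION_NOT_HANDLED"}
NTSTATUS_NAMES = {0xC0000006: "STATUS_IN_PAGE_ERROR"}

# Matches "STOP: 0x... (p1, p2, p3, p4)" as it appears in the pasted IML entry.
STOP_RE = re.compile(r"STOP:\s*(0x[0-9A-Fa-f]+)\s*\(([^)]*)\)")


def decode_stop(line):
    """Return (bugcheck_code, bugcheck_name, params) parsed from an IML-style STOP line."""
    match = STOP_RE.search(line)
    if match is None:
        raise ValueError("no STOP code found")
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",") if p.strip()]
    return code, BUGCHECK_NAMES.get(code, "unknown"), params


def describe(line):
    """Build a one-line human-readable summary of the STOP entry."""
    code, name, params = decode_stop(line)
    text = f"bug check 0x{code:08X} ({name})"
    # For bug check 0x1E, the first parameter is the exception code (an NTSTATUS value).
    if code == 0x1E and params:
        status = params[0]
        text += f", exception 0x{status:08X} ({NTSTATUS_NAMES.get(status, 'unknown')})"
    return text


iml_line = ("Blue Screen Trap (BugCheck, STOP: 0x0000001E "
            "(0xC0000006, 0x5FFC0EBC, 0x00000000, 0x5FFC0EBC)) - Operating System")
print(describe(iml_line))
```

Running the same function over each saved IML entry before clearing the log would make it easier to see whether the exception code is always the in-page error or whether it varies.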

I know the drive/controller is failing... but the problem does not stay with the drive or the controller. Replacing them both with NEW parts doesn't stop the problem... :S

What is a VRM? Also, when you say System Board, do you mean the main motherboard?

Thanks.

Bob Kramer
Steven Clementi
Honored Contributor

Re: Intermittent Server Hangs - DL360

VRM = Voltage Regulator Module. Pretty sure the 360 has them, usually right next to the CPUs. Is this a 360 G2 or G3?

When I say Systemboard... Yes, I mean the motherboard. However unlikely it might be, it is one of the things you have not swapped out. It would/should be the "last resort" of course, but still a possible suspect.


Steven
Robert W. Kramer III
Occasional Visitor

Re: Intermittent Server Hangs - DL360

This is just a DL360. Dual 866MHz. 512MB. 18.2GB Ultra Wide SCSI.
Jonathan Bonney
Occasional Visitor

Re: Intermittent Server Hangs - DL360

Bob, I am very interested in this. I have a DL360 of the same series, and I have started to see the same problems. It took me a week to install Windows Server 2003 on this machine. After making the changes below I was able to finish the install, but I still have problems.

1. Upgraded the BIOS to the 11/15/2002 release.
2. Upgraded the firmware on the Array Controller to 1.50.
3. Swapped the drives to other servers.
4. Installed a new SCSI Backplane.
5. Installed a new Power Supply.
6. Installed a new SPS-Filter.
7. Installed a new 16MB RAID Card.
8. Installed a new System Board.
9. Replaced one SCSI Drive.

Now I see random shutdowns, sometimes hours apart and sometimes days apart. Since 12/7 I have seen 8 Drive Array Device Failure errors.
I have also run the HP/Compaq diagnostic tools multiple times and seen no errors. I used a burn-in program to throw a continuous load at the processors and it ran literally all day long with no adverse results, but when I tried to test read/write to the drives, the server locked up.
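
A drive read/write test of the kind described above can be approximated with a short script. The following is only a sketch, not the burn-in program actually used: it writes random data to a file, forces it to disk, reads it back, and compares SHA-256 digests. The names `write_verify_pass` and `stress` and the default sizes are assumptions chosen for illustration. Note that the read-back may be served from the operating system's cache rather than the drive itself, so a clean result does not prove the disk path is healthy.

```python
import hashlib
import os
import tempfile


def write_verify_pass(path, size_bytes, chunk_bytes=1 << 20):
    """Write random data to path, flush it to disk, read it back, and compare digests."""
    written = hashlib.sha256()
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            chunk = os.urandom(min(chunk_bytes, remaining))
            f.write(chunk)
            written.update(chunk)
            remaining -= len(chunk)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out to the device
    read_back = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_bytes), b""):
            read_back.update(chunk)
    return written.digest() == read_back.digest()


def stress(directory, passes=3, size_bytes=8 << 20):
    """Run several write/verify passes; return how many passes failed verification."""
    failures = 0
    for i in range(passes):
        path = os.path.join(directory, f"stress_{i}.bin")
        try:
            if not write_verify_pass(path, size_bytes):
                failures += 1
        finally:
            # Clean up the test file even if the pass raised an I/O error.
            if os.path.exists(path):
                os.remove(path)
    return failures


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print("failed passes:", stress(tmp))
```

Pointing `stress` at a directory on the suspect array and raising `passes` and `size_bytes` keeps the controller busy for longer; an I/O error or a lockup during the run is as informative as a digest mismatch.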

This is a non-production server, but I'm told that I need to have it ready by the end of the year. Any info that you get on this would be appreciated.
Fred Armantrout
Occasional Visitor

Re: Intermittent Server Hangs - DL360

I have had similar problems with two DL360s. Both were running Microsoft 2000/2003 or SuSE Linux. The system gives no warning other than that the internal drives start failing, first one and then both. Reboot and it will usually boot back up and possibly start rebuilding one drive from the other. The more I tinker with it, the faster it fails. I had both drives fail almost at once and the console screen went blank, but the licensing (FlexLM) was still responding slowly.

Both systems are single-CPU 866s with 512MB RAM and 9 or 18 GB drives.

I have done about the same as you. Swapped out the power supply, which I can do in about 2 minutes. Replaced one of the failed drives. Replaced the integrated controller card. I even PULLED the integrated card completely, so that the system just sees the attached disks as drives on a SCSI channel, and it still acted up under Linux. I did not try it under Windows. All the BIOS and firmware versions are up to date.

I don't have a spare CPU regulator to swap out. Time to check for a spare part online. I have two systems about to head for the scrap pile and half a rack of them that I am now leery about. All were installed around December 2000.
Robert W. Kramer III
Occasional Visitor

Re: Intermittent Server Hangs - DL360

I finally broke down and bought 3 new motherboards.

The old memory, integrated controller, hard drives, CPUs, etc. were all used with the new motherboards.

These boxes have not failed since the new MBs were installed.

While it 'fixed' the problems I was having I am not convinced that the old motherboards are defective.

I just don't know what the problem was/is, but it really bothers me because I'm sure these 3 problematic motherboards are basically OK. I suspect that they have some sort of conflict with drivers, firmware, etc., but I just can't put my finger on it.

Also, on the newly installed motherboards I upgraded the firmware and drivers in the same way as I did on the ones that began failing.

The problematic motherboards always reported "Unknown Hardware Error" or some other problem related to disk controller or hard drive failure. Yet I am using the same hard drives and disk controllers in these new motherboards and there is no problem now for weeks.

*sigh*

I wish I had an answer for you...

Bob Kramer
CIS Internet Services
Jonathan Bonney
Occasional Visitor

Re: Intermittent Server Hangs - DL360

I was lucky enough to have another server of the same model come available. I ended up swapping the drives out. The old drives are working great in the "new" server.
I still have the problem with the previous one though...and I'm supposed to have 15 more on the way. Kind of makes me nervous. I am still interested in a fix; I have to get the first one back into production.
Tom Parker_1
Advisor

Re: Intermittent Server Hangs - DL360

Bob,

I have had 4 servers with the same situation in the last two months. Did you ever get to the bottom of this issue? I have not yet gotten to the system board/motherboard, but that is what HP is recommending because I have tried everything else. I am nervous as hell because I have 20 of these things running in production and now I feel like I am just sitting here waiting for them all to fail.

I have one theory however. What about the battery? All these systems are about the same age. Besides disk drives, what typically wears out? Batteries.

HP seems in the dark about this situation, but I think that is because the systems are so old and out of warranty that they don't care. Anyone got any ideas on how to get HP to get serious about looking at this situation?
Mpett1
Occasional Advisor

Re: Intermittent Server Hangs - DL360

Yes, buy a service contract!!! That is how you will get the support you are asking for. Also, I think the issue is with the storage driver. Upgrade to the latest version.

Later
Tom Parker_1
Advisor

Re: Intermittent Server Hangs - DL360

Much cheaper to replace the motherboard on older systems than to buy a service contract. Motherboard replacement resolves this issue. Seems to be a HUGE defect in this model, since I have now replaced 6 of them.

LATER MUCH!
Mpett1
Occasional Advisor

Re: Intermittent Server Hangs - DL360

That's not what you asked!! You asked what it would take to get HP's attention on this issue, and the answer is to buy a service contract!! The servers are way out of warranty and that would be the only way. You did do the correct thing and get some of your own parts for replacement. That is what I would have done...
Jon Rowlan_1
Occasional Visitor

Re: Intermittent Server Hangs - DL360

If anyone at Compaq/HP cares...

I used to buy Compaq because of reliability and longevity.

Until I came across the DL360.

I have had 4 of these now ... replaced all components except the motherboard. (You could be justified in asking why I didn't stop at the second or third, I suppose :-) )

There is an inherent fault/design flaw somewhere, and short of buying a very expensive service contract (where you will get provided with a working replacement) you will not get a sensible reply from anyone at HP.

After two failed in mission-critical environments I chucked them all under the workbench.

So if anyone is after new component parts for these ... I have loads except of course the motherboards which are (to put it politely) expired.

This problem is not limited to one model of motherboard. I have had three different versions, and 3 different CPU speed models at that!

The ONLY fix is to replace the motherboard, and HP may well have a fixed version. If not, then expect the same problems again later on.

jON

Re: Intermittent Server Hangs - DL360

I am having a similar issue. One of our DL360s would lose its drives intermittently. Reboot and they would occasionally come back online. Now, the machine will NOT spin up the drives. I've tested with other good drives, and they won't spin up either. They get power but never seem to receive a start command.

Have you seen this?
Tom Parker_1
Advisor

Re: Intermittent Server Hangs - DL360

This is the classic symptom. Replace your motherboard and the problem will magically go away (at least until the new motherboard starts acting up).

Re: Intermittent Server Hangs - DL360

Thanks, I found some other references regarding "newer" DL360 main boards... how can I tell the "rev" number and whether or not a newer rev will be better? Also, any good luck with any suppliers of these boards?

Thanks.
Jon Rowlan_1
Occasional Visitor

Re: Intermittent Server Hangs - DL360

Now details of the revision of a fixed motherboard would be good!

At least then I could be sure that when I buy a board it's going to work.

Maybe I can use all those old DL360s I have.

An official word from HP on this would help .... ???

jON
Tom Parker_1
Advisor

Re: Intermittent Server Hangs - DL360

I have had good luck buying used motherboards from serverworlds.com. I don't know anything about revision numbers. I have talked to HP about this on three occasions and they claim there is no problem (ha ha). As someone stated earlier in the thread, if you had a service contract they would probably take it more seriously, but who has a service contract on such an old machine?
Jay23
Occasional Advisor

Re: Intermittent Server Hangs - DL360

Has anyone come up with an official solution to this? I now have two 360s that are behaving in exactly this way. If it is the MB, does anyone know a good place to get replacements?

Thanks
Tom Parker_1
Advisor

Re: Intermittent Server Hangs - DL360

serverworlds.com has sold me a bunch of refurb motherboards that have worked well.

Re: Intermittent Server Hangs - DL360

Yep... now I have two dead ones. Performance-wise, I'm not sure these are worth investing any money in, so they may end up in the trash bin. I have never had any major failure issues with my ProLiant series servers, with the exception of the DL360s. Not sure I really want these in my environment even with "refurb" boards... I'd be nervously awaiting a failure again down the road.

Tom Parker_1
Advisor

Re: Intermittent Server Hangs - DL360

Since these machines are so old, I would not recommend replacing motherboards for a "production" environment. We did this as a quick fix last year and really haven't had a problem, but we are in the process of moving these servers from production to development.

We did replace 6 motherboards and I believe that none of them failed again.