DL380 G5 Random Lockups

Aaron P. Perrault
Occasional Contributor

DL380 G5 Random Lockups

Hi all,
I have a situation. I have 3 separate DL380 G5 servers, all running Windows Server 2003 R2 SP2, that are suffering from the same problem: they randomly lock up. If you ping them, they respond, but when you try to log on to them, you get just a gray screen. The only way I have been able to get a system operational again is to hard power-cycle it.

These systems all have the most recent PSP (7.91) and firmware updates. The operating system is fully patched, and the event log shows no problems when a system comes back up.

I rebuilt one of these systems, since I thought it was operating system corruption, but it is happening again. All of these systems were built with their own legal copies of Windows Server 2003 that shipped with the systems.

These are all critical systems: one is Exchange, one is a database server, and the other is my Citrix server. Just from previous experience, I have a feeling it is a driver or firmware problem, but I have no idea where to look.

Has anyone else seen this problem? Any thoughts on fixing it?

Thanks much for the help.

app
10 REPLIES
KarloChacon
Honored Contributor

Re: DL380 G5 Random Lockups

Hi Aaron, weird situation.

So let me just double-check:

"They randomly just lock up. If you ping them, they respond, but when you try and logon to them, you get just a gray screen."

When you say log on, do you mean a remote connection to the server, or locally at the server?

Is there any other scenario in which the server locks up?
So it is the same situation on all 3 servers, right?
Do all of them have the same configuration? I mean the same processors, memory, controllers, ...?

Regards
Didn't your momma teach you to say thanks!
KarloChacon
Honored Contributor

Re: DL380 G5 Random Lockups

Hi again, just in case, check this:

Try disabling TCP Chimney:
netsh int ip set chimney DISABLED

Try disabling TOE (TCP/IP Offload Engine) on the NICs in NCU (the HP Network Configuration Utility), and on the NIC team if you use teaming.

Try disabling RSS (Receive-Side Scaling) on the NICs in NCU, and on the NIC team if the setting is present there.

Regards
Didn't your momma teach you to say thanks!
Brian_Murdoch
Honored Contributor

Re: DL380 G5 Random Lockups

Hi Aaron,

When you have this lockup situation, can you still access the iLO to control the console, and is the console display normal?

Regards,

Brian
Peter Turek
Frequent Advisor

Re: DL380 G5 Random Lockups

What do the event logs say, if anything? Any paged pool or non-paged pool shortages? Do you see the hung gray screen when you log on at the console? Are there any message boxes that might point to an error?

I'd recommend trying Sysinternals Process Monitor and Microsoft Poolmon, and taking a look at the number of handles in use, and at the paged and non-paged pool memory used and available.

There's a known issue with the Remote Registry service leaking handles (unacknowledged by MS for Server 2003 SP2, but I have a bunch of servers with it): if you remotely poll a server frequently with something like "what's up" software, it will take a handle each poll cycle. So you can hit 1,000,000 handles if you poll often enough, and that might hang the server.

It's unlikely to be hardware, but maybe run a memory test or swap some memory. Running HP diagnostics in general wouldn't be a bad idea.
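
To get a feel for how fast that could happen, here is a rough back-of-the-envelope sketch in Python. It assumes exactly one handle leaks per poll and that trouble starts around 1,000,000 handles, as described above; the function name, its parameters, and the sample poll intervals are made up for illustration, and the real leak rate on a given server may well differ.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_until_handle_limit(poll_interval_seconds,
                            handles_leaked_per_poll=1,
                            handle_limit=1_000_000,
                            handles_already_open=0):
    """Estimate days until a steady handle leak reaches handle_limit.

    All parameters are assumptions for illustration: one leaked handle
    per poll and a 1,000,000-handle ceiling come from the post above,
    not from any measured value.
    """
    if poll_interval_seconds <= 0:
        raise ValueError("poll_interval_seconds must be positive")

    polls_per_day = SECONDS_PER_DAY / poll_interval_seconds
    leaked_per_day = polls_per_day * handles_leaked_per_poll
    if leaked_per_day <= 0:
        return float("inf")  # nothing leaks, so the limit is never reached

    remaining = max(handle_limit - handles_already_open, 0)
    return remaining / leaked_per_day


if __name__ == "__main__":
    # Hypothetical poll intervals, in seconds, for a single monitoring source.
    for interval in (1, 5, 60, 300):
        days = days_until_handle_limit(interval)
        print(f"poll every {interval:>3} s -> about {days:.1f} days to the limit")
```

With a single monitor polling once a minute, the estimate comes out in the hundreds of days, so a leak like this would only explain a hang within a couple of days if the polling is much more frequent or several pollers are hitting the same server. Comparing the estimate against the actual handle count of the process hosting Remote Registry (Task Manager or Process Explorer will show it) is the real check.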

Good Luck

Pete
Aaron P. Perrault
Occasional Contributor

Re: DL380 G5 Random Lockups

Hi all,
Sorry it took me so long to get back to you guys. I just applied the recommendations that Karlo put out there, and I will see if that helps. Usually a system will lock up within a couple of days, so I should know by Friday or so whether this made a difference.

I have run the system diagnostics and no errors were reported.

These systems are all similar in that they are DL380 G5s, but that is for the most part where the similarities end. The mail server is a dual-core machine with 2GB of RAM, and the SQL and Citrix servers are both multi-socket quad-core machines with 8GB and 16GB of RAM. They were ordered at different times, about a month apart, but all directly from HP. None of our other G5s are having these problems that I know of, but these are the ones that are getting hit the most.

I am going to keep poking around and see if anything comes up.

I will let y'all know

app
RJPierce
Occasional Visitor

Re: DL380 G5 Random Lockups

I have a DL380 G5 that has been locking up randomly too. All firmware and drivers are current. The networking features (TOE, teaming, RSS) have been disabled. The event logs never show anything. I was once editing a file in Notepad, and when I saved it the server locked up. I'm sure it has something to do with the controller (E200), because it locks up during disk access. Does anyone have anything else to try?
Phil.Howell
Honored Contributor

Re: DL380 G5 Random Lockups

I too would be interested in a solution to this one.
I have 2 DL380 G5 machines running Windows Server 2003 Standard Edition SP2 (5.2.3790) that both show this problem.
I have tried the hotfix in KB942880 on one server, but it hung again after about 30 minutes.
All firmware is up to date, and I will update from the latest PSP next.
This thread mentions disabling ASR in the BIOS and a fix in PSP 8.0:
http://forums12.itrc.hp.com/service/forums/questionanswer.do?threadId=1135440
Phil
Phil.Howell
Honored Contributor

Re: DL380 G5 Random Lockups

additional info...
After disabling the ASR setting and installing PSP 7.91 the problem is still occurring.
Phil
RJPierce
Occasional Visitor

Re: DL380 G5 Random Lockups

We have a total of 6 DIMMs (4 x 512MB, 2 x 1GB) in our server. So far, pulling out the two 1GB DIMMs has solved our lockups.
RJPierce
Occasional Visitor

Re: DL380 G5 Random Lockups

We've eliminated all memory modules as potentially being defective. Now we're replacing drives one at a time after each lockup. We had a problem on a different G5 with a 146GB drive that was blinking differently than its mirror. Forcing out the bad drive solved that problem, but this server has us stumped.