ProLiant Servers (ML,DL,SL)

Proliant DL380 G5 Locking Up?

 
CTS-Tech
Occasional Advisor

We have an HP Server that keeps locking up. This is an intermittent problem and happens every 1-3 weeks or so. Users can no longer access shared files or RDP into the Server. Sometimes the console is frozen, other times you can enter username & password but it will hang.

This is a Proliant DL380 G5 with 8 GB of RAM and dual Xeon E5440s, running Server 2008 x64. It serves as: AD Server, DNS, DHCP, File Server, Print Server, Backup Exec Server, Hyper-V (Server 2003, 2 GB RAM), Terminal Server and WSUS.

Windows is up to date. I have opened several cases with HP on this issue (they keep creating new ones). HP has had me update the Proliant Support Pack, drivers, BIOS, firmware etc. I have run MemTest and the Proliant Diagnostic Tools offline - all pass. I have run their "HPS" diagnostic report and uploaded that to them - no errors.

There are no events in the Windows Event Logs that indicate a problem - they just stop creating events at the time the Server freezes.
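Since the event logs simply stop at the moment of the freeze, one way to narrow down when each hang begins is a heartbeat file written on a fixed schedule. The following is a minimal Python sketch, not something from HP or Microsoft. It assumes a scheduled task calls `append_heartbeat` once a minute; the function names, the file path and the one-minute interval are all hypothetical.

```python
import os
from datetime import datetime, timedelta


def append_heartbeat(path, now=None):
    """Append one ISO timestamp to the heartbeat file and force it to disk."""
    now = now or datetime.now()
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(now.isoformat(timespec="seconds") + "\n")
        fh.flush()
        os.fsync(fh.fileno())  # make sure the line survives a hard freeze


def load_heartbeats(path):
    """Read the heartbeat file back into a list of datetime objects."""
    with open(path, encoding="utf-8") as fh:
        return [datetime.fromisoformat(line.strip()) for line in fh if line.strip()]


def find_gaps(timestamps, expected=timedelta(minutes=1), tolerance=2.0):
    """Return (last_seen, next_seen) pairs where the heartbeat paused
    for longer than expected * tolerance."""
    gaps = []
    for earlier, later in zip(timestamps, timestamps[1:]):
        if later - earlier > expected * tolerance:
            gaps.append((earlier, later))
    return gaps
```

The first timestamp of each gap is the last moment the OS was still able to write to disk, which can then be compared against the IML, backup jobs and other scheduled activity.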

It doesn't happen on a certain day or time.

I'm going to ask them to start replacing hardware at this point.

Any other ideas?
11 REPLIES
Mark Matthews
Respected Contributor

Re: Proliant DL380 G5 Locking Up?

Do you get anything in the IML?
Could be a power thing, try using only one PSU if there are 2 installed...
Might be a networking problem, have you tried a different port in your switch or a different switch?
If using teaming, try dissolving the team and recreating it from scratch, or don't use a teamed connection at all...


Blazhev_1
Honored Contributor

Re: Proliant DL380 G5 Locking Up?

Can you please attach the HPS report if it is not too big, or post an FTP link if you have already uploaded it somewhere? It could be anything; I'm not sure why you are focusing on the hardware.
CTS-Tech
Occasional Advisor

Re: Proliant DL380 G5 Locking Up?

The only thing in the IML is a POST error for the keyboard not being detected because we use a KVM.

There are 2 Power Supplies hooked up to 2 Smart-UPS 2200s, but we want to stay redundant.

Teaming is enabled - guess we could try dissolving it but we wanted redundancy there too.

HPS report is over 100 MB. The reason I am leaning towards hardware is we can't find anything software related - no apparent event log errors, HP says the HPS report has no errors, no virus scans or other scheduled software events etc.

It feels like a memory error issue.
Salvatore Guadagno
New Member

Re: Proliant DL380 G5 Locking Up?

Hi, I have the same problem on two different servers: a DL380 G5 and a G6. The OS is SUSE 10 SP2. The server suddenly hangs; ping still works, but ssh and telnet don't.
The only way to get the server running again after a hang is to power it off!
There is nothing in the SUSE logs (/var/log); the logs stop just before the server hangs.
The HP guys have updated the BIOS and the firmware, but I don't think that is enough to resolve the problem!
wobbe
Respected Contributor

Re: Proliant DL380 G5 Locking Up?

CTS-Tech
Occasional Advisor

Re: Proliant DL380 G5 Locking Up?

Windows Updates are set to download but not install.

RAM was originally in slots 1-4; changed to 1,3,5,7 so Advanced ECC would work - no change.

AV is AVG SBS Edition 8.5, scan runs Sunday at midnight.

Also, the last entries in the Application Event Log are:

Warning: WMI Event ID 5612 - Windows Management Instrumentation has stopped WMIPRVSE.EXE because a quota reached a warning value. Quota: HandleCount Value: 4099 Maximum value: 4096 WMIPRVSE PID: 4908

Error: WMI Event ID 10 - Event filter with query "select * from __instancemodificationevent within 30 where targetinstance isa 'Win32_PerfFormattedData_PerfDisk_LogicalDisk' and targetinstance.PercentFreeSpace < 1 and targetinstance.Name != '_Total'" could not be reactivated in namespace "//./root/CIMV2" because of error 0x800706be. Events cannot be delivered through this filter until the problem is corrected.

I applied Microsoft Hotfix KB954563 and rebooted the Server last night: http://support.microsoft.com/kb/954563.

This warning and error have occurred several times without locking up the Server however...
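To see how often the handle quota is being hit, and whether it lines up with the lockups, the 5612 message text can be pulled apart and the values tracked over time. Here is a minimal Python sketch that parses the message exactly as quoted above. The function names and the 80% early-warning threshold are assumptions for illustration, not part of any Microsoft tool.

```python
import re

# Matches the tail of the Event ID 5612 text as it appears in this thread.
QUOTA_MSG = re.compile(
    r"Quota:\s*(?P<quota>\w+)\s+"
    r"Value:\s*(?P<value>\d+)\s+"
    r"Maximum value:\s*(?P<maximum>\d+)\s+"
    r"WMIPRVSE PID:\s*(?P<pid>\d+)"
)

SAMPLE_MESSAGE = (
    "Windows Management Instrumentation has stopped WMIPRVSE.EXE because a "
    "quota reached a warning value. Quota: HandleCount Value: 4099 "
    "Maximum value: 4096 WMIPRVSE PID: 4908"
)


def parse_quota_warning(message):
    """Return the quota name, value, maximum and PID, or None if no match."""
    m = QUOTA_MSG.search(message)
    if m is None:
        return None
    return {
        "quota": m.group("quota"),
        "value": int(m.group("value")),
        "maximum": int(m.group("maximum")),
        "pid": int(m.group("pid")),
    }


def near_quota(value, maximum, fraction=0.8):
    """True once a counter reaches the chosen fraction of its quota.
    The 0.8 default is an arbitrary early-warning level (assumption)."""
    return value >= maximum * fraction
```

Feeding exported event text through `parse_quota_warning` gives a list of timestamps and handle counts that can be set next to the dates of the freezes.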
CTS-Tech
Occasional Advisor

Re: Proliant DL380 G5 Locking Up?

Drive configuration is 3 SAS drives in a RAID5 with a hot spare - slots 1-4.

Did a malware scan with MalwareBytes - none found.

Set up iLO - firmware was 2.0 and is now updated to 2.01.

Yes, this is Server 2008 Standard (not R2). I thought it was on SP2 but it is on SP1. I will install SP2 and then look at the Hyper-V hotfixes.


We have Level Platforms Onsite Manager installed on the Server to help determine what is going on. We received an alert that the Server was not communicating.

Saturday around 2:37am the Server rebooted itself - looks like a BSOD. Nothing in the Windows Event Logs - they just stop when it happened and start back up as the Server comes back online, with an entry saying the shutdown at 2:37am was unexpected.

The HP IML log has 3 errors:
Critical - 2:38am - PCI Bus Error (Slot 0, Bus 0, Device 0, Function 0)
Critical - 2:48am - ASR Detected by System ROM
Caution - 2:49am - POST Error: M
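To line the IML entries up against the unexpected shutdown that Windows recorded at 2:37am, the entries can be parsed and expressed as minutes after that reference time. This is a small Python sketch that assumes the "Severity - time - text" layout shown above; it is not an HP tool, the names are made up for illustration, and the last entry is kept truncated exactly as it was posted.

```python
import re
from datetime import datetime

IML_LINE = re.compile(
    r"^(?P<severity>\w+) - (?P<time>\d{1,2}:\d{2}[ap]m) - (?P<text>.+)$"
)

IML_LINES = [
    "Critical - 2:38am - PCI Bus Error (Slot 0, Bus 0, Device 0, Function 0)",
    "Critical - 2:48am - ASR Detected by System ROM",
    "Caution - 2:49am - POST Error: M",  # truncated in the original post
]


def parse_iml(lines):
    """Turn 'Severity - time - text' lines into (severity, time, text) tuples."""
    entries = []
    for line in lines:
        m = IML_LINE.match(line.strip())
        if m:
            when = datetime.strptime(m.group("time"), "%I:%M%p")
            entries.append((m.group("severity"), when, m.group("text")))
    return entries


def minutes_after(reference, entries):
    """Express each entry as whole minutes after a reference time like '2:37am'.
    Only the time of day is compared; all entries are assumed to be same-day."""
    ref = datetime.strptime(reference, "%I:%M%p")
    return [
        (severity, int((when - ref).total_seconds() // 60), text)
        for severity, when, text in entries
    ]
```

Calling `minutes_after("2:37am", parse_iml(IML_LINES))` shows how soon after the shutdown each entry was logged.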
CTS-Tech
Occasional Advisor

Re: Proliant DL380 G5 Locking Up?

HP thinks it's an issue with the Memory Controller. They sent out a Tech and installed a new motherboard. We updated the firmware and will wait to see if the issue happens again...
bigiron
New Member

Re: Proliant DL380 G5 Locking Up?

We have the same problem with one DL380 G5 Product ID# 459584-005.

The server locks up every other day. I've installed different OSes:
* Server 2003 R2 (32 & 64 bit)
* Server 2008 R2 (64 bit)
* ESXi 4.1

All hang every other day. HP replaced the drive backplane and motherboard but that didn't fix the problem.

I'm running memtest86+ right now to verify the RAM, but if it hangs then I can't really tell whether it's bad RAM or not. iLO and the OS are not providing any useful logs.

Right now it's either the RAM, CPU, hard drive, or RAID controller that is causing the lockup, but HP really can't provide any useful troubleshooting tool.

HP also asked me to use an HP boot CD (I think from the SmartStart CD) to run diagnostics and generate a report, but whatever we got from that didn't help because we're still having the same problem.