Recurring issues with various DL380 G4 or G3

 
Tim Nolan
Occasional Visitor

In the past two weeks, three of our servers have developed issues.

1. DL380 G3, 5i controller, Novell 5.1; disks 0 and 1 in RAID 1 (OS mirror), disks 2, 3 and 4 in RAID 5 (apps). Q1529A tape drive. Drive 0 showed bad (red icon on drive). We replaced it with a good drive; it tried to rebuild and failed (red icon on drive). Tried another drive, same result. Replaced the controller; now drives 4 and 0 have the red error icon. The system tries to boot but can't. Rebuilt the server from scratch.

2. DL380 G4, 5i controller, Windows 2003 Server; disks 0 and 1 in RAID 1 (OS mirror), disks 2, 3 and 4 in RAID 5 (apps). Q1529A tape drive. Drive 0 showed bad. We replaced it with a good drive; it tried to rebuild and failed (red icon on drive). Tried another drive, same result. Replaced the controller and put back the original drive 0: fail. Tried a new drive 0, and the system is running.

3. DL380 G4, 5i controller, Windows 2003 Server; disks 0 and 1 in RAID 1 (OS mirror), disks 2, 3 and 4 in RAID 5 (apps). Q1529A tape drive. Drives 0 and 4 showed bad (red icon on drive). We replaced drive 4 with a good drive: immediate BSOD. Rebooted the server; the system can't find ntoskrnl.exe and hangs. Took out drive 0 and replaced it with a good drive, same issue.

So, we have multiple machines with similar if not identical configurations all dying on us in what seems to be the same way.

Our intention with the drive 0 and 1 mirror was that if either 0 or 1 goes bad, the machines could still boot. Well, in our case, if 0 dies, we are SOL. Are we misconfiguring our servers so that there is no disk redundancy? What is the best way to ensure that if a drive goes bad (especially a boot drive) we can still work? Am I experiencing bad drives or bad controllers?
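
To make the redundancy question concrete, here is a rough sketch in Python of what this layout should tolerate in principle. It assumes only the textbook behavior of a two-disk RAID 1 mirror and a three-disk RAID 5 set (each survives one failed member); the names `LAYOUT` and `logical_drive_status` are made up for illustration and have nothing to do with the controller or any HP tool.

```python
# Hypothetical model of the layout described in the post above.
# Assumption: a two-disk RAID 1 and a three-disk RAID 5 each tolerate
# exactly one failed member drive.
LAYOUT = {
    "OS mirror (RAID 1)": {"drives": {0, 1}, "tolerated_failures": 1},
    "apps (RAID 5)": {"drives": {2, 3, 4}, "tolerated_failures": 1},
}


def logical_drive_status(failed_drives):
    """Return the expected state of each logical drive for a set of failed bays."""
    status = {}
    for name, logical_drive in LAYOUT.items():
        # Count how many members of this logical drive are in the failed set.
        failed = len(logical_drive["drives"] & set(failed_drives))
        if failed == 0:
            status[name] = "ok"
        elif failed <= logical_drive["tolerated_failures"]:
            status[name] = "degraded"  # still readable, rebuild needed
        else:
            status[name] = "failed"    # data on this logical drive is lost
    return status


if __name__ == "__main__":
    for failed in ({0}, {0, 4}, {0, 1}):
        print(sorted(failed), logical_drive_status(failed))
```

Under those assumptions, losing drive 0 alone, or drives 0 and 4 together, should leave both logical drives degraded but usable, so the layout itself does provide redundancy on paper. Only losing both 0 and 1 (or two of 2, 3, 4) should take a logical drive down.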

Tests on all the bad drives and controllers, above, from the problem servers show that the equipment is ok (running on a test box w/SmartStart.)

Any ideas??
3 REPLIES
Glenn Matthys
Frequent Advisor

Re: Recurring issues with various DL380 G4 or G3

Please state which RAID controller you are using. IIRC the DL380 G3 and G4 use SCSI, so please make sure all cables are attached properly by reseating them. Make sure your SCSI IDs are set correctly.
gregersenj
Honored Contributor

Re: Recurring issues with various DL380 G4 or G3

There were some issues with bad connectors on the G3s.

However, improper seating is the most common cause. Also check the firmware level and update it to the latest.

When you have these multiple disk failures:
Reseat the drives.
Re-enable the failed logical drives.

Note on improper seating:
It's a common mistake to close the lever only. After closing the lever, you must push on the drive to ensure it's fully seated.

Also, in your case:
These are old machines, and running Novell.
Are you using the HW monitoring tools (Insight agents/HP SIM)?
If not, you might have had disk failures for a long time.
BR
/jag
stevekat
Advisor

Re: Recurring issues with various DL380 G4 or G3

I would not rely solely on the lights on the drive for status. See what the array configuration software says (the software running on an operating machine, not the firmware array setup utility).