Disk Fails when its not broken

Hi,
I have a strange issue with a DL380 server. It currently has six 36 GB disks in two logical drives: disks 0-3 in a RAID 5 using reiserfs, and disks 4-5 in a RAID 1 using XFS. I am running Debian Linux, which is up to date. This has been running fine for months, but just recently it has been having some major problems. After doing any reasonable amount of work the system will freeze, disk 1 will then fail, and 5 seconds later disk 3 will fail. This makes a bit of a mess of the system, as you lose the RAID. There is absolutely nothing wrong with these disks: if you restart and ask the system to try to use them, they work fine until they fail again 36 hours or so later. All the other disks are fine and have yet to report any failures. I have replaced the disks with new ones and I get the same issue.
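To spell out why the second "failure" is the one that hurts, here is a rough sketch of the fault tolerance and usable space of the two logical drives described above. It assumes the textbook behaviour of RAID 5 (one disk's worth of parity) and of a RAID 1 mirror, not anything specific to the Smart Array controller, and the function names are made up for illustration.

```python
def array_survives(level, total_disks, failed_disks):
    """Return True if an array of this RAID level still holds its data.

    Hypothetical helper: models generic RAID 5 / RAID 1 behaviour only.
    """
    if level == 5:
        # RAID 5 keeps one disk's worth of parity, so it tolerates one failure.
        return failed_disks <= 1
    if level == 1:
        # A mirror survives as long as at least one copy is left.
        return failed_disks < total_disks
    raise ValueError("only RAID 1 and RAID 5 are modelled here")


def usable_gb(level, total_disks, disk_gb):
    """Return the usable capacity in GB for equally sized disks."""
    if level == 5:
        # One disk's worth of space goes to parity.
        return (total_disks - 1) * disk_gb
    if level == 1:
        # Every disk holds the same data.
        return disk_gb
    raise ValueError("only RAID 1 and RAID 5 are modelled here")


if __name__ == "__main__":
    # Disks 0-3 in RAID 5: losing disk 1 alone is survivable,
    # losing disk 3 as well is not.
    print(array_survives(5, 4, 1), array_survives(5, 4, 2))
    print(usable_gb(5, 4, 36), usable_gb(1, 2, 36))
```

So the first dropout only degrades the RAID 5 set; the second one, 5 seconds later, is what takes the logical drive down.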

Any ideas on how to fix this?
5 REPLIES
TRS
Frequent Advisor

Re: Disk Fails when its not broken

Dear, in my view the problem is with the backplane of your server.
99% you will have to replace it. Check the firmware of all the HDDs; if it is not up to date you will have to update that as well. Better to download Firmware CD 7.91 as well as SmartStart 7.91, and update the firmware and PSP.

Let me know:
Array controller firmware and revision
DL380 generation
PSP version, if installed


Regards,
TRS

Re: Disk Fails when its not broken

Sorry, I should have said this in the original post. When the disks "failed", I decided to rebuild the array using the SmartStart ACU. I did a check first and it grizzled that the firmware on the two new disks was out of date. So I got the firmware update CD and installed the latest firmware (I downloaded it in November when I last expanded the array, so I am guessing it's still current). After that I did another check on the array and it looked to be happy, so I went on to rebuild the array and start again. I only rebuilt the array that was "faulty" and didn't touch the good one (why restore from backup data that's not lost?).

Re: Disk Fails when its not broken

Would part number 228502-001 be the right backplane for a DL380 server?
TRS
Frequent Advisor

Re: Disk Fails when its not broken

Dear,
Tell me the generation of your server. Or talk to HP and get the part number. Or, if downtime is available, open the server, note the part number down from the backplane and order the backplane. Check the current firmware version on the net; don't assume that because you updated in November there is no need to update.

Re: Disk Fails when its not broken

Hi,

I believe it's a G2. Is there any easy way to find out without turning the thing off? As it's the only live box I have, I don't really like the idea of turning it off any more than absolutely necessary. I have removed the disks that were causing the issue and it seems happy now, but I do still need to fix it as I need the storage space.

Anyway on the front of the box there is a sticker with the following:
DL380 R02 P1400 512 UK
8248JZG22372
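On finding out the model without powering down: one thing that may be worth trying is reading the DMI strings the BIOS hands to the running kernel. The sketch below assumes the kernel exposes them under /sys/class/dmi/id (older kernels may not), and whether the generation actually appears in the product name on this box is a guess. The helper name is made up; `dmidecode -s system-product-name` run as root reads the same information if the sysfs files are missing.

```python
from pathlib import Path

# Standard Linux sysfs location for DMI strings; assumed to exist here.
DMI_DIR = Path("/sys/class/dmi/id")


def read_dmi_field(name, base=DMI_DIR):
    """Return the stripped contents of one DMI sysfs file, or None if unreadable.

    Hypothetical helper for illustration; `base` can be overridden for testing.
    """
    try:
        return (Path(base) / name).read_text().strip()
    except OSError:
        # File missing (old kernel, non-Linux) or not readable by this user.
        return None


if __name__ == "__main__":
    # These four files are normally readable without root.
    for field in ("sys_vendor", "product_name", "bios_version", "bios_date"):
        print(f"{field}: {read_dmi_field(field)}")
```

Nothing here touches the array controller or the disks, so it should be safe to run on the live box.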

One other thing I would like to ask (as I am going to completely rebuild the OS when this is sorted): am I likely to see some benefit from using SUSE Linux over Debian? I have been using Debian for some time now and am very happy with it, but I have also used quite a bit of SUSE. Basically, is there going to be some speed improvement, better driver support for changing fan speeds, better reliability, or pre-warning of failures like the one I have, or something of that order, which I would get from switching to SUSE? Or is there some totally amazing, can't-live-without benefit to switching to Red Hat (I haven't played with Red Hat for a very long time)?