ProLiant Servers (ML,DL,SL)

SOLVED
Gerrit Hannaert
Advisor

Proliant 1600 / multiple sequential disk failures / mirror / Linux

Hi,
We've got an old ProLiant 1600 server configured with a pair of 36GB mirrored RAID disks running Linux.

Linux reported "Non Fatal error on ida/c0d3", and I noticed that the amber "fault" light on one of the disks was lit.

Since then I've replaced the disk twice with some spares we had lying around, the second one fresh out of packaging. After a number of hours/days both the "new" disks would also indicate a failure.

What else could be the problem? Could there be an internal error threshold counter which is not being reset? Both disks were hot-swapped.

I haven't tried testing the disks separately, so I can't be 100% sure the new disks are OK, though I guess at least the new one should be.

Original disk:
AB0183346B 400739-B21 104660-001 18.2GB, 7200, WU SCSI-3, SCA-2, LVD, 80 Pin, 1.0"

Second disk:
HB01831B95 400739-B21 104660-001 18.2GB, 7200, WU SCSI-3, SCA-2, LVD, 80 Pin, 1.0"

Brand new disk:
BD01864552 142673-B22 152190-001 18.2GB, WU3, 10K RPM, 1" SCA-2 80 Pin
6 REPLIES
Gerrit Hannaert
Advisor

Re: Proliant 1600 / multiple sequential disk failures / mirror / Linux

Forgot an important piece of information:

cat /proc/driver/cpqarray/ida0
ida0: Compaq Smart Array 3200 Controller
Board ID: 0x40320e11
Firmware Revision: 4.32
Controller Sig: 0xa198ad60
Memory Address: 0xc8912f00
I/O Port: 0x3000
IRQ: 15
Logical drives: 4
Highest Logical ID: 3
Physical drives: 4

Current Q depth: 0
Max Q depth since init: 69

Logical Drive Info:
ida/c0d0: blksz=512 nr_blks=40800
ida/c0d1: blksz=512 nr_blks=17723520
ida/c0d2: blksz=512 nr_blks=718080
ida/c0d3: blksz=512 nr_blks=34843200
nr_allocs = 1789426
nr_frees = 1789426

The faulty disk is currently removed from the server.
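
For anyone reading the "Logical Drive Info" lines above, the size of each logical drive is simply blksz multiplied by nr_blks. A minimal sketch of that arithmetic follows; the function name and the regular expression are illustrative assumptions, and the sample text is copied from the output above rather than read from /proc.

```python
import re

# Sample lines copied from the /proc/driver/cpqarray/ida0 output above.
SAMPLE = """\
ida/c0d0: blksz=512 nr_blks=40800
ida/c0d1: blksz=512 nr_blks=17723520
ida/c0d2: blksz=512 nr_blks=718080
ida/c0d3: blksz=512 nr_blks=34843200
"""

# Assumed line layout: "<device>: blksz=<bytes> nr_blks=<count>"
LINE_RE = re.compile(
    r"^(ida/c\d+d\d+):\s+blksz=(\d+)\s+nr_blks=(\d+)", re.MULTILINE
)


def logical_drive_sizes(text):
    """Return {device: size_in_bytes} for each logical drive line found."""
    return {
        dev: int(blksz) * int(nr_blks)
        for dev, blksz, nr_blks in LINE_RE.findall(text)
    }


if __name__ == "__main__":
    for dev, size in logical_drive_sizes(SAMPLE).items():
        print(f"{dev}: {size / 1e9:.2f} GB ({size / 2**30:.2f} GiB)")
```

With these numbers, ida/c0d3 works out to 34843200 x 512 = 17,839,718,400 bytes, or roughly 17.8 GB.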
Nathan Gervais
Trusted Contributor
Solution

Re: Proliant 1600 / multiple sequential disk failures / mirror / Linux

Two things come to mind here. The first is the M&P Patch, which can sometimes resolve these false failure messages on drives that are U2 and below. It can be found here: http://h18007.www1.hp.com/support/files/server/us/download/10468.html

The other possibility is that the SCSI backplane on that system has a bad slot/port.

Updating the firmware on that 3200 to 4.50 may also help:
http://h18007.www1.hp.com/support/files/storage/us/download/14979.html
Gerrit Hannaert
Advisor

Re: Proliant 1600 / multiple sequential disk failures / mirror / Linux

I saw the firmware upgrade, I will apply it.

The M&P patch ('resets any false predictive failure indicators and returns the drive from a falsely degraded condition to a normal operating condition') appears to have been pulled from the website, but Google found it. I'll try that too.
Nathan Gervais
Trusted Contributor

Re: Proliant 1600 / multiple sequential disk failures / mirror / Linux

It's usually on the site, but I think the drivers section of the HP site is on vacation today :) I keep getting "We're sorry" pages :)
Gerrit Hannaert
Advisor

Re: Proliant 1600 / multiple sequential disk failures / mirror / Linux

I applied the new firmware and the patch, and the array is now rebuilding.

Of course it rebuilt last time as well, so we'll only see next week how long it lasts.

cat /proc/driver/cpqarray/ida0
ida0: Compaq Smart Array 3200 Controller
Board ID: 0x40320e11
Firmware Revision: 4.50
Controller Sig: 0xa198ad60
Memory Address: 0xc8912f00
I/O Port: 0x3000
IRQ: 15
Logical drives: 4
Highest Logical ID: 3
Physical drives: 4

Current Q depth: 0
Max Q depth since init: 64

Logical Drive Info:
ida/c0d0: blksz=512 nr_blks=40800
ida/c0d1: blksz=512 nr_blks=17723520
ida/c0d2: blksz=512 nr_blks=718080
ida/c0d3: blksz=512 nr_blks=34843200
nr_allocs = 74277
nr_frees = 74277
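
To see at a glance what changed between this output and the earlier one, the "key: value" lines can be compared field by field. The sketch below is only an illustration: the helper names are hypothetical, and the two sample strings are shortened excerpts of the outputs posted in this thread.

```python
# Shortened excerpts of the two controller dumps posted in this thread.
BEFORE = """\
ida0: Compaq Smart Array 3200 Controller
Firmware Revision: 4.32
IRQ: 15
Max Q depth since init: 69
"""

AFTER = """\
ida0: Compaq Smart Array 3200 Controller
Firmware Revision: 4.50
IRQ: 15
Max Q depth since init: 64
"""


def parse_fields(text):
    """Collect 'key: value' lines into a dict, skipping lines with no value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def changed_fields(before, after):
    """Return {key: (old, new)} for every field whose value differs."""
    old, new = parse_fields(before), parse_fields(after)
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }


if __name__ == "__main__":
    for key, (old, new) in changed_fields(BEFORE, AFTER).items():
        print(f"{key}: {old} -> {new}")
```

For these excerpts the only differences are the firmware revision (4.32 to 4.50) and the maximum queue depth counter, which simply restarts after a reboot.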
Gerrit Hannaert
Advisor

Re: Proliant 1600 / multiple sequential disk failures / mirror / Linux

Seems stable so far with the new firmware and patch; hope it stays that way.