
Re: Hard Drive failure in array setup - RHEL 4 BL685c

 
Leibniz
Advisor

Hard Drive failure in array setup - RHEL 4 BL685c

Hi Everybody:

We have an RHEL4 system on a BL685c G1 and it has recently had a hard drive failure problem.

The low level RAID configuration is RAID1 and so the OS is only presented with 1 device.

The IML report from the iLO shows only 'POST Error 1789:-Drive not responding' and subsequent 'POST Error: 1787-Drive Array Operating in Interim Recovery Mode' errors.

Question: with the server up and running, is there any way to tell WHICH drive is faulty?

All the documentation I found appears to assume that the errors shown are ones seen during the boot procedure.

I can't seem to find any tools or info indicating which slot/drive is the problem (with the server running).

As this is a physically remote server, I don't have the luxury of physically walking in to the server room and visually inspecting it.

Thanks for the help,
Bill
5 REPLIES
Gerardo Arceri
Trusted Contributor
Solution

Re: Hard Drive failure in array setup - RHEL 4 BL685c

Install the PSP (ProLiant Support Pack) for RHEL 4, then use the hpacucli utility. Run:
hpacucli ctrl all show
(this will tell you the slot number of your Smart Array controller)
hpacucli ctrl slot=X physicaldrive all show
This will tell you for sure which disk has failed; replace X with the slot number from the previous step.
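
If the controller has many drives, it can help to filter that listing down to the drives that are not reporting OK. The following is only a sketch: the function name failed_drives is made up for illustration, and it assumes each drive is printed on a single line that starts with "physicaldrive" and ends with ", <status>)". Check that against the output of your own hpacucli version before relying on it.

```shell
#!/bin/sh
# failed_drives (hypothetical helper): print every physical drive line whose
# status is anything other than "OK".
# Reads the text output of `hpacucli ctrl slot=X physicaldrive all show` on stdin.
# ASSUMPTION: one drive per line, beginning with "physicaldrive" and ending
# with ", <status>)". Verify this against your hpacucli version.
failed_drives() {
    awk '/physicaldrive/ && !/, OK\)[ \t]*$/ {
        sub(/^[ \t]+/, "")   # strip the leading indentation
        print
    }'
}

# Usage (slot 0 is only an example; use the slot from `hpacucli ctrl all show`):
#   hpacucli ctrl slot=0 physicaldrive all show | failed_drives
```

An empty result means every drive line ended in OK. The port/box/bay values printed on a matching line are what you would pass on to whoever is physically swapping the disk.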
Leibniz
Advisor

Re: Hard Drive failure in array setup - RHEL 4 BL685c

Exactly what I was looking for. Thank you very much. Max points awarded. Have a great day!

Bill
Steven E. Protter
Exalted Contributor

Re: Hard Drive failure in array setup - RHEL 4 BL685c

Shalom Bill,

In the old days, we ran dd against every disk device listed in:

fdisk -l

dd if=/dev/sda of=/dev/null count=100000

Then we'd eyeball the system and identify which hard disk light was on.

I think even without looking at the system this method would work as well as the solution above. I'd like your future viewers to see it as an alternative.

Glad your issue was resolved.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Leibniz
Advisor

Re: Hard Drive failure in array setup - RHEL 4 BL685c

Hello:

If I had physical access to the server to watch the blinken lights, I could make use of other options. Even so, just because a disk is showing as failed doesn't mean that its disk light will NOT be flashing.

Thanks for the idea though.
Leibniz
Advisor

Re: Hard Drive failure in array setup - RHEL 4 BL685c

I should have added:

How can you possibly even make a guess at which disk could be bad, using that method, without physically looking at the disks?

The only certainty would be to make the disks spin (i.e. your dd line) and watch which light didn't come on. If the disk is totally seized, that would work. Otherwise it would still be ambiguous.

And what's more: since we are talking about a low level RAID here, I can't afford to replace the wrong one. I would prefer not to have the bad disk mirror itself over the good one.