
Jõx
Advisor

DL385G1 RAID 10 Array broken

We have a RAID 10 array on a DL385G1. ACU shows the following:

Smart Array 6i in Slot 0

array A (Failed)
physicaldrive 1:2 (port 1:id 2 , Parallel SCSI, 0 Byte, Failed)
physicaldrive 2:0 (port 2:id 0 , Parallel SCSI, 72.8 GB, OK)
physicaldrive 2:1 (port 2:id 1 , Parallel SCSI, 72.8 GB, OK)
physicaldrive 2:3 (port 2:id 3 , Parallel SCSI, 72.8 GB, OK)

unassigned
physicaldrive 2:2 (port 2:id 2 , Parallel SCSI, 73.4 GB, OK)

No drives were moved around and no cables were switched, as far as I know. Prior to this, the system board and controller cache were replaced because of another issue. After the replacement the RAID array was OK.

At this point we moved drive 2:2 into the empty slots and confirmed that all the slots are assigned to port 2. Slots 2:4 and 2:5 are empty at the moment.
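The mismatch in the listing above can also be checked mechanically. The following is a minimal Python sketch, not an HP tool: it assumes the listing text is pasted in exactly as ACU printed it, and the helper names `parse_listing` and `odd_port_members` are made up for illustration. It parses each `physicaldrive` line and reports array members that sit on a different port than most of the drives.

```python
import re
from collections import Counter

# Listing copied from the post above (Smart Array 6i in Slot 0).
ACU_LISTING = """\
array A (Failed)
physicaldrive 1:2 (port 1:id 2 , Parallel SCSI, 0 Byte, Failed)
physicaldrive 2:0 (port 2:id 0 , Parallel SCSI, 72.8 GB, OK)
physicaldrive 2:1 (port 2:id 1 , Parallel SCSI, 72.8 GB, OK)
physicaldrive 2:3 (port 2:id 3 , Parallel SCSI, 72.8 GB, OK)

unassigned
physicaldrive 2:2 (port 2:id 2 , Parallel SCSI, 73.4 GB, OK)
"""

# port:id, then the parenthesised fields: location, interface, size, status
DRIVE_RE = re.compile(
    r"physicaldrive\s+(\d+):(\d+)\s+\(.*?,\s*[^,]+,\s*([^,]+),\s*(\w+)\)"
)


def parse_listing(text):
    """Return one dict per physicaldrive line (hypothetical helper)."""
    drives = []
    group = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not line.startswith("physicaldrive"):
            group = line  # e.g. "array A (Failed)" or "unassigned"
            continue
        m = DRIVE_RE.match(line)
        if m:
            port, scsi_id, size, status = m.groups()
            drives.append({
                "group": group,
                "port": int(port),
                "id": int(scsi_id),
                "size": size.strip(),
                "status": status,
            })
    return drives


def odd_port_members(drives):
    """Array members on a port other than the one most drives use."""
    common_port, _ = Counter(d["port"] for d in drives).most_common(1)[0]
    return [
        d for d in drives
        if d["group"] != "unassigned" and d["port"] != common_port
    ]


drives = parse_listing(ACU_LISTING)
suspects = odd_port_members(drives)
for d in suspects:
    print(f"{d['port']}:{d['id']} ({d['status']}) is not on the common port")
```

For this listing the only member flagged is 1:2, the failed drive, while every other drive, including the unassigned 2:2, is on port 2.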

I tested with a Smart Array 5i on a different server and found that one can't add spare drives to a broken array.

Redoing the array is not a good option at the moment. Is there any way to fix the array?
Got a blade?
TTr
Honored Contributor

Re: DL385G1 RAID 10 Array broken

Did you try pulling out the failed drive and putting the unassigned drive in its place? It should start rebuilding right away. Take a full backup before you do anything else, and do the drive replacement at off-peak usage of the array.
Jõx
Advisor

Re: DL385G1 RAID 10 Array broken

Hi

As you can see from the report, the replacement disk is in slot 2, as was the failed one. Our problem is that the controller ports differ.

Yes, we have tried reseating it in the same slot it was in. But the problem still exists because the whole backplane is connected to port 2.

AFAIK, if the backplane were split across different slots, then slots 0 to 1 should be on port 1 and slots 2 to 5 on port 2. If ports 1 and 2 were mixed, the array report should show 2:0, 2:1 and 1:2, 1:3, but instead it shows the array as above, giving us an odd 1:2 broken and 2:2 unassigned.

If you have questions or don't understand the problem, feel free to ask. I hope this text is clear enough.

<<--- Not a native English speaker :D
Jõx
Advisor

Re: DL385G1 RAID 10 Array broken

Edited a mistake.

AFAIK, if the backplane were split across different PORTS, then slots 0 to 1 should be on port 1 and slots 2 to 5 on port 2. If ports 1 and 2 were mixed, the array report should show 2:0, 2:1 and 1:2, 1:3, but instead it shows the array as above, giving us an odd 1:2 broken and 2:2 unassigned.
TTr
Honored Contributor

Re: DL385G1 RAID 10 Array broken

I understand what you are saying. Based on the internal wiring and the drive slots there should be another drive on port 1.

Look at the QuickSpecs. Are you referring to the drive slots by the same numbers as in the picture in the "Storage" section?

http://h18004.www1.hp.com/products/quickspecs/12162_div/12162_div.html#Storage

If yes, you will have to open up the server and see if there is anything wrong with the internal wiring.

The other thing you can try: if ACU has a drive identification feature for the 6i array, you can turn on the LED for each drive to see which slot it is in.

Jõx
Advisor

Re: DL385G1 RAID 10 Array broken

Yes, the numbering I use is identical to that in the manual.

We have a second server with an identical configuration, and it also has all the slots assigned to port 2 on the controller.

I can't remember which mode it was, duplex or simplex, but the servers have identical wiring.

I still can't understand why the array thinks that the failed drive is on port 1 when it never was there.

Is it possible to replace drive 1:2 for drive 2:2 in ACU CLI?
TTr
Honored Contributor

Re: DL385G1 RAID 10 Array broken

> Is it possible to replace drive 1:2 for drive 2:2 in ACU CLI?

You shouldn't have to do any drive role changing in ACU. I am thinking that if you pull the failed drive (and have the unassigned drive pulled as well) and put the unassigned drive in its place it will start rebuilding.

Is the problem that you don't know which slot the failed drive 1:2 is in? Can you light up the drive from ACU? If not, look at the drives and see whether only one of them shows no I/O activity.
Jõx
Advisor

Re: DL385G1 RAID 10 Array broken

The problem is that there is no 1:2 drive. ATM there are 3 drives in slots 0,1 and 3 and they are working. The unassigned drive is at slot 2 (no lights lit). (Slots 4 and 5 remain unused and are confirmed to be 2:4 and 2:5).

All the backplane slots are on port 2 of the controller.

I know it would be an easy fix if I had a slot for 1:2, but I don't, and that's the issue ATM.

I tried to replicate the scenario on a test machine with a Smart Array 5i, to no avail. Drive 1:2 is the one I can't physically access because drive 2:2 is in its place.

Note: all slots (0 to 5) are on port 2 of the Smart Array 6i
TTr
Honored Contributor

Re: DL385G1 RAID 10 Array broken

> The problem is that there is no 1:2 drive

I have seen this many times with large disk arrays. If you pull a drive out (even an unused one) and put it in another slot, the controller still reads its original location and thinks it is there. The larger arrays have a way to deal with this; I am not sure about the 6i.

The CLI has a rescan command that might fix this but, as I said before, take a good backup before trying anything at this point. See the ACU user guide, page 55:

http://docs.hp.com/en/9320/acu.pdf

The 6i user guide has some interesting scenarios about replacing hard drives, starting on page 23:
http://bizsupport1.austin.hp.com/bc/docs/support/SupportManual/c00217854/c00217854.pdf


Jõx
Advisor

Re: DL385G1 RAID 10 Array broken

Rescan has no effect.

Can't add or replace any drives if an array is broken. Nor is it recommended to move the array to another location.

I guess we have to do it the brutal way: smash everything and rebuild the array from scratch.

That'll be fun - hours for restoration :D