Rebuild Error

Andy Zevon
Occasional Advisor

Rebuild Error

Greetings all,

One of my two 73GB HDs failed in my tc4100 (I'm running RAID 1, so I was still able to chug along on Netware 6.5). I received the new HD, installed it, rebooted the server and entered the HP NetRaid program. After formatting the HD and attempting the rebuild, I just get an ERROR up top, with no other information. I've tried everything, including putting it in another slot. In that slot (or any other, for that matter) it always says FAILED. It can't be the RAID controller because I'm still up and running on the other HD, right?

Does anyone have any suggestions? I need to get this other HD installed and rebuilt ASAP.

Thanks, in advance,

-andy-
13 REPLIES
kris rombauts
Honored Contributor

Re: Rebuild Error

Hi Andy,

first of all, after the initial disk failure, if you use hardware mirroring via the Netraid controller, you could have just installed the new disk online and the rebuild would have kicked off automatically. This is the biggest advantage of a H/W raid controller: you need no downtime or reboot whatsoever to replace failed disks, since it is all handled in F/W.

Then secondly, there is no reason to low level format a hard drive in Netraid Express Tools. The disks HP provides are ready to be used "as is" and do not require a format before they can be used, so I'm not sure why this was done.


Then, to the problem you ran into now,
if you have the Netraid monitor running, check for any messages it provides you.

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?locale=en_US&lang=English&pnameOID=62491&prodSeriesId=51930&prodTypeId=329290&basePartNum=COL3579&locBasepartNum=ns-12646-1&os=Novell+NetWare+5.x&tech=Software++Storage+Controller



Other than that, and if you tried to rebuild to that disk in different slots, i would suspect that the new disk is bad. Do you know if this is a supported, compatible disk?


HTH

Kris

Andy Zevon
Occasional Advisor

Re: Rebuild Error

Hi Kris - thanks for the response. Few questions...

Did I do any damage by formatting the HD? Not sure why I did that. I installed the monitoring NLM. I have now moved the HD back into the original (first) slot. The light is green (before it was amber), but not blinking. The rebuild should begin automatically, right? I go into object properties and select rebuild. It tells me the drive needs to be failed. I switch it to failed, try the rebuild, and it tells me 'The drive does not belong to Redundant Array' - but of course it does!

I did notice that the product ID and revision numbers on the new HD are different from those on the old one. But the drives have the exact same part number, model #, etc. on the labeling. Does any of that matter?

You did mention that the new HD could be bad - I happened to order 2 and both are doing the same thing. I did format both though... did I mess myself up?

Thanks again!
kris rombauts
Honored Contributor

Re: Rebuild Error

Andy,

- formatting should not be an issue but is
not needed.

- try to assign this disk as a hotspare disk
and then start the rebuild.


If it says that the disk is not part of a redundant array, this means the disk is seen as a disk that is part of a raid0 array (raid0 is non redundant).

Wondering now if you are maybe using S/W mirroring at the Netware OS level rather than H/W mirroring. For that you would create two raid0 arrays and mirror them from the OS side of things. This would also explain why the rebuild did not start automatically, but it would be a waste of the Netraid controller really, so i hope this is not the case.

What is the status of the remaining array, which is still running but in a degraded state? This should be your A0 array, and one disk, A0-0 or A0-1, should still be available. Maybe collect a screenshot (via remote console).


Kris


Andy Zevon
Occasional Advisor

Re: Rebuild Error

Kris - I've attached a screen shot as you mentioned. I can't imagine the NOS doing RAID, but who knows? The server was ordered before my time and I just assumed (as you did) that since there is a HW controller, it was doing the job. Is there any way to tell if the HDs themselves are causing the error (or fail)?
Andy Zevon
Occasional Advisor

Re: Rebuild Error

Forgot to mention - that FAIL in ID 2 never disappeared when I removed the HD from that slot and put it back in 0. Not sure if that helps any more.
Andy Zevon
Occasional Advisor

Re: Rebuild Error

Hi (again) Kris,

I've been trying all sorts of different things, but also wanted to add this:

Am I supposed to make the new HD (which is being reported as FAIL) ONLINE? Will that in any way damage or corrupt the data on the good/active HD? And what about initializing? What is that, and is it done to logical drives only?

Sorry for all the questions - I've never had to rebuild a RAID before, and I really need to get this thing up and running ASAP.

Thanks for your continuing help Kris.

-andy-
kris rombauts
Honored Contributor

Re: Rebuild Error

Andy,

answers to your questions

- Netware has been able to do software
mirroring since the NW 2.x and 3.x days
but here it's clearly hardware mirroring
as expected.

- this is not a problem but it seems like
the original array was built with 2 disks
in slots 1 and 2 (SCSI id=1 and id=2) and
nothing in SCSI id=0 (the first slot on
the left).

- the reason why the 'failed' disk in slot
2 never disappears is that the
controller knows it is still missing its
second disk and that it was located
there; its id is A00-0. This is normal
and it will only disappear when it is
able to successfully rebuild the array.


- at some stage it seems you created a
second array, A01-00 (raid 0), with a
disk in slot 0, since the screenshot now
shows a second array A01 that exists but
has also failed, probably because the
disk was removed or set to failed
during the tests. This is why you
got the message
"The drive does not belong to Redundant
Array" when trying a rebuild.


- do not put the replacement disk ONLINE
manually, since this makes the
array think that both disks are back in
sync (mirrored 100%), and you will have
problems when the OS reads off that 2nd
disk as there is no data on it. This could
corrupt and crash your system, so don't
do this.


- the initialize is done at the logical
drive level, normally at creation time,
so don't do this on an existing logical
drive with data or you will wipe it.


- Can you add a disk in a slot other than
0 and 2 (while the server is online) and
launch the Config utility, configure that
new disk as a hotspare disk and then
initiate the rebuild ?


- If not ok, pls also collect the
screenshot like the one you provided
but for every disk again, so we can see
if the capacity is the same ( i.e. 700006
Mbytes).

- try deleting the A01 array since you
don't need this one.
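
To illustrate the warning above about forcing the replacement disk ONLINE, here is a minimal Python sketch. It assumes a controller that alternates reads between mirror members; the thread does not say how NetRaid actually schedules reads, and `mirrored_reads` is a made-up name used only for this illustration.

```python
# Illustrative model only; real controllers choose read targets in their own way.
from itertools import cycle


def mirrored_reads(disks, block_index, count):
    """Read the same block `count` times, alternating between member disks."""
    members = cycle(disks)
    return [next(members)[block_index] for _ in range(count)]


good = ["boot", "data"]
blank = [None, None]          # replacement disk that was never rebuilt

# Forcing the blank disk ONLINE makes it a full mirror member, so every
# second read in this model returns nothing instead of the real data.
print(mirrored_reads([good, blank], 1, 4))
```

With a proper rebuild the second disk would hold a copy of the data first, so it would not matter which member served the read.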



Kris
Andy Zevon
Occasional Advisor

Re: Rebuild Error

Kris,

I've attached all the screen shots you mentioned, and tried your suggestion.

I first inserted the new drive into SCSI 8 and it read READY. Then I configured it as a hotspare - it initially read as REBUILD, but then changed to FAIL A00-00 Drive States Changed (took a screenshot).

At this point, I could assume it's the replacement HD, perhaps?

I also tried to delete the array, but can't seem to find where to do that.

-andy-
kris rombauts
Honored Contributor

Re: Rebuild Error

Hi Andy,

i think i have some bad news for you here.
The reason why the rebuild fails is most likely caused by the remaining disk having
media errors as can be seen in the screenshot (26 media errors).

So the rebuild fails because the rebuild process starts but encounters a bad block on the disk and as such cannot recreate the data to be written on the destination disk (new disk).
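
The failure mode described above can be sketched in a few lines of Python. This is only a toy model of a RAID 1 rebuild, not the NetRaid firmware logic; `MediaError`, `read_block` and `rebuild_mirror` are hypothetical names, and disks are modelled as plain lists of blocks.

```python
# Illustrative model only: not NetRaid firmware behaviour, all names are hypothetical.

class MediaError(Exception):
    """Raised when a block on the source disk cannot be read."""


def read_block(disk, index, bad_blocks):
    # A block listed in bad_blocks models an unreadable sector (media error).
    if index in bad_blocks:
        raise MediaError(f"unreadable block {index}")
    return disk[index]


def rebuild_mirror(source, bad_blocks):
    """Copy every block from the surviving disk to a blank replacement.

    Returns (replacement, status), where status is "OPTIMAL" if every block
    was copied, or "FAIL at block N" for the first unreadable block.
    """
    replacement = [None] * len(source)
    for index in range(len(source)):
        try:
            replacement[index] = read_block(source, index, bad_blocks)
        except MediaError:
            # No second copy exists in a degraded RAID 1, so the data for
            # this block cannot be recreated and the rebuild stops here.
            return replacement, f"FAIL at block {index}"
    return replacement, "OPTIMAL"


surviving = ["a", "b", "c", "d"]
print(rebuild_mirror(surviving, bad_blocks=set())[1])   # healthy source
print(rebuild_mirror(surviving, bad_blocks={1})[1])     # media error early on
```

In this model the replacement disk is perfectly fine in both runs; the status depends only on whether the surviving disk can still be read end to end.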

If this is a critical system, it's best to take a full backup of the data and the Netware config files, make a new installation, and then restore the data and config files. If the media errors grow, this disk is going to fail one day or another, or you might experience Netware abends.

Kris
Andy Zevon
Occasional Advisor

Re: Rebuild Error

Kris,

Ughh!!! Well, at least there seems to be an explanation. One last question. Are there any utilities that can potentially try to fix the media errors? It's probably not worth it though, correct?
kris rombauts
Honored Contributor

Re: Rebuild Error

Andy,

no, this is a media defect on the disk platter/surface, so it is a physical issue that cannot reliably be fixed by reformatting the media, since it will most likely recur. Reformatting also means reinstalling the operating system anyhow, with the risk of having to redo the whole thing at some later stage, which could be days, weeks or months away. No one can predict that.


Kris

ps: this thread warrants me some points i guess , even for the bad news :-)
kris rombauts
Honored Contributor
Solution

Re: Rebuild Error

Andy,

one last thing: there is a consistency check that runs every week by default (for Windows), and i think it also exists for Netware. Make sure to install it and verify that it runs, since it will detect issues in time and try to correct them, or at least log them, so you can replace a disk that has a bad block before the other disk in the redundant array fails and you need a rebuild (like now) that can never complete successfully. In your case it failed almost at the beginning, but i've seen this happen at 90% or 99% as well ...
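
As a rough picture of what such a periodic check buys you, here is a small Python sketch. It is an assumption-laden model, not the HP utility: `consistency_check` is a hypothetical function, disks are lists of blocks, and `bad_a` / `bad_b` stand in for unreadable sectors.

```python
# Illustrative model only; names and behaviour are assumptions, not the HP utility.

def consistency_check(disk_a, disk_b, bad_a=frozenset(), bad_b=frozenset()):
    """Scan a two-disk mirror and report problems while redundancy still exists.

    bad_a / bad_b are sets of block indexes that model unreadable sectors.
    Returns a list of log messages, one per problem found.
    """
    log = []
    for index in range(len(disk_a)):
        a_ok = index not in bad_a
        b_ok = index not in bad_b
        if a_ok and b_ok:
            if disk_a[index] != disk_b[index]:
                log.append(f"block {index}: mirrors differ")
        elif a_ok or b_ok:
            # One good copy remains, so the block can still be recovered.
            failing = "B" if a_ok else "A"
            log.append(f"block {index}: media error on disk {failing}, "
                       f"recoverable from mirror")
        else:
            log.append(f"block {index}: unreadable on both disks, data lost")
    return log


for line in consistency_check(["a", "b", "c"], ["a", "b", "c"], bad_a={2}):
    print(line)
```

The point of running it on a schedule is the middle branch: a media error found while the mirror is still intact is recoverable, whereas the same error found during a rebuild is not.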

good luck

Kris
Andy Zevon
Occasional Advisor

Re: Rebuild Error

Of course! I think I just submitted them - it's my first time on this forum. Did it work?