DL100 Storage Server 1TB: RAID5 data volume inaccessible

it61066
Frequent Advisor

When checking in Adaptec Storage Manager, we found that port 0 had failed (see photos), while the disk LEDs were all in a normal state (no amber, no red). At that point the RAID5 volume was still accessible (disk access was slow, which is normal), and the customer performed a backup.

The server was under warranty, but a replacement SATA disk (250GB) was not available in our stock and was not available from HP either (!!), so we performed a rescan on the controller. The disk returned to the optimal state with a warning that it was rebuilding. After that, the data volume was no longer accessible (unknown state).

After restarting the server, pressing Ctrl-A, and checking the properties of the data volume (RAID5), disk 0 and disk 3 in that volume were shown as off while disks 1 and 2 were on. When checking the status of both RAID1 volumes (C and D), all disks were on.

We opened the server cover to check for cable problems (as per the HP service note) and found that the disk 3 SATA connector had come loose on one side (remember that port 0 was the one showing the problem in Adaptec Storage Manager !!), while the other ports were seated correctly. Even so, all connectors were removed and reinserted, the cable tie was cut, and the RAID5 was deleted and recreated.

My problem is: the customer is not convinced that the connector issue could have caused the data loss and needs an explanation from HP. Can anyone explain what may have caused the data loss?
HPE ASE Proliant & Storage Certified Engineer
Clustering & Virtualization Support (Microsoft & Vmware)

2 REPLIES
Brian_Murdoch
Honored Contributor

Re: DL100 Storage Server 1TB: RAID5 data volume inaccessible

Mohammad,

The attached image shows how the DL100 Storage Server (formerly known as the NAS1500S) is configured with a 4-drive setup.

SATA ports 0 and 1 make up a 9GB RAID1 - C:
SATA ports 2 and 3 make up a 9GB RAID1 - D:
SATA ports 0, 1, 2, 3 make up a XXGB RAID5 - F:

With the above configuration you can tolerate the port 0 failure: the first mirror (0,1) will be degraded due to port 0, but the second mirror (2,3) will be OK. The RAID5 (0,1,2,3) will also be degraded due to port 0. (This matches your image.)
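As a rough illustration of why a RAID5 set survives the loss of one member, here is a minimal Python sketch of XOR parity on a single stripe. It is a generic textbook model, not the Adaptec controller's actual on-disk layout; the block contents and the helper names `xor_blocks` and `rebuild` are made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe across four members: three data blocks plus one parity block.
# (Hypothetical contents; real stripes are far larger.)
data = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]
parity = xor_blocks(data)
stripe = data + [parity]

def rebuild(stripe, missing):
    """Recompute the block at index `missing` from the three surviving blocks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != missing]
    return xor_blocks(survivors)

# Losing any single member (for example port 0) is recoverable.
rebuilt = rebuild(stripe, 0)
print(rebuilt == stripe[0])
```

With two members missing there are two unknown blocks per stripe but only one parity equation, so the same calculation can no longer recover the data.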

If you then lose port 3 due to a cable fault, you will have the following situation:
First mirror (0,1): degraded due to port 0.
Second mirror (2,3): degraded due to port 3.
RAID5: destroyed due to the loss of ports 0 and 3.
RAID5 cannot tolerate more than one disk failure.
It's only the RAID5 which will be damaged permanently.
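The two scenarios above can be sketched as a small Python model. This only encodes the port layout described in this post and the rule that each array here tolerates one lost member; the `ARRAYS` table and the `array_states` function are hypothetical names for illustration, not anything reported by the controller.

```python
# Port layout as described above; each array tolerates one missing member.
ARRAYS = {
    "C: RAID1": {"ports": (0, 1), "tolerates": 1},
    "D: RAID1": {"ports": (2, 3), "tolerates": 1},
    "F: RAID5": {"ports": (0, 1, 2, 3), "tolerates": 1},
}

def array_states(failed_ports):
    """Return the expected state of each array for a set of failed ports."""
    failed = set(failed_ports)
    states = {}
    for name, cfg in ARRAYS.items():
        lost = len(failed & set(cfg["ports"]))
        if lost == 0:
            states[name] = "optimal"
        elif lost <= cfg["tolerates"]:
            states[name] = "degraded"
        else:
            states[name] = "failed"
    return states

print(array_states([0]))     # port 0 only
print(array_states([0, 3]))  # port 0 plus the loose connector on port 3
```

In this model, port 0 alone leaves both C: and F: degraded, while ports 0 and 3 together leave both mirrors degraded (each has lost only one member) and the RAID5 failed.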

It's unfortunate, but it looks like the cable fault on port 3 showed up while the port 0 disk was already faulty. I hope this explains the loss of the data volume a little better.

Brian
it61066
Frequent Advisor

Re: DL100 Storage Server 1TB: RAID5 data volume inaccessible

Thanks, Brian, for your explanation.

That is what I had explained to my customer, but what is puzzling about this issue is that we didn't replace any failed disk and the server is now working properly. Also, the port 0 connector was properly seated (as far as I could see), yet it was showing as failed when the problem occurred. Why did it show as failed and then return to normal operation? I am afraid that the problem might arise again!