
Re: Read error in RAID 0

 
fawrell
Advisor

Read error in RAID 0

Hi.

I want to ask: if an unrecoverable read error occurs on a hard drive in a RAID 0 array, so that a piece of data is lost, will the controller change the status of the logical drive from 'OK' to something else? If not, how will the controller notify the administrator that such a serious error has occurred on a logical drive?

Thanks for your answers.
4 REPLIES
skris
Trusted Contributor

Re: Read error in RAID 0

The drive status will soon become failed. Please note that it is neither recoverable nor repairable.

Note: Some files/data might be recoverable until the volume state is set to failed, after which user I/O to that volume will be terminated.
kris rombauts
Honored Contributor

Re: Read error in RAID 0

Fawrell,

as you know, this is an example where RAID protection would have saved your valuable data and time.
The fact that an uncorrectable error occurred means a piece of user data is likely lost or damaged. Because the problem is at a low level (i.e. the hardware disk platter), the OS cannot handle it either, so this is not a file system issue that could be fixed with, e.g., chkdsk on a Windows box.
As I explained in another thread, if this uncorrectable error occurred in an area of the disk that holds no user data, there is no issue. It depends on how you detected this error and which tool you used to see it; with that information I can judge the issue better.
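
To make the "does the bad spot hold user data?" point concrete, here is a minimal sketch of how a block on one member disk maps to a block on the logical drive under a generic RAID 0 striping scheme. The function names, the disk count and the strip size are assumptions for illustration only; this is a textbook layout, not necessarily how any particular controller arranges data on disk.

```python
def raid0_logical_block(disk_index, disk_block, num_disks, strip_blocks):
    """Map a block on one member disk to its logical-drive block number.

    Assumes a plain round-robin RAID 0 layout: strip 0 on disk 0,
    strip 1 on disk 1, and so on, wrapping around after the last disk.
    """
    strip_row, offset = divmod(disk_block, strip_blocks)
    strip_number = strip_row * num_disks + disk_index
    return strip_number * strip_blocks + offset


def raid0_physical_block(logical_block, num_disks, strip_blocks):
    """Inverse mapping: logical block -> (disk index, block on that disk)."""
    strip_number, offset = divmod(logical_block, strip_blocks)
    disk_index = strip_number % num_disks
    strip_row = strip_number // num_disks
    return disk_index, strip_row * strip_blocks + offset


if __name__ == "__main__":
    # Hypothetical example: 2 disks, 256 blocks per strip,
    # unreadable block 300 on the second disk (index 1).
    lba = raid0_logical_block(1, 300, num_disks=2, strip_blocks=256)
    print("affected logical block:", lba)
```

Only the one logical block that maps to the bad sector is affected. Whether that matters depends on whether the file system has anything stored at that logical block.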

In this case, because it's RAID 0, the single disk drive, and with it the logical drive as a whole, will become failed when the threshold of uncorrectable errors is crossed on that one disk drive. This threshold is a value defined by the disk manufacturer and can differ from one disk model to another, or from vendor to vendor, per the specifications of their disk drives.
So to answer your question: if this uncorrectable error is a standalone event and the rest of the media/surface is OK, this drive can run fine for a long, long time.
I have a SCSI drive in a ProLiant server that has had a pre-failure alert (which occurs when the error threshold is crossed) for more than a year, and it still runs fine. This is a test system and it is RAID protected, so I can wait until it fails.


If you are using SmartArray controllers, the ADU report and, if you use the Insight Management agents, the OS log files are the best sources to see how bad the disk(s) are at the moment.
The best course of action, of course, is to back up or manually copy the user data now, ASAP.

HTH

Kris
fawrell
Advisor

Re: Read error in RAID 0

Thanks a lot for the answers.

One more question, just to be sure: will a disk drive that has crossed the threshold of unrecoverable read errors become failed, or will there only be a pre-failure alert on this disk drive?
kris rombauts
Honored Contributor

Re: Read error in RAID 0

When the predefined threshold for a certain error is crossed, a pre-failure alert and the associated SNMP trap and event log message will be generated (assuming you have installed the Insight Management agents).
This will not necessarily flag the disk as bad/failed; there are other error conditions that can do that.

So after you receive the pre-failure notification, the drive can quite happily continue to work, and it can take more errors and more time before the controller flags the disk as bad/failed. Sorry, but I don't have a list of the exact conditions that would actually flag a disk as bad/failed.
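
The distinction between "pre-failure alert" and "failed" can be pictured with a small toy model. This is only a sketch under stated assumptions: the class and function names, the threshold value and the `mark_failed_by_controller` trigger are all hypothetical, and the real conditions under which a controller fails a disk are not listed in this thread.

```python
from dataclasses import dataclass
from enum import Enum


class DriveStatus(Enum):
    OK = "OK"
    PREFAILURE = "pre-failure alert"
    FAILED = "failed"


@dataclass
class DriveModel:
    # Hypothetical number; the real threshold is defined by the disk vendor.
    error_threshold: int
    error_count: int = 0
    # Set by other controller conditions that this thread does not list.
    controller_failed: bool = False

    def record_uncorrectable_error(self):
        self.error_count += 1

    def mark_failed_by_controller(self):
        self.controller_failed = True

    @property
    def status(self):
        if self.controller_failed:
            return DriveStatus.FAILED
        # Crossing the threshold only raises the alert in this model;
        # it does not fail the drive by itself.
        if self.error_count > self.error_threshold:
            return DriveStatus.PREFAILURE
        return DriveStatus.OK


def raid0_logical_drive_failed(drives):
    """RAID 0 has no redundancy, so one failed member fails the whole array."""
    return any(d.status is DriveStatus.FAILED for d in drives)


if __name__ == "__main__":
    bad = DriveModel(error_threshold=2)
    for _ in range(3):
        bad.record_uncorrectable_error()
    print(bad.status.value, raid0_logical_drive_failed([bad, DriveModel(2)]))
```

In this model a drive can sit in the pre-failure state indefinitely while the RAID 0 logical drive stays usable; the array only goes to failed once the controller, for whatever reason, fails one of its member disks.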


HTH

Kris