Problem: dd: reading `/dev/cciss/c0d0': Input/output error - on *mirrored* RAID (DL360g4p Linux AS4)

SOLVED
CD9
Occasional Contributor

I need to pull my 2x 147g SCSIs, and replace them with 2x 300g ones - with the least downtime possible.

My plan was to "DD" everything to a remote 1TB server, swap in the new drives, "DD" everything back, then "expand" the EXT3 partition to use all the 300g space, and restart my server.

However...

# dd if=/dev/cciss/c0d0 bs=655360 | gzip | ssh root@10.0.0.7 'cat >dd_drive_clone_backup.gz'
Password:
dd: reading `/dev/cciss/c0d0': Input/output error
3387+1 records in
3387+1 records out

The copy aborts on an I/O error. This is inconceivable - the server is running fine, and has 2 drives in RAID0 - it's not possible that both drives have failed on the exact same sectors - what's going on?
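
For reference, the record counts dd printed give a rough bound on where the read failed. A minimal sketch of the arithmetic, assuming the "+1" partial record is the short read that ended at the error (failure_window is a made-up helper name, not a standard tool):

```shell
#!/bin/sh
# failure_window FULL_RECORDS BLOCK_SIZE
# Prints "START END": the byte range in which the read error should lie,
# assuming dd read FULL_RECORDS complete blocks and then one short block.
# Hypothetical helper, for illustration only.
failure_window() {
    full=$1
    bs=$2
    start=$((full * bs))   # first byte after the last complete record
    end=$((start + bs))    # the short read stopped somewhere before this
    echo "$start $end"
}

# Values from the transcript above: 3387 full records, bs=655360
failure_window 3387 655360
```

With those values the window starts at byte 2219704320, a little over 2 GiB into the logical drive.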

How do I get all my data off one or both of these drives?

I did try, via trial-and-error, to locate the place after the I/O problems and continue my backup from there, but when I join the 2 pieces together (padded with NULLs for the unreadable stuff), I get an unbootable result.
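
One possible reason a hand-joined image will not boot is that a gap padded to even slightly the wrong length shifts everything after it. As a sketch only: dd's conv=noerror,sync keeps going past read errors and pads each short or failed input block with NULs, so later data stays at its original offset. A small block size limits how much readable data gets zeroed along with the bad area. clone_padded is a hypothetical wrapper name and the example destination path is an assumption:

```shell
#!/bin/sh
# clone_padded SRC DEST [BLOCK_SIZE]
# Copies SRC to DEST, continuing past read errors (noerror) and padding
# every short or failed input block with NULs up to BLOCK_SIZE (sync).
# Hypothetical wrapper, for illustration only.
clone_padded() {
    src=$1
    dest=$2
    bs=${3:-4096}          # small blocks: less good data lost per bad block
    dd if="$src" of="$dest" bs="$bs" conv=noerror,sync
}

# Example (not run here; destination path is an assumption):
#   clone_padded /dev/cciss/c0d0 /mnt/backup/c0d0.img
```

Note that sync also pads the final partial block, so the image can end up slightly larger than the source.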

Any/all advice greatly appreciated!!!

FYI - the "dd" was performed after booting a "live cd" (systemrescuecd), so I was not attempting to "dd" an "in use" hard drive. Also note: when the RHEL AS4u4 O/S *is* booted off that drive, dd of the problem sectors gives the same I/O errors. This appears to be a logical hardware/firmware issue to me?
5 REPLIES
Matti_Kurkela
Honored Contributor
Solution

Re: Problem: dd: reading `/dev/cciss/c0d0': Input/output error - on *mirrored* RAID (DL360g4p Linux AS4)

RAID 0 is *not* a mirror.

RAID 0 joins two or more drives to maximize capacity and performance *at the expense of fault-tolerance*. As the data is striped across the disks, loss of one disk means all data is destroyed. Never use RAID 0 without a good backup strategy, unless the data you're storing is easy to recover from elsewhere.

RAID 0 works by splitting the data into "stripes". In a two-disk RAID 0 system, odd-numbered stripes are stored on one disk and even-numbered on another. A typical stripe size might be something on the order of 4 kilobytes.
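
To make the striping concrete, here is a sketch of how a logical byte offset maps to a disk in an idealized round-robin RAID 0 layout. The 4 KB stripe size and two disks follow the description above. A real controller's on-disk layout may differ, and stripe_location is a made-up name:

```shell
#!/bin/sh
# stripe_location OFFSET STRIPE_SIZE NDISKS
# Prints "DISK DISK_OFFSET" for a logical byte OFFSET, assuming stripes
# are dealt out to the disks in simple round-robin order.
# Hypothetical helper, for illustration only.
stripe_location() {
    offset=$1
    stripe_size=$2
    ndisks=$3
    stripe=$((offset / stripe_size))   # which stripe the byte falls in
    disk=$((stripe % ndisks))          # stripes alternate between disks
    disk_offset=$(( (stripe / ndisks) * stripe_size + offset % stripe_size ))
    echo "$disk $disk_offset"
}

# Consecutive 4 KB stripes land on alternating disks
for off in 0 4096 8192 12288; do
    echo "logical $off -> disk/offset $(stripe_location "$off" 4096 2)"
done
```

Any file longer than one stripe therefore has pieces on both disks.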

RAID 0 tends to magnify the size of disk errors by the number of disks it has: if one of the disks develops a 10-megabyte failed zone, in a two-disk RAID 0 that affects a 20-megabyte region of the logical volume: half of the stripes in that region will be good data from one disk, the rest will be corrupted data from the other disk. If one of the disks is totally lost, _every_ file that is larger than the stripe size is guaranteed to have parts missing.

Worse yet, this shotgun damage effect will increase the possibility that your filesystem metadata will be seriously damaged, making it hard to *find* any files from the disk.

MK
CD9
Occasional Contributor

Re: Problem: dd: reading `/dev/cciss/c0d0': Input/output error - on *mirrored* RAID (DL360g4p Linux AS4)

Sorry - I meant RAID1 (not RAID0) - it *is* definitely mirrored. I've got a pair of 147g SCSIs installed - I see 147g in RAID capacity.
Matti_Kurkela
Honored Contributor

Re: Problem: dd: reading `/dev/cciss/c0d0': Input/output error - on *mirrored* RAID (DL360g4p Linux AS4)

If the RHEL AS 4u4 system has the HP "hpacucli" tool installed, run this command:

hpacucli controller all show config detail

If the hpacucli tool is not installed and you cannot install it, reboot the system and press F8 when the SmartArray controller is spinning up the disks. You'll get to the SmartArray BIOS. Use it to view the state of the physical drives to get more facts about the state of your disks.

The hpacucli tool is available from http://www.hp.com/support on the downloads page for your server model & OS combination.

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&prodTypeId=15351&prodSeriesId=397638&prodNameId=3288140&swEnvOID=2025&swLang=8&mode=2&taskId=135&swItem=MTX-673eb0ec605b4a2bac35ebb18c

MK
Mike Stroyan
Honored Contributor

Re: Problem: dd: reading `/dev/cciss/c0d0': Input/output error - on *mirrored* RAID (DL360g4p Linux AS4)

It seems like the copy would be faster if you asked the RAID controller to do it.

You may be able to just replace one drive and get the controller to rebuild the array using the new larger drive. Once that rebuild completes, you would swap the other drive for a larger one. One more rebuild and you would have two drives with lots of spare space.
I haven't ever attempted that. But the concept is tempting.

The "Configuring Arrays on HP Smart Array Controllers Reference Guide" also has directions on how to split a RAID 1 array in offline mode. Then you could replace one disk and expand the array to include the second drive. That seems more conservative, as it turns one drive into a clean backup copy instead of just unplugging it.
CD9
Occasional Contributor

Re: Problem: dd: reading `/dev/cciss/c0d0': Input/output error - on *mirrored* RAID (DL360g4p Linux AS4)

Experimenting on a spare identical machine (sans error, of course), I "split" an array, and the controller chose to write a 0x1000 block of nulls to the start of all my drives. I did ignore the "warning" about destroying my data, though.

I love the auto-rebuild idea - I'll try that today and see how I go!