Operating System - HP-UX

Replace root mirror disk, are these steps all right?

 
SOLVED
James Lynch
Valued Contributor
Solution

Re: Replace root mirror disk, are these steps all right?

Clay,

I don't doubt that you have successfully used your procedure to replace many faulty disks. I would also say that you have been very lucky, up to this point, never to have seen any corruption.

The problem that I refer to has nothing to do with the tri-state electronics' ability to electrically isolate the disk being replaced, but has everything to do with how LVM handles the removal and insertion of disk devices.

Depending upon the state of the failed/flakey disk when it is removed, LVM may or may not be able to differentiate between the disk that was just removed and the new one that was just inserted. The reason is that LVM, up to this point, has not been told that one of its disks has been replaced. The failed/flakey disk could have been in a power-failed state; when the new replacement disk is inserted, LVM sees its disk at that same H/W path come back from a powerfail. Any I/O that was pending for that disk will now be sent. But wait, this new disk has not yet been prepped for use by LVM; that will happen as soon as you run the vgcfgrestore command. So now you have LVM thinking that the pending I/Os were successfully written to the disk, so it in turn updates the extent maps to indicate that those extents are no longer stale. LVM thinks the extents are now in sync between both disks. This by itself isn't so much of a problem, because vgcfgrestore will mark all of the extents on the new replacement disk as stale and they will get resynced by vgsync or lvsync.
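
As an aside, you can watch the per-extent state LVM keeps with lvdisplay. A minimal sketch, assuming a hypothetical two-way mirrored lvol /dev/vg00/lvol1 (substitute your own lvols):

# Show the logical-to-physical extent map for a mirrored lvol.
# The status columns report each mirror copy of every extent as
# either "current" or "stale".
lvdisplay -v /dev/vg00/lvol1 | more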

You can clearly see that from the time you insert your new disk to the time you run vgcfgrestore, there is a small period where LVM thinks the new disk is valid to use. The real problem comes into play with any I/Os that read from the affected lvol during this window of vulnerability. Because LVM only marks an extent as stale when it cannot write it to a disk, there is a high probability that the failed disk's extent map shows the majority of extents as current, not stale; there will probably be only a handful of extents marked stale. Along comes a read I/O. LVM looks at the extent map stored in kernel memory to determine where to retrieve the data from. Remember that read I/Os are not sent to both mirrors, but only to the least busy mirror, so there is roughly a 50% chance that LVM will send the request to the uninitialized replacement disk. LVM gets its I/O request satisfied by reading garbage from the new disk. That garbage data block is used by the filesystem or application, processed, and then most likely written back out to the lvol. Now that you have read garbage in, that same garbage gets propagated out to both of your mirrors. LVM is happy in that it thinks the data (LVM does not care that the data is garbage) was successfully written to both mirrors. Now you have corrupted data on both disks.

The procedures that I described are consistent with HP's documented procedures for replacing a failed mirrored LVM disk.
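
For anyone following along at home, the general shape of that sequence looks something like the sketch below. This is only a rough outline, not a replacement for the HP documentation; vg00, c0t6d0 and the root-mirror steps are assumptions you must adapt to your own configuration:

# 1. Stop all I/O to the lvols that have extents on the failed disk,
#    then physically swap the disk.
# 2. Confirm the new disk is claimed and its device files exist.
ioscan -fnC disk
insf -e -C disk
# 3. Write the saved LVM headers onto the replacement disk.
vgcfgrestore -n /dev/vg00 /dev/rdsk/c0t6d0
# 4. Reattach the disk to the active volume group.
vgchange -a y /dev/vg00
# 5. Resynchronize the stale extents onto the new mirror copy.
vgsync /dev/vg00          # or lvsync each lvol individually
# 6. For a root mirror, also restore the boot area and BDRA links.
mkboot /dev/rdsk/c0t6d0
lvlnboot -R /dev/vg00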

I hope this helps clear up why there must not be any active I/Os on the lvols.
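
One quick sanity check before pulling the old disk is to make sure nothing is still using the filesystems on the affected lvols. The lvol and mount point below are made-up examples:

# Report processes (with owners) using files on the filesystem
# backed by the lvol's block device.
fuser -u /dev/vg00/lvol5
# The same check can be run against the mount point instead.
fuser -cu /data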

JL
Wild turkey surprise? I love wild turkey surprise!