Operating System - HP-UX

Recovering from disk failure - looking for full filesystem check

 
Steven Hargus_3
Advisor

All -
We are currently in the process of recovering from a widespread disk system failure, and we need to do a filesystem check on all our volumes to verify integrity. Is a 'fsck -o full' a good check? It seems to go rather quickly, so it does not appear to check the entire disk. Any ideas would be appreciated.
Thanks,
Steven

6 REPLIES
A. Clay Stephenson
Acclaimed Contributor

Re: Recovering from disk failure - looking for full filesystem check

That's the whole point of vxfs filesystems -- they are journaled, so that even after a hard crash a log replay is typically all that is needed. If you want to do a complete check and ignore the log, use the -o full,nolog options, but that is seldom needed.
If it ain't broke, I can fix that.
Sanjay_6
Honored Contributor

Re: Recovering from disk failure - looking for full filesystem check

Hi Steven,

Using the -o full,nolog will make sure the fsck checks everything and that the filesystem is consistent.

fsck -F vxfs -o full,nolog /dev/vg_name/rlv_name
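When there are many volumes to check, it may help to generate the commands first and review them before running anything. The following is only a sketch: raw_lv_path and print_fsck_commands are hypothetical helper names, the volume names are placeholders, and it assumes the usual HP-UX LVM convention that the raw device is the logical volume name prefixed with "r". It prints the commands rather than executing them; fsck is normally run against unmounted filesystems.

```shell
# Turn a block logical-volume path into its raw (character) counterpart,
# e.g. /dev/vg01/lvol3 -> /dev/vg01/rlvol3 (assumed LVM naming convention).
raw_lv_path() {
    echo "$(dirname "$1")/r$(basename "$1")"
}

# Print, but do not run, a full fsck command for each logical volume given.
print_fsck_commands() {
    for lv in "$@"; do
        echo "fsck -F vxfs -o full,nolog $(raw_lv_path "$lv")"
    done
}

# Example with placeholder volume names:
# print_fsck_commands /dev/vg01/lvol1 /dev/vg01/lvol2
```

Printing first keeps the destructive step manual: the output can be checked against the real volume list and then pasted or piped to a shell once it looks right.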

Hope this helps.

regds
Steven Hargus_3
Advisor

Re: Recovering from disk failure - looking for full filesystem check

I think that may be what I am looking for. Since we may have corruption within the filesystem, I do not think that simply playing back the log will correct or even detect it.

I will try doing an fsck -o full,nolog on the volumes.

By the way, what difference does it make using raw versus block devices during the fsck?

Thanks,
Steven

A. Clay Stephenson
Acclaimed Contributor

Re: Recovering from disk failure - looking for full filesystem check

Unlike block devices, raw (character) devices bypass the buffer cache completely. Raw devices will also be faster for this use.
If it ain't broke, I can fix that.
Bill Hassell
Honored Contributor

Re: Recovering from disk failure - looking for full filesystem check

The ONLY thing that fsck checks is the directory structure and free-space tables. It never checks the data in files! The VxFS filesystem makes this test very quick, as you've seen, but because fsck doesn't check the data areas, it cannot find or fix bad spots on the disk. fsck just makes sense of the directory structure when it might be partially corrupted. When fsck finishes, you can be assured that the directory entries are consistent. However, the entries may still point to a bad spot on the disk.

The only way to see if a disk is completely readable is to use dd. dd will bypass all the LVM structures and simply read each track on the disk. dd's default blocksize is 512 bytes, WAY TOO SMALL for checking a disk. Always use something like bs=64k or bs=128k for maximum performance, something like this:

dd if=/dev/rdsk/c12t6d0 of=/dev/null bs=128k

If there is a bad spot that can't be read, you'll get an I/O error or errno 5 message. Disk mirroring is mandatory for production systems, whether done inside a smart array controller or in the OS such as Mirror/UX.
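To run the same dd read test across several disks and get a one-line verdict per device, something like the following could be used. It is a minimal sketch: check_readable is a hypothetical helper name, the device names in the example are placeholders, and it relies only on dd returning a non-zero exit status when a read fails. Note that a clean pass means every block was readable, not that the data in those blocks is correct.

```shell
# Read every byte of each device given as an argument and report the result.
# check_readable is a hypothetical helper name, not an HP-UX command.
check_readable() {
    rc=0
    for dev in "$@"; do
        # dd exits non-zero when a read fails (for example with an I/O error);
        # its own messages still go to stderr so the error stays visible.
        if dd if="$dev" of=/dev/null bs=128k; then
            echo "OK      $dev"
        else
            echo "FAILED  $dev"
            rc=1
        fi
    done
    return $rc
}

# Example with placeholder device names; substitute your own raw disk devices:
# check_readable /dev/rdsk/c12t6d0 /dev/rdsk/c12t5d0
```

The function returns non-zero if any device failed, so it can be used in a larger script that stops or alerts on the first bad disk.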


Bill Hassell, sysadmin
Steven Hargus_3
Advisor

Re: Recovering from disk failure - looking for full filesystem check

Bill -
That's exactly the problem we are facing. We have had a failure in the array. Basically, a disk failed in a RAID-5 configuration, then a second disk failed before the first was replaced (a matter of hours). Unfortunately, the second disk failure was not a hard failure, and corruption has occurred. An fsck -o full only checked the directory structures, as you pointed out, and thus does not uncover the corruption in the data.