Operating System - HP-UX

vgcfgrestore problem after replacing mirror

 
nancy rippey
Trusted Contributor

I lost a mirrored non-root disk. The disk was replaced.
I then ran vgcfgrestore, vgchange and vgsync on the volume group. I did not receive any errors from any of these commands. Somehow two of my filesystems became corrupted. On one of them I could not run fsck -F vxfs -o full -y; it returned a bad magic number. I tried a couple of times with no luck. The mirror was also gone. I verified from the previous day's data collection that the lv was mirrored, but I was unable to get it to mount. Finally I removed the lv, recreated it and had it restored. The other fs contained Oracle tables. The tables were there but were corrupt (fsck would not fix them). We are now in the process of restoring the db files. I have replaced mirrors probably a dozen times and have never seen file corruption like this.

Any ideas?
nrip
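
For readers following along, the sequence described in this post can be sketched roughly as below. This is only an illustration: the volume group name and device file are hypothetical placeholders, the script prints the commands rather than running them, and the exact options should be checked against the man pages for your HP-UX release.

```shell
#!/bin/sh
# Sketch only: prints the LVM commands for restoring a replaced mirror disk.
# VG and RAW_DISK are hypothetical placeholders, not values from this thread.
VG="${VG:-/dev/vg01}"
RAW_DISK="${RAW_DISK:-/dev/rdsk/c0t5d0}"

plan_restore() {
    # Write the saved LVM configuration back onto the new disk (raw device).
    echo "vgcfgrestore -n $VG $RAW_DISK"
    # Reactivate the volume group so the replaced physical volume is attached.
    echo "vgchange -a y $VG"
    # Resynchronise the mirror copies onto the new disk.
    echo "vgsync $VG"
}

plan_restore
```

Printing the plan first makes it easier to double-check the device file before anything is written to a disk.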
Patrick Wallek
Honored Contributor

Re: vgcfgrestore problem after replacing mirror

Perhaps the disk that didn't get replaced is also on its way to disk heaven? If you did not receive any errors, then everything SHOULD have been OK. But if the disk that you sync'ed the mirrors from is going bad, that could explain it. I would monitor that disk very, very closely.

Juergen Tappe
Valued Contributor

Re: vgcfgrestore problem after replacing mirror

It might be a stupid question, but are you 100% sure that you replaced the bad and not the good disk?

If you still have the old disk, you might try to reconnect it to, e.g., a test system and try a vgimport -q n from it.
Do you see any stale extents?

I guess this doesn't really help you....

regards
Juergen
Working together
generic_1
Respected Contributor

Re: vgcfgrestore problem after replacing mirror

I would check stm to see if that other disk is bad, and check syslog for other disk errors. Also, are you sure you pulled the right disk? Depending on how many you had, it could be easy to pull the wrong one. I suppose the remaining disk could also have been going bad for a long time and damaged your data, so the sync gave you a copy of the messed-up data.
nancy rippey
Trusted Contributor

Re: vgcfgrestore problem after replacing mirror

I currently do not have any stale extents. If the incorrect disk had been pulled, I would have lost my entire volume group, and that did not happen. Only the corruption on 2 filesystems.
nrip
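
One way to check for stale extents, as discussed above, is to scan the verbose lvdisplay output for the word "stale". A minimal sketch follows; the helper name count_stale and the logical volume path in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch only: counts lines mentioning "stale" in `lvdisplay -v` output on stdin.
# count_stale is a hypothetical helper name, not an LVM command.
count_stale() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so `|| true` keeps the function's exit status clean.
    grep -ci 'stale' || true
}

# Usage on a live system (hypothetical logical volume path):
#   lvdisplay -v /dev/vg01/lvol1 | count_stale
```

A count of zero only says that LVM believes the copies are in sync; it does not prove the data on them is good.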
Mel Burslan
Honored Contributor

Re: vgcfgrestore problem after replacing mirror

Since you mentioned the tables are there but the contents were corrupt, did you consider that something on the Oracle side may have messed them up?

The mirrors not recovering the lost data tells me one of two things: 1) the replaced disk was not the bad one (believe me, it happens more frequently than you might think), or 2) the mirror disk was just a step behind the original disk on its way to the graveyard, and during the process it also bit the dust.

There is also a very remote possibility that the firmware revisions of the disk devices are to blame for the LVM corruption, but that is not very likely.
________________________________
UNIX because I majored in cryptology...
nancy rippey
Trusted Contributor

Re: vgcfgrestore problem after replacing mirror

I opened up a TR with HP support and received the following response. I plan on installing the patch.

Thanks again for the quick update. I believe the problems you experienced were due to a patching issue. A potential problem was addressed in patch PHKL_29602:

CR:JAGae88760
After replacing a failed LVM disk, restoring it with vgcfgrestore(1M) and activating the volume group it belongs to - all according to the normal, recommended procedure - there is a high probability of silent data corruption if any mirrored logical volumes have data on that disk. What is supposed to happen is that the data gets marked "stale" and resync'ed from another mirror, but this isn't being done correctly.

The latest version of this patch is PHKL_30553. There are several dependencies associated with the patch that will need to be installed as well. The ITRC patch database will resolve those dependencies for you and present you with a patch bundle which you can install. An alternative would be to install the latest Quality Pack patch bundle from the latest Support Plus CD. It's available for download from:

Thanks for all the responses
nrip
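
For anyone wanting to see whether the patch named in the HP response is already on a system, a rough sketch is below. It assumes that swlist -l patch lists installed patches on your release; patch_listed is a hypothetical helper name, and a patch that has been superseded by a newer one may not show up under its own ID.

```shell
#!/bin/sh
# Sketch only: succeeds if the given patch ID appears in the text on stdin.
# patch_listed is a hypothetical helper name, not an HP-UX command.
patch_listed() {
    # -F treats the ID as a fixed string, -q suppresses output.
    grep -qF -- "$1"
}

# Usage on a live system (assumes `swlist -l patch` is available there):
#   swlist -l patch | patch_listed PHKL_30553 && echo "PHKL_30553 is listed"
```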