Operating System - HP-UX

Re: HELP!!! SYSTEM DOWN-- MAY LOSE DATA

 
SOLVED
Jeff Patrick
Advisor

HELP!!! SYSTEM DOWN-- MAY LOSE DATA

Getting conflicting info from HP HW and SW support...

Background...system lost power tonight, and upon rebooting, vg02 did not activate.
vg02 consists of 2 hard disks, with mirroring in effect, /dev/dsk/c3t2d0 and /dev/dsk/c7t2d0.

On bootup, /dev/dsk/c7t2d0 does not respond to any command, including dd, diskinfo, pvdisplay, vgcfgrestore, etc., but does show up as 'CLAIMED' on ioscan.
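A minimal sketch of how one might confirm that a disk shown as CLAIMED actually answers I/O, by attempting a single-block read with dd. The helper name `probe_disk` is hypothetical, and the raw device paths in the usage comment are assumed from the block device names in the post. Note that a truly hung disk may block the read instead of returning an error, so this is a quick check, not a diagnosis.

```shell
#!/bin/sh
# Sketch only: probe_disk is a hypothetical helper, not an HP-UX command.
# It tries to read one 1 KB block and reports whether the read succeeded.
probe_disk() {
    dev=$1
    # Discard the data; only dd's exit status matters here.
    if dd if="$dev" of=/dev/null bs=1024 count=1 2>/dev/null; then
        echo "$dev: readable"
    else
        echo "$dev: NOT readable"
        return 1
    fi
}

# Example (raw device paths assumed from the disk names in the post):
#   probe_disk /dev/rdsk/c3t2d0
#   probe_disk /dev/rdsk/c7t2d0
```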

vgchange would not work on vg02 due to this messed up disk. HP had me do a no-quorum vgchange as follows: vgchange -a y -q n /dev/vg02, which took a while, then after giving an error about the bad disk, said vg02 was activated successfully. Now, my lvdisplay of /dev/vg02/uvol2 shows the following:

lvdisplay -v /dev/vg02/uvol2
--- Distribution of logical volume ---
PV Name            LE on PV  PE on PV
/dev/dsk/c3t2d0    508       508

--- Logical extents ---
LE   PV1           PE1  Status 1  PV2  PE2  Status 2
00   /.../c3t2d0   00   stale     ???  00   current
01   /.../c3t2d0   01   stale     ???  01   current

Note: the good disk (c3t2d0) shows stale, while the bad disk, which it can't even name (c7t2d0) shows current--for all extents.

lvdisplay -v -k /dev/vg02/uvol2 shows:

--- Logical extents ---
LE   PV1  PE1  Status 1  PV2  PE2  Status 2
00   0    00   stale     1    00   current
01   0    01   stale     1    01   current
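With 508 extents, reading the status columns by eye is tedious. The following is a rough sketch that tallies stale and current extents per mirror copy from `lvdisplay -v -k` output, assuming the seven-column layout shown above. The function name `summarize_extents` is hypothetical.

```shell
#!/bin/sh
# Sketch only: summarize_extents is a hypothetical helper.
# Reads `lvdisplay -v -k` output on stdin and counts the status of each
# extent for mirror copy 1 (column 4) and mirror copy 2 (column 7).
summarize_extents() {
    awk '
        /--- Logical extents ---/ { in_ext = 1; next }
        /^[ \t]*---/              { in_ext = 0 }
        # Data rows start with a numeric LE index and have 7 fields.
        in_ext && $1 ~ /^[0-9]+$/ && NF >= 7 { c1[$4]++; c2[$7]++ }
        END {
            printf "copy1 stale=%d current=%d\n", c1["stale"], c1["current"]
            printf "copy2 stale=%d current=%d\n", c2["stale"], c2["current"]
        }
    '
}

# Example:
#   lvdisplay -v -k /dev/vg02/uvol2 | summarize_extents
```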

Is there ANY way to salvage these 'stale' extents on the good disk? I can't afford to lose data, and what's the point of having Mirrordisk and mirroring configured if one bad hard disk can wipe out my data anyway? Thanks for any help!!!!

Jeff
8 REPLIES
Peggy Fong
Respected Contributor
Solution

Re: HELP!!! SYSTEM DOWN-- MAY LOSE DATA

Jeff

The output is a little strange. Is the file system mounted and usable? I ask because the output seems to show that LVM thinks one disk has stale extents and cannot even recognize the other disk. This is not a normal condition with mirroring. For example, if this were hot-pluggable and you were absolutely sure that c3t2d0 is the bad disk, then you would reduce your mirrors:
# lvreduce -A n -m 0 /dev/vg02/uvol2 /dev/dsk/c3t2d0 (using the disk that is bad).

You can run lvlnboot -R /dev/vg02 and see if that will correct the readout on the ??? entries.

If c3t2d0 is actually the bad disk and you do this then you've lost your data.

If the disk with the ??? is the bad one then you're hosed - no way to recover the data.

Once you've reduced the mirror off the bad disk, the disk can be replaced if it is hot-pluggable. Then you need to rebuild the disk from scratch, and c3t2d0 will become the mirror.

Another option is to halt the system and replace c3t2d0. Go to single-user mode. Run:
vgcfgrestore -n vg02 /dev/rdsk/c3t2d0
vgchange -a y vg02
If good, then reboot to multi-user mode.

Good Luck.
Peggy
Peggy Fong
Respected Contributor

Re: HELP!!! SYSTEM DOWN-- MAY LOSE DATA

Jeff
I read your note too quickly and now see that you already said that c7t2d0 is the disk that does not respond and the other disk has the stale extents. There was another posting where someone wanted to get their data back. It's a shot in the dark but it might work. I'll see if I can find that post. Sorry for the bad luck.
Peg
Peggy Fong
Respected Contributor

Re: HELP!!! SYSTEM DOWN-- MAY LOSE DATA

Jeff
I found the posting that might help. It is:
http://forums.itrc.hp.com/cm/QuestionAnswer/1,1150,0x7cc06af52b04d5118fef0090279cd0f9,00.html

Just in case you cannot get to it I will copy it here:
Problem:
We replaced a disk with a larger one some time ago. We did not transfer some old information to the new disk, but now the users want it back and our backup tapes have aged out. The disk has had the LV removed using SAM but nothing else (i.e. VG still defined, etc). Is there any way to nondestructively re-define the LV definition to allow access to the data on the disk?
Thanks

Answer by Andy Monks (HP)
Yes, there is; however, it might take a few attempts.

If it was the last lvol in the volume group, just doing an lvcreate of a size greater than or equal to what it was should do.

If it was in the middle, then you'll have to know the size, but as long as no other lvremoves have been done, again an lvcreate (of the exact size) should work.

None of the LVM commands will actually touch the data, so it doesn't matter how many times you attempt this.

What you quite often have to do is create a fake lvol just so that you can get LVM to create a new lvol where the old one was (this is only if you've deleted 2 or more lvols).

This worked for the person requesting the info.

You would have to adapt this to your situation.

I think I would get a complete listing of the lvols and sizes, etc. on the disk.
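One way to capture that listing is to save the name and extent count of each lvol before touching anything. This is a rough sketch that assumes `vgdisplay -v` prints "LV Name" and "Current LE" lines for each logical volume; the helper name `list_lv_sizes` and the output file path are hypothetical.

```shell
#!/bin/sh
# Sketch only: list_lv_sizes is a hypothetical helper.
# Reads `vgdisplay -v` output on stdin and prints "<lv name> <extent count>"
# for each logical volume, assuming "LV Name" and "Current LE" field labels.
list_lv_sizes() {
    awk '
        $1 == "LV" && $2 == "Name"    { name = $3 }
        $1 == "Current" && $2 == "LE" { print name, $3 }
    '
}

# Example (the output path is an assumption; keep it off the affected VG):
#   vgdisplay -v vg02 | list_lv_sizes > /tmp/vg02.layout
```

Recording extent counts rather than megabytes matters here because a re-created lvol has to match the original size exactly to land in the same position.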

Reduce the lvol from c7t2d0. Example:
lvreduce -k -m 0 /dev/vg02/uvol2 1

All that you have now are the stale extents on the disk c3t2d0. lvremove that lvol and then re-create it. Since it won't have a mirror, it should come up as current. If you get it exactly in the position it was before, you might recover your data.

Again, Good Luck.
Let me know what happens.
Peggy
Jeff Patrick
Advisor

Re: HELP!!! SYSTEM DOWN-- MAY LOSE DATA

Peggy,

Thank you so much for all the info you've given me...HP's techs are working on the case right now...I like your idea about lvremoving & lvcreating the lvol to try to salvage the data--there's only 1 lvol per disk. Before HP goes further, would you recommend that I dd the entire good (stale) disk out for safe keeping? Thanks again.
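A minimal sketch of such a safety copy, assuming there is enough free space on a different disk to hold the image. The helper name `image_disk` and the destination path in the usage comment are hypothetical; the raw device path is assumed from the disk name above.

```shell
#!/bin/sh
# Sketch only: image_disk is a hypothetical helper.
# Copies a device (or file) to an image file with dd, then compares the
# two byte for byte so a short or failed copy is not mistaken for a backup.
image_disk() {
    src=$1   # source to copy, e.g. the raw device of the surviving disk
    dst=$2   # image file on a different disk with enough free space
    dd if="$src" of="$dst" bs=1024k 2>/dev/null || {
        echo "dd failed for $src" >&2
        return 1
    }
    if cmp -s "$src" "$dst"; then
        echo "image verified: $dst"
    else
        echo "image differs from $src" >&2
        return 1
    fi
}

# Example (destination path is an assumption):
#   image_disk /dev/rdsk/c3t2d0 /backup/c3t2d0.img
```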
Jeff Patrick
Advisor

Re: HELP!!! SYSTEM DOWN-- MAY LOSE DATA

Peggy, again thank you so much for all your help. I went ahead and dd'd it out just for the heck of it, and then we replaced the bad drive and resync'd, and everything came up. I did have to fsck the lvol in question, as it said it was corrupt when I tried to mount the first time, and then it mounted fine & I could see all my files. Don't ask me how it sync'd using the 'stale' extents, especially when I had HP OS support people telling me there was no way to get them back, but the standard Mirrordisk stuff apparently did its job quite well & I lost no data...Thank you again for all your help.

-Jeff
Peggy Fong
Respected Contributor

Re: HELP!!! SYSTEM DOWN-- MAY LOSE DATA

Jeff
What a great idea to dd the disk - I hadn't thought of that one. At least you didn't need it. So, did you do a vgcfgrestore on the replaced disk? Pretty amazing. Normally mirror software works great, but every once in a while you see some bizarre stuff. I am really glad it worked out without you having to take any extreme measures....
and thanks for all the points (you didn't have to do that!)

Peggy
Les Schuettpelz
Frequent Advisor

Re: HELP!!! SYSTEM DOWN-- MAY LOSE DATA

Dave Fargo using Les's ITRC login..

This one is unusual because the survivor was marked 'stale'. I think this means some information didn't get synced to the disk; possibly you are using async I/O?

Doing an ioscan -f should have normalized the information for the dead disk; it should be able to recognize NO_HW there. I'm not sure if this was tried.

Once this is done, the only way I know to clear up the ??? problem is to do vgreduce -f followed by vgscan. If LVM and the kernel are still out of sync, I think rmsf -a on the dead disk devname is also helpful, then another vgreduce -f / vgscan might have worked.

This is said with the qualifier that the 'stale' status doesn't match our experience, but apparently you were able to recover by other means and any corruption was repairable. I am assuming support has looked over your OS and patches. Good luck.
Jeff Patrick
Advisor

Re: HELP!!! SYSTEM DOWN-- MAY LOSE DATA

Peggy--no problem on the points, I found your help quite valuable, especially at such a stressful time when I was receiving lots of different information/suggestions. And, to be quite honest, I didn't even know if I'd get any responses on the forum that late at night...also, I did do a vgcfgrestore after the disk was replaced, then as soon as I ran vgsync, it automatically saw the previously 'stale' extents as 'current' and synced the new disk using them. I'm still kind of amazed by it.
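For anyone landing here later, the sequence described above can be sketched roughly as follows. The exact arguments used are not given in the thread, so the volume group and device path in the usage comment are assumptions, the `vgchange -a y` step is borrowed from the earlier suggested sequence, and `run` and `resync_replaced_mirror` are hypothetical helper names. The script defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints the commands by default (DRY_RUN=1).
# Set DRY_RUN=0 to execute them on a real HP-UX system.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

resync_replaced_mirror() {
    vg=$1        # volume group name, e.g. vg02
    rawdisk=$2   # raw device file of the replaced disk (assumed path)
    run vgcfgrestore -n "$vg" "$rawdisk"   # restore LVM config to the new disk
    run vgchange -a y "$vg"                # (re)activate the volume group
    run vgsync "$vg"                       # resynchronize the mirrors
    # An fsck of the lvol and a mount followed; the file system type is
    # not stated in the thread, so those steps are left out here.
}

# Example:
#   resync_replaced_mirror vg02 /dev/rdsk/c7t2d0
```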

Dave, thanks also for your help...I did try several lvm commands, i.e. lvreduce & vgreduce, but it wouldn't let me do either because of the state the disk was in. Also, I did run ioscan -f, several times, and each time it showed the disk in question as if there were nothing wrong, i.e. CLAIMED, even though I couldn't do anything else with it.

Thanks again to both of you...

Jeff