vg00 - lvol1 won't re-sync after disk fail

 
Guy Humphreys
Valued Contributor

Dear all,

I had a root vol disk fail on me this morning. One stale extent was on the disk. No problem I thought, just replace the disk and do a vgcfgrestore. Here's what I did:

remove faulty disk
ioscan -fnC disk
rmsf -v -a /dev/rdsk/c3t15d0
put new disk in
ioscan -fnC disk
insf -H (some hardware address)
ioscan again to check - OK
mkboot /dev/rdsk/c3t15d0
mkboot -a "hpux -lq" /dev/rdsk/c3t15d0
vgcfgrestore -n /dev/vg00 /dev/rdsk/c3t15d0
vgchange -a y /dev/vg00
vgsync /dev/vg00

lvols 3 and 2 sync up OK but on lvol1 I get an error message as below:

vgsync: Couldn't re-synchronize stale partitions of the logical volume
vgsync: Couldn't re-synchronize logical volume "/dev/vg00/lvol1".
vgsync: Couldn't re-synchronize volume group "vg00".

I had a quick peruse of this forum and thought that my replacement disk might also be faulty (despite having never been used before!) so I found another one and gave this one a go - exact same problem!!

I have tested this latest disk with dd and got the below response:

dd if=/dev/rdsk/c3t15d0 of=/dev/null bs=1024k
4095+1 records in
4095+1 records out

To me that says the disk is good.

What is going wrong? Do I need to lvreduce all the lvols and then re-create and re-mirror manually?

Thanks for any advice
Guy




'If it ain't broke, don't fix it!'
Torsten.
Acclaimed Contributor

Re: vg00 - lvol1 won't re-sync after disk fail

What does this show?

# lvdisplay -v /dev/vg00/lvol1|grep stale

BTW,
in your command

mkboot /dev/rdsk/c3t15d0

the "-l" option is missing, I guess. But it's not related to your problem.

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

Guy Humphreys
Valued Contributor

Re: vg00 - lvol1 won't re-sync after disk fail

Hi Torsten,

this is what I get:

$lvdisplay -v /dev/vg00/lvol1 |grep stale
LV Status available/stale
0000 /dev/dsk/c1t15d0 0000 current /dev/dsk/c3t15d0 0000 stale
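
The mirror-copy row above can be picked apart mechanically when an lvol has many extents. This is only a minimal sketch: it assumes each row has the form "LE PV1 PE1 status1 PV2 PE2 status2" as in the line quoted here, and `list_stale_copies` is a hypothetical helper name, not an LVM command.

```shell
# Sample row copied from the lvdisplay -v output in this thread.
sample='0000 /dev/dsk/c1t15d0 0000 current /dev/dsk/c3t15d0 0000 stale'

# list_stale_copies (hypothetical helper): read lvdisplay -v extent rows on
# stdin and print "LE PV PE" for every mirror copy whose status is "stale".
# Assumes the row layout: LE, then (PV PE status) repeated once per copy.
list_stale_copies() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 4 {
        for (i = 2; i + 2 <= NF; i += 3)
            if ($(i + 2) == "stale") print $1, $i, $(i + 1)
    }'
}

printf '%s\n' "$sample" | list_stale_copies
# prints: 0000 /dev/dsk/c3t15d0 0000
```

On a live system the same function could be fed with `lvdisplay -v /dev/vg00/lvol1 | list_stale_copies`, which would show at a glance that only the copy on c3t15d0 is stale.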

cheers
Guy
Torsten.
Acclaimed Contributor

Re: vg00 - lvol1 won't re-sync after disk fail

Another bad disk?

Have a look at the syslog and run the diagnostic info tool for any write errors.

It could be a bad block, but bad block relocation is not allowed in lvol1.

Hope this helps!
Regards
Torsten.

Guy Humphreys
Valued Contributor

Re: vg00 - lvol1 won't re-sync after disk fail

this is from syslog:

Apr 27 12:34:11 spnm vmunix: LVM: vg[0]: pvnum=1 (dev_t=0x1c03f000) is POWERFAILED
Apr 27 12:34:11 spnm vmunix: Disk at 10/4/12.15.0 is not responding. Check device, power, and cables.

I would be surprised if it was the disk; this is the third disk in total that I have tried, with the same problem each time.

Also, how come I can lvdisplay all the other lvols?

pvdisplay -v output:

--- Physical volumes ---
PV Name /dev/dsk/c3t15d0
VG Name /dev/vg00
PV Status available
Allocatable yes
VGDA 2
Cur LV 9
PE Size (Mbytes) 4
Total PE 1023
Free PE 7
Allocated PE 1016
Stale PE 892
IO Timeout (Seconds) default

--- Distribution of physical volume ---
LV Name LE of LV PE for LV
/dev/vg00/lvol3 50 50
/dev/vg00/lvol2 50 50
/dev/vg00/lvol1 25 25
/dev/vg00/lvol8 276 276
/dev/vg00/lvol5 78 78
/dev/vg00/lvol9 323 323
/dev/vg00/lvol6 24 24
/dev/vg00/lvol4 5 5
/dev/vg00/lvol7 185 185

--- Physical extents ---
PE Status LV LE
0000 stale /dev/vg00/lvol1 0000
0001 current /dev/vg00/lvol1 0001
0002 current /dev/vg00/lvol1 0002
0003 current /dev/vg00/lvol1 0003
0004 current /dev/vg00/lvol1 0004
0005 current /dev/vg00/lvol1 0005
0006 current /dev/vg00/lvol1 0006
0007 current /dev/vg00/lvol1 0007
0008 current /dev/vg00/lvol1 0008
0009 current /dev/vg00/lvol1 0009


Only the first extent of lvol1 is stale!
Could it be the bay and not the disk?
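
For a longer listing, the "Physical extents" section can be tallied per logical volume rather than read by eye. The following is a rough sketch, assuming rows of the form "PE status LV LE" as shown above; `count_stale_by_lv` is a hypothetical helper name.

```shell
# count_stale_by_lv (hypothetical helper): read the "Physical extents" rows
# of pvdisplay -v on stdin and print "<LV> <stale>/<total> stale" per LV.
# Assumes four-field rows: PE number, status, LV path, LE number.
count_stale_by_lv() {
    awk '$1 ~ /^[0-9]+$/ && NF == 4 && $3 ~ /^\/dev\// {
        total[$3]++
        if ($2 == "stale") stale[$3]++
    }
    END {
        for (lv in total) printf "%s %d/%d stale\n", lv, stale[lv], total[lv]
    }' | sort
}

# The ten rows quoted in this post.
count_stale_by_lv <<'EOF'
0000 stale /dev/vg00/lvol1 0000
0001 current /dev/vg00/lvol1 0001
0002 current /dev/vg00/lvol1 0002
0003 current /dev/vg00/lvol1 0003
0004 current /dev/vg00/lvol1 0004
0005 current /dev/vg00/lvol1 0005
0006 current /dev/vg00/lvol1 0006
0007 current /dev/vg00/lvol1 0007
0008 current /dev/vg00/lvol1 0008
0009 current /dev/vg00/lvol1 0009
EOF
# prints: /dev/vg00/lvol1 1/10 stale
```

Piping the full `pvdisplay -v /dev/dsk/c3t15d0` output through the function would give one line per lvol, which makes it easier to see where the stale extents counted in the summary actually sit.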

Guy
Robert-Jan Goossens
Honored Contributor
Solution

Re: vg00 - lvol1 won't re-sync after disk fail

Hi Guy,

I would try to recreate your mirror from scratch, using pvcreate -f to force the creation of new LVM reserved areas on the disk.

--
# for X in 1 2 3 4 5 6 7 8 9
> do
> lvreduce -m 0 /dev/vg00/lvol${X} /dev/dsk/c3t15d0
> done

# vgreduce /dev/vg00 /dev/dsk/c3t15d0

# dd if=/stand/vmunix of=/dev/rdsk/c3t15d0 bs=4096

# pvcreate -f -B /dev/rdsk/c3t15d0

# mkboot /dev/rdsk/c3t15d0

# mkboot -a "hpux -lq (;0)/stand/vmunix" /dev/rdsk/c3t15d0

# vgextend /dev/vg00 /dev/dsk/c3t15d0

# for X in 1 2 3 4 5 6 7 8 9
> do
> lvextend -m 1 /dev/vg00/lvol${X} /dev/dsk/c3t15d0
> done

# lvlnboot -b /dev/vg00/lvol1

# lvlnboot -s /dev/vg00/lvol2

# lvlnboot -r /dev/vg00/lvol3

# lvlnboot -v

# setboot
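
Before running a sequence like this against a root volume group, it can help to print the exact commands for review first. The sketch below only echoes the LVM and mkboot lines and executes nothing; `print_remirror_plan` is a hypothetical helper, vg00 is hard-coded to match this thread, and the dd, lvlnboot and setboot steps are deliberately left out. The lvol list 1 to 9 is taken from the pvdisplay output posted earlier, which shows nine logical volumes on the disk.

```shell
# print_remirror_plan DISK LVOL_NUMBER...   (hypothetical helper)
# DISK is the bare device name, e.g. c3t15d0. Nothing is executed; each
# command is only printed so the plan can be checked by eye.
print_remirror_plan() {
    disk=$1
    shift
    # Drop the mirror copy of every lvol from the disk being replaced.
    for n in "$@"; do
        echo "lvreduce -m 0 /dev/vg00/lvol$n /dev/dsk/$disk"
    done
    echo "vgreduce /dev/vg00 /dev/dsk/$disk"
    # Recreate the disk as a bootable physical volume.
    echo "pvcreate -f -B /dev/rdsk/$disk"
    echo "mkboot /dev/rdsk/$disk"
    echo "mkboot -a \"hpux -lq (;0)/stand/vmunix\" /dev/rdsk/$disk"
    echo "vgextend /dev/vg00 /dev/dsk/$disk"
    # Re-mirror in lvol order so the boot, swap and root volumes come first.
    for n in "$@"; do
        echo "lvextend -m 1 /dev/vg00/lvol$n /dev/dsk/$disk"
    done
}

print_remirror_plan c3t15d0 1 2 3 4 5 6 7 8 9
```

Printing the plan first makes it easy to spot a logical volume missing from the list, since vgreduce will refuse to remove a disk that still holds extents.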

Best regards,
Robert-Jan
Guy Humphreys
Valued Contributor

Re: vg00 - lvol1 won't re-sync after disk fail

Thanks Robert - I have followed the manual process and it has worked.

However, I did get an I/O error on the dd of /stand/vmunix. Is this step needed?

I usually follow the procedure set out in the "Replacing boot mirror disk" doc from Vantive (17/12/2003), which does not mention the /stand/vmunix part.

cheers
Guy
TwoProc
Honored Contributor

Re: vg00 - lvol1 won't re-sync after disk fail

This doesn't fix your problem, but I noticed something in your post.

In my steps for hot-swap disks, I've never bothered running rmsf and insf when the new disk is going straight back into the same slot. It's not necessary if you're going back into the same hardware location (as you did in this case).

rmsf -v -a /dev/rdsk/c3t15d0
put new disk in
ioscan -fnC disk
insf -H (some hardware address)

The above could have just been:
pull disk
wait a while (minute or so),
insert new one,
ioscan -fnC disk to see it's there.

mkboot /dev/rdsk/c3t15d0
mkboot -a "hpux -lq" /dev/rdsk/c3t15d0
*vgcfgrestore -n /dev/vg00 /dev/rdsk/c3t15d0
vgsync vg00

*Note: in my experience on HP-UX 11i, the vgcfgrestore can be skipped; just run the vgsync and it will restore the vgcfg for you. I found this out by forgetting to run it. It seems that this feature was made available sometime in 11i (I don't think it was there in 11.0).

We are the people our parents warned us about --Jimmy Buffett