vgchange -a y errors

 
Brian Killeen_1
Advisor

I have an L-Class machine that keeps falling over and giving I/O errors on reboot; sometimes it does not reboot at all. Reseating the boot drive seems to help, which made me think it might be an HDD problem. I mirrored the boot disk and the machine is now staying up. An ioscan now shows the original boot disk as NO_HW, indicating a problem. I then ran the following procedure:

vgcfgbackup /dev/vg00

Removed the drive
Installed a new drive

vgcfgrestore -n /dev/vg00 /dev/rdsk/c1t2d0

vgchange -a y

At this point I get an error saying

vgchange -a y
vgchange: Warning: Couldn't attach to the volume group physical volume "/dev/dsk/c1t2d0":
Cross-device link
Volume group "/dev/vg00" has been successfully changed.
Volume group "/dev/vg01" has been successfully changed.


Running a dd command does not give errors and diskinfo is OK as well. I am thinking that the SCSI interface connector in the machine might be in trouble, but I am not sure. Has anyone seen this error before, or know of a resolution?

HPUX 11.00
James R. Ferguson
Acclaimed Contributor

Re: vgchange -a y errors

Hi Brian:

Doing a 'vgcfgbackup' when you have a disk you can't access is not desirable.

Remember that the default action during LVM maintenance is to automatically run 'vgcfgbackup' unless explicitly overridden.

Cross-device link LVM errors occur when a physical volume belongs to more than one volume group. A 'strings /etc/lvmtab' will reveal this. If this is the case, do:

# mv /etc/lvmtab /etc/lvmtab.old
# vgscan -v
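
A minimal sketch of the first check, assuming the usual 'strings /etc/lvmtab' layout in which each volume group path is followed by the /dev/dsk paths of its physical volumes. The function name find_shared_pvs is made up for this illustration; it only reads text and changes nothing.

```shell
# Hypothetical helper: reads `strings /etc/lvmtab` output on stdin.
# Assumes each VG line (/dev/vgNN) is followed by its PV lines (/dev/dsk/...).
find_shared_pvs() {
    awk '
        /^\/dev\/dsk\// {
            # PV line: record each (PV, VG) pairing once
            pv = $0
            if (vg != "" && !((pv, vg) in seen)) {
                seen[pv, vg] = 1
                count[pv]++
                vgs[pv] = vgs[pv] " " vg
            }
            next
        }
        /^\/dev\// { vg = $0 }    # any other /dev path starts a new VG
        END {
            # report only PVs that were seen under more than one VG
            for (pv in count) {
                if (count[pv] > 1) { print pv ":" vgs[pv] }
            }
        }
    '
}

# Usage on the affected host:
#   strings /etc/lvmtab | find_shared_pvs
```

No output means no physical volume path was found under two volume groups.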

The other reason for cross-device link errors is a difference between the LVM information on the physical disk and that recorded in the '/etc/lvmtab' file. This can indicate corruption of the LVM information on the physical disk.

If this is the case, list the contents of the '/etc/lvmconf/vgNN.conf' file to validate it *and* see if the previous version is correct and can be applied:

# vgcfgrestore -n /dev/vgNN -l
# vgcfgrestore -f /etc/lvmconf/vgNN.conf.old -l

Regards!

...JRF...


Clemens van Everdingen
Honored Contributor

Re: vgchange -a y errors

Hi,

Try correcting the problem with this procedure.

The device file names are examples!

Try booting the system in LVM maintenance mode from the alternate root mirror disk to perform the LVM recovery.

Interrupt the boot process and, from the command menu, type the boot command specifying the hardware path of the alternate boot disk, as in the following example, where PATH is the physical path of the alternate boot disk:

bo PATH

Answer Y if prompted to interact with ISL.

At the ISL prompt, enter the hpux command to boot into LVM maintenance mode using the no-quorum option:

ISL> hpux -lm -lq

Restore the LVM configuration to the replaced primary root disk from the vgcfgbackup of the root volume group LVM configuration.

For example, if c0t6d0 is the primary root disk replaced:

# vgcfgrestore -n /dev/vg00 /dev/rdsk/c0t6d0

Execute the mkboot and lvlnboot commands to add the boot information to the primary root disk and synchronize the BDRA, as in the following example:

# mkboot /dev/rdsk/c0t6d0
# mkboot -a "hpux" /dev/rdsk/c0t6d0
# lvlnboot -R /dev/vg00

Activate the root volume group:

# vgchange -a y /dev/vg00

The command will display "successfully changed" if the volume group is successfully activated.

NOTE: If the vgchange command fails with "couldn't attach to volume group" or "cross-device link" errors for other physical volumes belonging to the root volume group, use the vgcfgrestore command to restore the LVM configuration to each physical volume referenced in these error messages as well. Before executing vgcfgrestore, you can run the following command to display the LVM configuration information from the /etc/lvmconf/vg00.conf file and verify that the physical volume in question is listed:

# vgcfgrestore -n vg00 -l

NOTE: The LVM kernel code detects whether a failed physical volume (PV) has become available and performs an automatic resync of the extents on this PV when the vgchange command is executed with either the -a y or -a e option. If LVM does not see that the PV failed, no automatic synchronization of mirrored logical volumes is performed; in those instances, vgchange must be followed by a vgsync, as in the following example:

# vgsync /dev/vg00

After the mirrors are resynchronized by the vgchange or vgsync command, reboot the system and verify that it can boot from the original root disk.


C.
The computer is a great invention, there are as many mistakes as ever, but they are nobody's fault !
Brian Killeen_1
Advisor

Re: vgchange -a y errors

Thanks for your help guys...

James - a strings on the /etc/lvmtab file shows no conflicts between hardware addresses. I tried moving the lvmtab file and rerunning vgscan -v. This generated a new file, but it only lists one drive attached to vg00. A vgchange -a y works OK, but it no longer attempts to access the problem drive, so I cannot do a vgsync.

The vg00.conf.old file is also the same as the /etc/lvmconf/vg00.conf file.


Regarding booting the machine into LVM maintenance mode: I would love to, but the machine is mission critical and the work needs to be done online if at all possible.

Would it be possible to unmirror the drives and remirror the contents after...?



S.K. Chan
Honored Contributor

Re: vgchange -a y errors

I think your mistake was running vgcfgbackup on vg00 before you removed and replaced the disk. Once vgcfgbackup has run, the state of vg00 is backed up with only one known disk. So when you later vgcfgrestore to c1t2d0, you are in fact restoring LVM information that excludes c1t2d0 from vg00, hence activating it causes problems. What does ..
# vgcfgrestore -n vg00 -l
show? Does it show two disks or one? If it only shows one disk, then my comment above is correct and you have to find an older or previously backed-up vg00.conf file that shows both disks in order for this to work.
Brian Killeen_1
Advisor

Re: vgchange -a y errors

SK - Thanks for your idea..

the output from that command is as follows

# vgcfgrestore -n vg00 -l
Volume Group Configuration information in "/etc/lvmconf/vg00.conf"
VG Name /dev/vg00
---- Physical volumes : 2 ----
/dev/rdsk/c1t2d0 (Bootable)
/dev/rdsk/c1t0d0 (Bootable)

Both drives are listed in this output.
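
The same check can be scripted. This is only a sketch, assuming the listing format shown above, with each raw device path at the start of its own line; pv_in_backup is a made-up name, not an LVM command.

```shell
# Hypothetical helper: reads `vgcfgrestore -n vgNN -l` output on stdin and
# succeeds only if the given raw device is listed among the physical volumes.
pv_in_backup() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }'
}

# Usage on the affected host:
#   vgcfgrestore -n vg00 -l | pv_in_backup /dev/rdsk/c1t2d0 && echo "listed"
```

The exit status carries the answer, so the helper can gate a later step in a script.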
S.K. Chan
Honored Contributor

Re: vgchange -a y errors

Let's try to fix this .. I think more info would help. Can you provide these ..
# vgdisplay -v vg00
# lvdisplay -v /dev/vg00/lvolX
==> Any one of the existing lvol would do.
# strings /etc/lvmtab
==> Only need the vg00 part
# /etc/diskinfo /dev/rdsk/c1t2d0
# /etc/diskinfo /dev/rdsk/c1t0d0
# lvlnboot -v
I get the feeling we are dealing with a "ghost disk" situation at this point. Anyway, your output will determine that.
S.K. Chan
Honored Contributor

Re: vgchange -a y errors

Sorry .. one more ..
# echo 0x2010?2X | adb /dev/dsk/c1t2d0
# echo 0x2010?2X | adb /dev/dsk/c1t0d0
This is to confirm whether both disks have the same VGID.
Brian Killeen_1
Advisor

Re: vgchange -a y errors

I have attached a file with answers to your commands...

cheers

Brian
S.K. Chan
Honored Contributor

Re: vgchange -a y errors

Yes, definitely a "ghost disk". But before we fix this: the disk c1t2d0 seems to have a different VGID. Has it been used in a different VG before? And when you replaced the disk, did you run pvcreate on it? Anyway ..
We need to get rid of c1t2d0, which is the ghost disk (since its extents show ??? in the lvdisplay output).
Run this .. (starting with the first lvol)
# lvdisplay -v -k /dev/vg00/lvol1 |more
==> In the "PV1" and "PV2" columns, the ??? and /dev/dsk/c1t0d0 should now be replaced by a key value (either 0 or 1). I believe the ??? should be replaced by 0 and /dev/dsk/c1t0d0 by 1. If this is correct, proceed ..
# lvreduce -m 0 -k /dev/vg00/lvol1 0
==> What we are doing here is reducing the "ghost extents" by using the key value.
# lvdisplay -v /dev/vg00/lvol1
==> Now the "problematic" PV should be out.
Repeat the "lvreduce" step above for the rest of the lvols. CAUTION: before each lvreduce, run lvdisplay -v -k first to make sure you're removing the right PV (i.e. using the right key value). I'm sure all of the other lvols should have 0 as the key value, but better safe than sorry.
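
To keep the repetition honest, here is a dry-run sketch that only prints the check and reduce commands for each lvol, using the same command forms as above. The function name print_lvreduce_plan is made up, and nothing is executed against LVM.

```shell
# Hypothetical dry-run helper: prints, but does not run, the check and
# reduce commands for each logical volume. First argument is the PV key.
print_lvreduce_plan() {
    key=$1
    shift
    for lv in "$@"; do
        printf 'lvdisplay -v -k %s | more\n' "$lv"
        printf 'lvreduce -m 0 -k %s %s\n' "$lv" "$key"
    done
}

# Usage: review the output, confirming the key in each lvdisplay listing
# before running the matching lvreduce by hand.
#   print_lvreduce_plan 0 /dev/vg00/lvol1 /dev/vg00/lvol2 /dev/vg00/lvol3
```

Printing rather than executing is deliberate: the key value has to be confirmed per lvol before anything is reduced.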

Next step ..
# mv /etc/lvmtab /etc/lvmtab.org
# vgscan -v
# vgreduce -f vg00
==> Rebuild lvmtab and force removal of any missing PV.

At this point vg00 should be clean.
# vgdisplay -v vg00
==> The ActPV and CurPV should now match.
==> You should not see anymore stale extents.
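
A small sketch for the PV count comparison, assuming the "Cur PV" and "Act PV" labels start their lines in the vgdisplay header with the count as the last field. check_pv_counts is a made-up name.

```shell
# Hypothetical helper: reads `vgdisplay -v vgNN` output on stdin and compares
# the "Cur PV" and "Act PV" counts. Assumes those labels start the line.
check_pv_counts() {
    awk '
        /^Cur PV/ { cur = $NF }
        /^Act PV/ { act = $NF }
        END {
            if (cur == "" || act == "") {
                print "counts not found"
            } else if (cur == act) {
                print "match: " cur
            } else {
                print "mismatch: Cur PV " cur ", Act PV " act
            }
        }
    '
}

# Usage on the affected host:
#   vgdisplay -v vg00 | check_pv_counts
```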

If everything above goes well, your final step is to remirror your vg00 with c1t2d0 properly, and I assume you have the steps for that?




James R. Ferguson
Acclaimed Contributor

Re: vgchange -a y errors

Hi (again) Brian:

Several comments. I agree with SK. The VGIDs you posted are clearly different, but since they were derived with 'adb' they reflect what the kernel thinks it has. I prefer to read the disk directly with:

# INFO=`xd -An -j8200 -N16 -tx /dev/rdsk/cXtYdZ`
# VGID=`echo $INFO|awk '{print $3 $4}'`
# echo $VGID
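
To compare two disks this way, the awk step can be wrapped in a pair of helpers. This is a sketch only: the function names are made up, and it assumes the xd command prints four hex words on one line, as the awk above implies.

```shell
# Hypothetical helpers built on the awk step quoted above. vgid_from_words
# reads the four hex words printed by the xd command on stdin and joins
# words 3 and 4, as the original awk does.
vgid_from_words() {
    awk 'NF >= 4 { print $3 $4; exit }'
}

# Reports whether two VGID strings are identical.
compare_vgids() {
    if [ "$1" = "$2" ]; then
        echo "same VGID: $1"
    else
        echo "different VGIDs: $1 vs $2"
    fi
}

# Usage on the affected host:
#   A=$(xd -An -j8200 -N16 -tx /dev/rdsk/c1t0d0 | vgid_from_words)
#   B=$(xd -An -j8200 -N16 -tx /dev/rdsk/c1t2d0 | vgid_from_words)
#   compare_vgids "$A" "$B"
```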

In any event, I think you should plan a reboot; make sure that you can read *both* devices with 'diskinfo':

# diskinfo /dev/rdsk/c1t2d0
# diskinfo /dev/rdsk/c1t0d0

...and that these return non-zero size information indicating a good disk.
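
A sketch of the size check, assuming diskinfo prints a line of the form "size: <number> Kbytes". That format is an assumption here, and disk_size_kb is a made-up name.

```shell
# Hypothetical helper: reads `diskinfo` output on stdin and prints the number
# on the "size:" line, or 0 if no such line is present.
disk_size_kb() {
    awk '
        $1 == "size:" { print $2; found = 1; exit }
        END { if (!found) { print 0 } }
    '
}

# Usage on the affected host:
#   diskinfo /dev/rdsk/c1t2d0 | disk_size_kb
```

A result of 0 would suggest the disk is not answering properly.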

Then, repeat the process of recreating '/etc/lvmtab' with 'vgscan', having first moved the current version to ".old".

Then, use the LVM recovery guide (below) and follow the "Removing a Ghost Disk" section:

http://us-support3.external.hp.com/iv/data/documents/DE_SW_UX_swrec_EN_01_E/LVM.pdf

I favor a reboot to give the kernel a chance to collect fresh information about the I/O subsystem configuration.

Regards!

...JRF...
Brian Killeen_1
Advisor

Re: vgchange -a y errors

Had run a pvreduce prior to seeing this reply in an attempt to clean it. I also added a clean drive, so the VGIDs are now the same. Ran the lvreduce, which worked a treat in removing the mirror copies, except on one volume:

/dev/vg00/lvol2, which is the swap partition. This is also the partition showing up as correctly synced in the output of
vgdisplay -v vg00

LV Name /dev/vg00/lvol2
LV Status available/syncd
LV Size (Mbytes) 1024
Current LE 256
Allocated PE 512
Used PV 1



Any ideas about getting around this? It looks like it may require a reboot, which I will need to organise. All other ideas welcome.
S.K. Chan
Honored Contributor
Solution

Re: vgchange -a y errors

What does ..
# lvdisplay -v /dev/vg00/lvol2
show? Even though it only uses 1 PV, it seems to indicate that all its extents are mirrored (i.e. Allocated PE is twice Current LE). Are you not able to reduce it even with the key value as the parameter?
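
The arithmetic behind that observation can be sketched as a filter: mirror copies = Allocated PE / Current LE - 1. It assumes the "Current LE" and "Allocated PE" labels with the value as the last field, as in the stanza Brian posted; mirror_copies is a made-up name.

```shell
# Hypothetical helper: reads an lvdisplay or vgdisplay stanza for one logical
# volume on stdin and prints (Allocated PE / Current LE) - 1.
mirror_copies() {
    awk '
        /Current LE/   { le = $NF }
        /Allocated PE/ { pe = $NF }
        END {
            if (le > 0) {
                print pe / le - 1
            } else {
                print "unknown"
            }
        }
    '
}

# Usage on the affected host:
#   lvdisplay /dev/vg00/lvol2 | mirror_copies
```

With the lvol2 figures posted above (256 LE, 512 PE) this gives 1, meaning one mirror copy is still allocated.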
Brian Killeen_1
Advisor

Re: vgchange -a y errors

SK - lvol2 decided to play ball this morning. Perhaps I was mistyping, although the same command worked for the other volumes. Either way, I was then able to remove the physical volume from the volume group, and I will be rebooting the machine at 12.30 today to make sure all is OK. Will let you know. Thanks for your help on this issue.

later

Brian