Operating System - HP-UX

Taming of the Shrew (Ghost Disk)

 
SOLVED
Ralph Grothe
Honored Contributor

Taming of the Shrew (Ghost Disk)

Hi,

I lost a mirror root/boot disk from vg00 on an ol' D-Class.
This must have upset the system so much that it panicked and could hardly be convinced to reboot into the normal runlevel, even when selecting the alternate boot device and checking things in maintenance mode with no VGs activated (e.g. it didn't want to mount /stand).
Anyway, I brought the system back to somewhat impaired, but otherwise normal, operation.
Now I encounter the usual trouble deriving from discrepancies between the kernel and the lvmtab metadata.
So when I got back from the console to my PC, where I had access to the HP-UX Software Recovery Cookbook (i.e. Chapter 16, LVM), I proceeded as outlined in the "Removing A Ghost Disk" section.
None of the first-hand hacks helped.
I was only able to lvreduce the mirrors via the PV key hack (described therein).
lvdisplay no longer reports the failed PV as part of the LVs.

# lvdisplay -v /dev/vg00/lvol[1-9] 2>/dev/null|grep c1t0d0
#
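For reference, the PV key hack from the cookbook boiled down to something like this per LV (a sketch only; lvol1 and key 4 are just examples, the real keys come from the lvdisplay -v -k output of each affected LV, and the key is given to lvreduce in place of the PV path):

# lvdisplay -v -k /dev/vg00/lvol1 | more     # shows PV keys instead of PV names
# lvreduce -m 0 -A n /dev/vg00/lvol1 4       # drop the mirror copy held by PV key 4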

The errors I swept under the carpet via the redirection above all relate to the ghost disk and repeatedly look like this:

lvdisplay: Warning: couldn't query physical volume "/dev/dsk/c1t0d0":
The specified path does not correspond to physical volume attached to
this volume group
lvdisplay: Warning: couldn't query all of the physical volumes.

As, from LVM's point of view, no LEs on c1t0d0 are occupied anymore, thanks to the successful lvreduce, in theory I should now be able to run "vgreduce -f vg00".
But this doesn't work; instead:

# vgreduce -f vg00
vgreduce: Couldn't query physical volume "/dev/dsk/c1t0d0":
The specified path does not correspond to physical volume attached to
this volume group

I also tried in vain to have a new lvmtab created through vgscan.
But it won't get rid of c1t0d0.

I supplied a new disk for the failed one (hot-swappable) and tried things like

pvcreate -f -B /dev/rdsk/c1t0d0

or

vgcfgrestore -n vg00 /dev/rdsk/c1t0d0

But these don't change a thing.

Any other hacks I could try?

Rgds.
Ralph
Madness, thy name is system administration
13 REPLIES
RAC_1
Honored Contributor

Re: Taming of the Shrew (Ghost Disk)

Doing vgreduce -f vg00 should have taken care of it. What is the OS version??

What does the following command say?
strings /etc/lvmtab

Once again, do an lvdisplay on each of the LVs in vg00 and check whether any of them lists the faulty disk.
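For example, something like this one-liner checks all of vg00's LVs in one go (a sketch; adjust the glob to the LV names actually present):

# for lv in /dev/vg00/lvol*; do echo "== $lv"; lvdisplay -v $lv | grep c1t0d0; done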

Anil
There is no substitute to HARDWORK
Ralph Grothe
Honored Contributor

Re: Taming of the Shrew (Ghost Disk)

RAC,

meanwhile I'm a step further.
I had to move /etc/lvmtab completely out of the way (or remove it) and rerun vgscan.
This led to a new lvmtab with no c1t0d0 in vg00.
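Concretely, that was roughly this (the lvmtab backup name is just an example); the strings check below shows the result:

# mv /etc/lvmtab /etc/lvmtab.old
# vgscan -v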

# strings /etc/lvmtab|head
/dev/vg00
U8)A
/dev/dsk/c0t5d0
/dev/dsk/c0t8d0
/dev/dsk/c0t9d0
/dev/dsk/c2t0d0
/dev/vg01
/dev/dsk/c1t1d0
/dev/dsk/c2t1d0
/dev/vg02

However, I missed an important step which reconciles the mismatch between kernel and LVM, i.e.

# vgchange -a y vg00
Volume group "vg00" has been successfully changed.

No errors now

# vgdisplay vg00 >/dev/null

# vgdisplay -v vg00|grep PV\ Name
PV Name /dev/dsk/c0t5d0
PV Name /dev/dsk/c0t8d0
PV Name /dev/dsk/c0t9d0
PV Name /dev/dsk/c2t0d0


Now I have to reintegrate the PV again...
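For a mirrored boot disk that would, roughly, be the following sequence (a sketch, assuming c1t0d0 is the replacement and lvol1-lvol8 are the mirrored LVs; the mkboot/lvlnboot steps apply only because it is a boot disk):

# pvcreate -f -B /dev/rdsk/c1t0d0            # make it a bootable PV again
# vgextend vg00 /dev/dsk/c1t0d0              # take it back into the VG
# mkboot /dev/rdsk/c1t0d0                    # write the LIF boot area
# mkboot -a "hpux -lq" /dev/rdsk/c1t0d0      # set the AUTO file (quorum override)
# for lv in /dev/vg00/lvol*; do lvextend -m 1 $lv /dev/dsk/c1t0d0; done   # re-mirror
# lvlnboot -R /dev/vg00                      # refresh boot/root/swap/dump information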
Madness, thy name is system administration
Ralph Grothe
Honored Contributor

Re: Taming of the Shrew (Ghost Disk)

Hm, this is annoyingly stubborn.
I still fail to get a vgcfgbackup of vg00 :-(

# vgcfgbackup vg00
vgcfgbackup: /etc/lvmtab is out of date with the running kernel:Kernel indicates
5 disks for "/dev/vg00"; /etc/lvmtab has 4 disks.
Cannot proceed with backup.

Ok, force the replacement disk in...

# pvcreate -f -B /dev/rdsk/c1t0d0
Physical volume "/dev/rdsk/c1t0d0" has been successfully created.

# vgextend vg00 /dev/dsk/c1t0d0
Volume group "vg00" has been successfully extended.
vgcfgbackup: /etc/lvmtab is out of date with the running kernel:Kernel indicates
6 disks for "/dev/vg00"; /etc/lvmtab has 5 disks.
Cannot proceed with backup.

Madness, thy name is system administration
RAC_1
Honored Contributor

Re: Taming of the Shrew (Ghost Disk)

vgdisplay -v vg00

Check pvs in vg and active pvs.
Any mismatch there???

Anil
There is no substitute to HARDWORK
Ralph Grothe
Honored Contributor

Re: Taming of the Shrew (Ghost Disk)

Though vg00 now contains the replacement disk again, I still fail to write the old LIF header on it.

# strings /etc/lvmtab|head
/dev/vg00
U8)A
/dev/dsk/c0t5d0
/dev/dsk/c0t8d0
/dev/dsk/c0t9d0
/dev/dsk/c2t0d0
/dev/dsk/c1t0d0
/dev/vg01
/dev/dsk/c1t1d0
/dev/dsk/c2t1d0

# vgdisplay -v vg00|grep PV\ Name
PV Name /dev/dsk/c0t5d0
PV Name /dev/dsk/c0t8d0
PV Name /dev/dsk/c0t9d0
PV Name /dev/dsk/c2t0d0
PV Name /dev/dsk/c1t0d0

# vgcfgrestore -n vg00 /dev/rdsk/c1t0d0
vgcfgrestore: Mismatch between the backup file and the running kernel:
Kernel indicates 6 disks for "/dev/vg00"; /etc/lvmconf/vg00.conf indicates 5 disks.
Cannot proceed with the restoration. Deactivate the Volume Group and try again.
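For a non-root VG one could follow that last hint literally, roughly like this (a sketch with placeholder names; vg00 of course cannot be deactivated while the system is running from it, which is why the maintenance-mode detour further down becomes necessary):

# vgchange -a n vgNN
# vgcfgrestore -n vgNN /dev/rdsk/cXtYdZ
# vgchange -a y vgNN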
Madness, thy name is system administration
Ralph Grothe
Honored Contributor

Re: Taming of the Shrew (Ghost Disk)

Yes RAC, the mismatch is still prevalent:

# vgdisplay vg00|grep -e Cur\ PV -e Act\ PV
Cur PV 6
Act PV 5
Madness, thy name is system administration
Ralph Grothe
Honored Contributor

Re: Taming of the Shrew (Ghost Disk)

Strange mess now: lvmtab and vgdisplay are in sync

# strings /etc/lvmtab|sed -n /vg00/,/vg01/p
/dev/vg00
U8)A
/dev/dsk/c0t5d0
/dev/dsk/c0t8d0
/dev/dsk/c0t9d0
/dev/dsk/c2t0d0
/dev/dsk/c1t0d0
/dev/vg01

# vgdisplay -v vg00|grep PV\ Name
PV Name /dev/dsk/c0t5d0
PV Name /dev/dsk/c0t8d0
PV Name /dev/dsk/c0t9d0
PV Name /dev/dsk/c2t0d0
PV Name /dev/dsk/c1t0d0


How can I force the kernel structures to acknowledge the new reality?
Madness, thy name is system administration
Ralph Grothe
Honored Contributor

Re: Taming of the Shrew (Ghost Disk)

Since I kicked off users anyway,
would a reboot resolve this mismatch?
Madness, thy name is system administration
RAC_1
Honored Contributor

Re: Taming of the Shrew (Ghost Disk)

Seems that it is reboot time now. As said earlier, this should have been taken care of by lvreduce (for the faulty disk) followed by vgreduce.

Are you up to date on LVM patches??

Reboot

Anil
There is no substitute to HARDWORK
Ralph Grothe
Honored Contributor

Re: Taming of the Shrew (Ghost Disk)

Hi RAC,

I'm close to despairing over this ghost disk issue.
The reboot didn't help either.
Yesterday I brought the system into LVM maintenance mode (i.e. issuing "hpux -lm" at IPL), where no VGs are activated and only a mini root is available.
There I successfully issued

/sbin/vgcfgrestore -n vg00 /dev/rdsk/c1t0d0

It returned with no errors.
However, now that the system is back in runlevel 3, although I no longer get those error messages from a simple vgdisplay of vg00, I am still confronted with the discrepancy between current and active PVs.

$ who -b
. system boot Aug 3 16:24

$ /usr/sbin/vgdisplay vg00|grep -E '^(Cur|Act) PV'
Cur PV 6
Act PV 5

And a vgcfgbackup, and thus any persistent change to the VG, is still impossible.


# vgcfgbackup vg00
vgcfgbackup: /etc/lvmtab is out of date with the running kernel:Kernel indicates
6 disks for "/dev/vg00"; /etc/lvmtab has 5 disks.
Cannot proceed with backup.

Madness, thy name is system administration
Ralph Grothe
Honored Contributor

Re: Taming of the Shrew (Ghost Disk)

Oh, as for patches,
this D-Class system is soon to be replaced by an rp5430 (the hardware has already been delivered), and therefore I wouldn't like to put too much effort into patching this particular box.

Still, this ghost disk problem does bother me, as I fear something like it could well happen on any other of our HP-UX boxes, while I still haven't discovered a working hack to cope with it.
Isn't there an HP-affiliated LVM guru around who could come up with a solution?


Some LVM patches on this D370:

# swlist -l product|grep -i lvm
LVM B.11.00 LVM
PHCO_19479 1.0 LVM commands cumulative patch
PHCO_20870 1.0 LVM commands cumulative patch
PHKL_20333 1.0 LVM Cumulative patch
PHKL_20419 1.0 LVM Cumulative patch
Madness, thy name is system administration
Dietmar Konermann
Honored Contributor
Solution

Re: Taming of the Shrew (Ghost Disk)

Ralph,

in general you need to distinguish two situations.

1) lvmtab contains PVs that are not part of the VG (as recorded in the on-disk LVM structures). Then you need to get rid of the lvmtab entries, e.g. by running vgscan after moving the old lvmtab out of the way.

2) The VG (as recorded in the on-disk LVM structures) contains PVs that are not listed in lvmtab. If you don't need data from the missing PV, then evacuate it (from the LVM perspective you already did so by using lvreduce with PV keys) and then run vgreduce -f VG (see the sketch below).

So, you should be able to finally solve your problem with vgreduce -f VG now.
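A rough sketch of both cases (vg00 used as the example; the lvmtab backup name is arbitrary):

Case 1, stale lvmtab entry:
# mv /etc/lvmtab /etc/lvmtab.old
# vgscan -v

Case 2, stale PV in the on-disk VG structures:
# vgreduce -f vg00
# mv /etc/lvmtab /etc/lvmtab.old     # then follow vgreduce's own instructions:
# vgscan -v
# vgcfgbackup vg00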

Best regards...
Dietmar.
"Logic is the beginning of wisdom; not the end." -- Spock (Star Trek VI: The Undiscovered Country)
Ralph Grothe
Honored Contributor

Re: Taming of the Shrew (Ghost Disk)

Did I really forget the vgreduce -f ? %-{


# vgreduce -f vg00
PV with key 4 sucessfully deleted from vg vg00
Repair done, please do the following steps.....:
1. save /etc/lvmtab to another file
2. remove /etc/lvmtab
3. use vgscan(1m) -v to re-create /etc/lvmtab
4. NOW use vgcfgbackup(1m) to save the LVM setup

# mv /etc/lvmtab /etc/lvmtab.$(date +%Y%m%d%H%M)

# vgscan -v
Creating "/etc/lvmtab".
vgscan: Couldn't access the list of physical volumes for volume group "/dev/vg00".
vgscan: Couldn't access the list of physical volumes for volume group "/dev/vg01".
vgscan: Couldn't access the list of physical volumes for volume group "/dev/vg02".
vgscan: Couldn't access the list of physical volumes for volume group "/dev/vg03".
vgscan: Couldn't access the list of physical volumes for volume group "/dev/vg04".
vgscan: Couldn't access the list of physical volumes for volume group "/dev/vg05".
vgscan: Couldn't access the list of physical volumes for volume group "/dev/vg06".
Couldn't stat physical volume "/dev/dsk/c3t2d0":
Invalid argument

/dev/vg00
/dev/dsk/c0t5d0
/dev/dsk/c0t8d0
/dev/dsk/c0t9d0
/dev/dsk/c1t0d0
/dev/dsk/c2t0d0

/dev/vg01
/dev/dsk/c1t1d0
/dev/dsk/c2t1d0

/dev/vg02
/dev/dsk/c1t2d0
/dev/dsk/c2t2d0

/dev/vg03
/dev/dsk/c1t3d0
/dev/dsk/c2t3d0



/dev/vg04
/dev/dsk/c1t4d0
/dev/dsk/c2t4d0

/dev/vg05
/dev/dsk/c1t5d0
/dev/dsk/c2t5d0

/dev/vg06
/dev/dsk/c1t6d0
/dev/dsk/c2t6d0

Following Physical Volumes belong to one Volume Group.
Unable to match these Physical Volumes to a Volume Group.
Use the vgimport command to complete the process.
/dev/dsk/c1t15d0
/dev/dsk/c2t15d0


# vgcfgbackup vg00
Volume Group configuration for /dev/vg00 has been saved in /etc/lvmconf/vg00.conf

# /usr/sbin/vgdisplay vg00|grep -E '^(Cur|Act) PV'
Cur PV 5
Act PV 5

Madness, thy name is system administration