Missing Volume Group after reboot

 
David Cramblett
Occasional Advisor

Missing Volume Group after reboot

I rebooted our 11.11 system today and when it came back online, one of the volume groups was missing (vg05).

 

This disk is a RAID 0/1 LUN on an FC60 disk array. amdsp shows that all LUNs on the array are in good order.

 

I tried to activate the volume group.  I ran:

 

# vgchange -a y vg05

 

vgchange: Warning: Couldn't attach to the volume group physical volume "/dev/dsk/c5t0d6":

I/O error
vgchange: Warning: couldn't query physical volume "/dev/dsk/c5t0d6":
The specified path does not correspond to physical volume attached to
this volume group
vgchange: Warning: couldn't query all of the physical volumes.
vgchange: Couldn't activate volume group "vg05":
Quorum not present, or some physical volume(s) are missing.

 

 

I then ran an ioscan to make sure the disk was available:

 

# ioscan -fnC disk

 

Class   I  H/W Path             Driver  S/W State  H/W Type  Description
===============================================================================
disk    0  0/0/2/0.6.0          sdisk   CLAIMED    DEVICE    SEAGATE ST336704LC
           /dev/dsk/c1t6d0  /dev/rdsk/c1t6d0
disk    1  0/0/2/1.6.0          sdisk   CLAIMED    DEVICE    SEAGATE ST336704LC
           /dev/dsk/c2t6d0  /dev/rdsk/c2t6d0
disk    9  0/2/0/0.8.0.4.0.0.0  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c5t0d0  /dev/rdsk/c5t0d0
disk   11  0/2/0/0.8.0.4.0.0.1  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c5t0d1  /dev/rdsk/c5t0d1
disk   26  0/2/0/0.8.0.4.0.0.2  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c5t0d2  /dev/rdsk/c5t0d2
disk   13  0/2/0/0.8.0.4.0.0.3  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c5t0d3  /dev/rdsk/c5t0d3
disk   18  0/2/0/0.8.0.4.0.0.4  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c5t0d4  /dev/rdsk/c5t0d4
disk   28  0/2/0/0.8.0.4.0.0.5  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c5t0d5  /dev/rdsk/c5t0d5
disk   30  0/2/0/0.8.0.4.0.0.6  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c5t0d6  /dev/rdsk/c5t0d6
disk   32  0/2/0/0.8.0.4.0.0.7  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c5t0d7  /dev/rdsk/c5t0d7
disk   19  0/2/0/0.8.0.4.0.1.0  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c5t1d0  /dev/rdsk/c5t1d0
disk   21  0/2/0/0.8.0.4.0.2.0  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c5t2d0  /dev/rdsk/c5t2d0
disk   22  0/2/0/0.8.0.4.0.3.0  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c5t3d0  /dev/rdsk/c5t3d0
disk   10  1/0/0/0.8.0.5.0.0.0  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c6t0d0  /dev/rdsk/c6t0d0
disk   12  1/0/0/0.8.0.5.0.0.1  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c6t0d1  /dev/rdsk/c6t0d1
disk   25  1/0/0/0.8.0.5.0.0.2  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c6t0d2  /dev/rdsk/c6t0d2
disk   14  1/0/0/0.8.0.5.0.0.3  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c6t0d3  /dev/rdsk/c6t0d3
disk   15  1/0/0/0.8.0.5.0.0.4  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c6t0d4  /dev/rdsk/c6t0d4
disk   27  1/0/0/0.8.0.5.0.0.5  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c6t0d5  /dev/rdsk/c6t0d5
disk   29  1/0/0/0.8.0.5.0.0.6  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c6t0d6  /dev/rdsk/c6t0d6
disk   31  1/0/0/0.8.0.5.0.0.7  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c6t0d7  /dev/rdsk/c6t0d7
disk   16  1/0/0/0.8.0.5.0.1.0  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c6t1d0  /dev/rdsk/c6t1d0
disk   17  1/0/0/0.8.0.5.0.2.0  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c6t2d0  /dev/rdsk/c6t2d0
disk   20  1/0/0/0.8.0.5.0.3.0  sdisk   CLAIMED    DEVICE    HP A5277A
           /dev/dsk/c6t3d0  /dev/rdsk/c6t3d0

The disk appeared to be online. Next I ran strings on the lvmtab to list the path for vg05:

 

# strings /etc/lvmtab

/dev/vg05
/dev/dsk/c5t0d0
/dev/dsk/c5t0d6
/dev/vg00
/dev/dsk/c1t6d0
/dev/vg10
/dev/dsk/c2t6d0
/dev/vg01
/dev/dsk/c5t0d1
/dev/vg07
/dev/dsk/c5t0d2
/dev/vg03
/dev/dsk/c5t0d3
/dev/vg02
/dev/dsk/c5t0d4
/dev/vg06
/dev/dsk/c5t0d5
/dev/vg12
/dev/dsk/c5t0d7

 

 

I noticed that lvmtab shows two paths for vg05. This disk is a RAID 0/1 array on an FC60 and there should be only one path. SAM says the hardware path should be /0/2/0/0.8.0.4.0.0.6 and lists it as "Unused" rather than "LVM" as I would expect. I suppose the other path could be the second controller on the array, although none of the other VGs show two paths in lvmtab.
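
To see at a glance which PVs lvmtab associates with each VG, the flat strings listing can be grouped per volume group. This is only a minimal sketch: group_lvmtab is a hypothetical helper (not an LVM tool), and it assumes that every line starting with /dev/dsk/ is a PV and that any other /dev/ line is a VG name.

```shell
# Hypothetical helper: group "strings /etc/lvmtab" output by volume group.
# Assumption: PV lines start with /dev/dsk/, every other /dev/ line names a VG.
group_lvmtab() {
    awk '
        /^\/dev\/dsk\// { pvs[vg] = pvs[vg] " " $0; next }   # PV of the current VG
        /^\/dev\//      { vg = $0; order[++n] = vg; next }   # a new VG starts here
        END {
            # Print the VGs in the order they appeared in the listing
            for (i = 1; i <= n; i++) printf "%s:%s\n", order[i], pvs[order[i]]
        }
    '
}
```

Run as "strings /etc/lvmtab | group_lvmtab". With the listing in this post, vg05 would be the only line carrying two device files.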

 

 

I also ran diskinfo for both paths listed in the lvmtab:

 

# diskinfo /dev/rdsk/c5t0d0


SCSI describe of /dev/rdsk/c5t0d0:
vendor: HP
product id: A5277A
type: direct access
size: 71004510 Kbytes

bytes per sector: 512


# diskinfo /dev/rdsk/c5t0d6


SCSI describe of /dev/rdsk/c5t0d6:
vendor: HP
product id: A5277A
type: direct access
size: 0 Kbytes
bytes per sector: 0
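
The size of 0 Kbytes reported for c5t0d6 is the suspicious part of this output. A small sketch for pulling that field out of diskinfo output, so that several paths can be compared quickly; diskinfo_size_kb is a hypothetical helper name, and it assumes the "size: N Kbytes" line format shown in this post.

```shell
# Hypothetical helper: print the size in Kbytes from diskinfo output on stdin.
# Assumption: the output contains a line of the form "size: N Kbytes".
diskinfo_size_kb() {
    awk '$1 == "size:" { print $2; exit }'
}
```

Run as "diskinfo /dev/rdsk/c5t0d6 | diskinfo_size_kb". A result of 0 for a device that ioscan still reports as CLAIMED suggests that the path answers inquiries but does not return usable capacity data.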

 

I am wondering if I can back up the lvmtab and then scan to recreate it?

 

Thanks,

 

David

 

 

 

4 REPLIES
DeafFrog
Valued Contributor

Re: Missing Volume Group after reboot

Hi,

You can back up the existing lvmtab and run vgscan to rebuild it.
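
A minimal sketch of the backup step, assuming a timestamped copy next to the original is acceptable; backup_file is a hypothetical helper name and nothing in it is specific to LVM.

```shell
# Hypothetical helper: copy a file to <name>.<timestamp>.bak and print the new path.
backup_file() {
    src=$1
    dest="$src.$(date +%Y%m%d%H%M%S).bak"
    # -p preserves mode and timestamps; the path is printed only if the copy worked
    cp -p "$src" "$dest" && printf '%s\n' "$dest"
}
```

Run as "backup_file /etc/lvmtab". Before rebuilding anything, "vgscan -p -v" should preview what vgscan would do without updating lvmtab; check vgscan(1M) on your release to confirm.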
FrogIsDeaf
Matti_Kurkela
Honored Contributor
Solution

Re: Missing Volume Group after reboot

A disk may fail in more than one way. Yesterday we replaced a modern SAS disk which was still CLAIMED, yet produced errors when attempting to read it.

Another reason might be that the disk is OK but someone has accidentally overwritten the data on it (for example, by running pvcreate -f on the disk).

 

The PVs of your vg05 are c5t0d0 and c5t0d6. The hardware paths indicate that both PVs are on the controller with loop ID 4 and on the same virtual SCSI bus (bus 0). A path through the second controller would not look like this.
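
As a sketch of how these hardware paths can be read: in the ioscan listing in this thread, the last three dot-separated fields of each path line up with the tY and dZ parts of the device file name (for example, 0/2/0/0.8.0.4.0.1.0 is c5t1d0). The helper below splits them out. hw_path_btl is a hypothetical name, and interpreting the fields as bus, target and LUN is an assumption based on that listing, not on array documentation.

```shell
# Hypothetical helper: show the last three fields of an HP-UX hardware path.
# Assumption: they are the virtual bus, target and LUN, as the ioscan listing suggests.
hw_path_btl() {
    printf '%s\n' "$1" |
        awk -F. '{ printf "bus=%s target=%s lun=%s\n", $(NF-2), $(NF-1), $NF }'
}
```

For example, "hw_path_btl 0/2/0/0.8.0.4.0.0.6" prints "bus=0 target=0 lun=6", which matches c5t0d6.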

 

It is more likely that your vg05 was originally set up on c5t0d0 and then mirrored or extended to c5t0d6.

If your VG is mirrored, you might be able to activate it by disabling the quorum requirement:

vgchange -a y -q n vg05

This allows the VG to be activated even if one half of the mirror is lost. If the VG is not mirrored, this command will not succeed.
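
The arithmetic behind the "Quorum not present" message can be sketched as follows. It assumes the usual HP-UX LVM rule that activation needs more than half of the VG's PVs to be present (see vgchange(1M) for the authoritative wording); has_quorum is a hypothetical helper.

```shell
# Hypothetical helper: succeed if PRESENT out of TOTAL PVs is more than half.
# Assumption: activation quorum means "more than half of the PVs are available".
has_quorum() {
    present=$1
    total=$2
    [ $((present * 2)) -gt "$total" ]
}
```

With two PVs in vg05 and only one readable, "has_quorum 1 2" fails, which is consistent with the activation error in the original post.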

 

The fact that SAM says hardware path /0/2/0/0.8.0.4.0.0.6 is "Unused" might be because SAM is not seeing a valid LVM header on that disk. This might be caused by disk failure or data corruption.

 

In this situation, I would NOT just assume that lvmtab is in error.

 

First, I would recommend testing that the c5t0d6 disk is actually readable:

dd if=/dev/rdsk/c5t0d6 of=/dev/null bs=1024k

This command will attempt to read through the entire disk. If it completes without error messages (only reporting how many blocks it read and wrote to /dev/null), the disk is probably mechanically and electronically OK, and the problem might be that the LVM header on the disk has been corrupted.

 

But if the dd command outputs an error message, then the disk might be faulty, although it is still trying to respond to commands (and that's why it still shows as CLAIMED).
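
The read test can be wrapped so that the outcome is reported from dd's exit status. This is a minimal sketch and read_test is a hypothetical helper name. dd's own messages are left visible on purpose, because the error text is what you would want to see.

```shell
# Hypothetical helper: read a whole device (or file) and report success or failure.
read_test() {
    dev=$1
    # dd exits non-zero if the device cannot be opened or a read fails
    if dd if="$dev" of=/dev/null bs=1024k; then
        echo "$dev: read completed without errors"
    else
        echo "$dev: read FAILED" >&2
        return 1
    fi
}
```

Run as "read_test /dev/rdsk/c5t0d6". On a large LUN this can take a long time, since the whole device is read.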

 

If the dd test goes through without errors, you might want to try and restore the LVM configuration data to the disk:

vgcfgrestore -n vg05 /dev/rdsk/c5t0d6

After that, you can try to activate the VG again with "vgchange -a y vg05". But if more than just the LVM header has been overwritten or corrupted, the filesystem on the disk might be damaged too.

 

 

MK
David Cramblett
Occasional Advisor

Re: Missing Volume Group after reboot

Thanks for your response, it's much appreciated.

 

I did some more digging and realized that vg05 is made up of two concatenated RAID 0/1 LUNs on the array (LUNs 0 and 6, hence the two disk paths).

 

I started down the path of testing the disk read capability as you suggested, but found this morning that I could not even communicate with array controller A.

 

Before heading down the long road of diagnosing exactly which component between the server and the disk controller had failed, I figured I would try another reboot, along with a disk system reset this time.

 

The reset and restart fixed the problem and the system is back online!

 

Thanks again for taking the time to respond.

Matti_Kurkela
Honored Contributor

Re: Missing Volume Group after reboot

In that case, you might want to check the array controller firmware versions and see if there are any updates available.

If there are updates, reading the release notes for all firmware versions newer than what you currently have might be enlightening.

MK