vgchange activation error
05-19-2011 07:22 AM
We have a db02 server that can see and use the same SAN DASD, and I can activate the VG and mount the file system there ok...
The issue started when db01 crashed and rebooted; that must have set some flag somewhere that is now preventing the VG from starting up again...
I've tried doing a vgexport on db02 and a vgimport on db01, but I still get the same error...
Any ideas??
[root@db01 ~]# /sbin/vgchange -a y datavg1
Found duplicate PV tbU5yWceVhgPgS6RIvj0M2TxChiLp61b: using /dev/sdr2 not /dev/sdb2
Not activating datavg1/lvol01 since it does not pass activation filter.
0 logical volume(s) in volume group "datavg1" now active
[root@db01 ~]#
[root@db01 ~]# lvdisplay -v /dev/datavg1/lvol01
Using logical volume(s) on command line
Found duplicate PV tbU5yWceVhgPgS6RIvj0M2TxChiLp61b: using /dev/sdr2 not /dev/sdb2
--- Logical volume ---
LV Name /dev/datavg1/lvol01
VG Name datavg1
LV UUID e2DFlG-CweU-zVsV-wzfs-oDwY-IpUF-JKgHKt
LV Write Access read/write
LV Status NOT available
LV Size 1000.00 GB
Current LE 256000
Segments 4
Allocation inherit
Read ahead sectors auto
[root@awopdb01 ~]#
[root@db01 ~]# vgimport datavg1
Found duplicate PV tbU5yWceVhgPgS6RIvj0M2TxChiLp61b: using /dev/sdr2 not /dev/sdb2
Volume group "datavg1" successfully imported
[root@db01 ~]# /sbin/vgchange -a y datavg1
Found duplicate PV tbU5yWceVhgPgS6RIvj0M2TxChiLp61b: using /dev/sdr2 not /dev/sdb2
Not activating datavg1/lvol01 since it does not pass activation filter.
0 logical volume(s) in volume group "datavg1" now active
[root@db01 ~]#
But when I activate it on the second server it works ok:
[root@db02 ~]# /sbin/vgchange -a y datavg1
Found duplicate PV tbU5yWceVhgPgS6RIvj0M2TxChiLp61b: using /dev/sdr2 not /dev/sdb2
1 logical volume(s) in volume group "datavg1" now active
[root@db02 ~]#
[root@db02 ~]# ls -al /dev/datavg1
total 0
drwxr-xr-x 2 root root 60 May 18 19:45 .
drwxr-xr-x 17 root root 7280 May 18 19:45 ..
lrwxrwxrwx 1 root root 26 May 18 19:45 lvol01 -> /dev/mapper/datavg1-lvol01
[root@db02 ~]#
05-19-2011 11:33 PM
Does this system have any kind of cluster suite installed? (Serviceguard? RedHat Cluster Suite? DB2 cluster? Something else?)
What is the output of these commands:
grep -e filter -e volume_list /etc/lvm/lvm.conf
vgs -o +tags
lvs
pvs
If /etc/lvm/lvm.conf contains an uncommented filter expression that is different from the default value:
filter = [ "a/.*/" ]
... or an uncommented "volume_list" definition, then it's probably been added there for a reason: don't change it until you understand why the current value is there.
The activation filter and/or VG tags are often used as part of a cluster interlock mechanism that stops a cluster node from activating a VG that is in use by another cluster node. (If the particular VG is supposed to be accessed by more than one node simultaneously, then the lockout is designed to prevent *uncoordinated* access: cluster nodes must be able to communicate with each other to be aware of what the other nodes are doing. The nodes must coordinate their actions so that one node does not accidentally use a stale cached copy of some record when another node has just updated it.)
If this is what is stopping you from activating the VG, it probably means that some sort of cluster infrastructure process did not automatically start up when db01 was rebooted; when you find it and start it, it might automatically fix this problem for you.
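A quick way to see which node currently claims a VG is to pull the tag column out of `vgs -o +tags`. The sketch below runs against canned output (taken from the transcripts in this thread) so the parsing can be shown end to end; on a live node you would pipe the real `vgs` output in instead.

```shell
#!/bin/sh
# Hedged sketch: find which cluster node currently "owns" an HA LVM VG
# by reading its tag. The vgs output below is canned from this thread;
# on a live system you would run:  vgs -o +tags --noheadings
vgs_output='  VG         #PV #LV #SN Attr   VSize    VFree  VG Tags
  VolGroup00   2   6   0 wz--n-  680.34G 80.25G
  datavg1      4   1   0 wz--n- 1000.00G      0 awopdb02'

# Column 8 of the datavg1 row is the tag (empty if the VG is untagged).
owner=$(printf '%s\n' "$vgs_output" | awk '$1 == "datavg1" { print $8 }')

if [ -z "$owner" ]; then
    echo "datavg1 has no tag: no node currently claims it"
else
    echo "datavg1 is tagged for node: $owner"
fi
```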
MK
05-20-2011 03:35 AM
Re: vgchange activation error
The problem started when db01 failed after it lost contact with the quorum disk...
I believe the cluster tried to start on db02 but failed with the same quorum disk issue; losing connectivity must have set some kind of lock or tag that is preventing the VG from activating on db01..
I was able to mount it manually on db02 once the SAN group resolved the issue with the quorum disk..
Just need to figure out what is preventing it from starting up on db01 so we can get the cluster back up and going again...
[root@awopdb01 ~]# grep -e filter -e volume_list /etc/lvm/lvm.conf
# A filter that tells LVM2 to only use a restricted set of devices.
# The filter consists of an array of regular expressions. These
# Don't have more than one filter line active at once: only one gets used.
#filter = [ "a/.*/" ]
#filter = [ "r|/dev/sdr/|", "r|/dev/sdi/|" ]
#filter = [ "a|/dev/sda.*|", "a|/dev/mpath/.*|", "r/.*/" ]
# filter = [ "r|/dev/cdrom|" ]
# filter = [ "a/loop/", "r/.*/" ]
# filter =[ "a|loop|", "r|/dev/hdc|", "a|/dev/ide|", "r|.*|" ]
# filter = [ "a|^/dev/hda8$|", "r/.*/" ]
# The results of the filtering are cached on disk to avoid
# If volume_list is defined, each LV is only activated if there is a
# volume_list = [ "vg1", "vg2/lvol1", "@tag1", "@*" ]
volume_list = [ "VolGroup00", "@awopdb01" ]
[root@awopdb01 ~]# vgs -o +tags
Found duplicate PV tbU5yWceVhgPgS6RIvj0M2TxChiLp61b: using /dev/sdr2 not /dev/sdb2
VG #PV #LV #SN Attr VSize VFree VG Tags
VolGroup00 2 6 0 wz--n- 680.34G 80.25G
datavg1 4 1 0 wz--n- 1000.00G 0 awopdb02
[root@awopdb01 ~]# lvs
Found duplicate PV tbU5yWceVhgPgS6RIvj0M2TxChiLp61b: using /dev/sdr2 not /dev/sdb2
LV VG Attr LSize Origin Snap% Move Log Copy% Convert
LogVol00 VolGroup00 -wi-ao 100.00G
LogVol01 VolGroup00 -wi-ao 192.00G
LogVol02 VolGroup00 -wi-ao 192.00G
lvol1 VolGroup00 -wi-ao 46.09G
lvol2 VolGroup00 -wi-ao 50.00G
lvol3 VolGroup00 -wi-ao 20.00G
lvol01 datavg1 -wi--- 1000.00G
[root@awopdb01 ~]# pvs
Found duplicate PV tbU5yWceVhgPgS6RIvj0M2TxChiLp61b: using /dev/sdr2 not /dev/sdb2
PV VG Fmt Attr PSize PFree
/dev/mpath/mpath11 datavg1 lvm2 a- 250.00G 0
/dev/mpath/mpath12 datavg1 lvm2 a- 250.00G 0
/dev/mpath/mpath13 datavg1 lvm2 a- 250.00G 0
/dev/mpath/mpath14 datavg1 lvm2 a- 250.00G 0
/dev/mpath/mpath1p2 lvm2 a- 267.75G 267.75G
/dev/sda2 VolGroup00 lvm2 a- 408.09G 0
/dev/sda3 VolGroup00 lvm2 a- 272.25G 80.25G
[root@awopdb01 ~]#
05-20-2011 06:30 AM
Re: vgchange activation error
In this configuration, the cluster VG can only be active on one of the cluster nodes at a time, or on none at all: never on two or more nodes simultaneously. The activation of the cluster VG is controlled by the cluster suite.
Which version of RHEL? The RedHat Cluster Suite has changed a lot between versions.
The instructions below assume RHEL 5, but should be mostly compatible with RHEL 4 or 6 too. The HA LVM mode is available on RHEL 4.5 and newer.
Your datavg1 volume group currently has a tag "awopdb02" on it - meaning the VG is currently in use on awopdb02 (or the cluster suite had it active there when the system crashed).
It is a cluster volume group, so *you should not activate it* manually in a normal situation - the cluster suite will activate it if (and only if) appropriate checks are successful. *It is not an error* that you cannot activate the VG - that is the cluster safety system doing its job.
If both nodes failed to reach the quorum disk, that means both nodes should have noticed they've lost quorum and rebooted - is this what happened? That's what a cluster *should* have done in that situation.
First, you should run "clustat" and "cman_tool status" on both nodes.
- Are both nodes "online" in the clustat listing? (if not, the node that is not "online" should not activate datavg1 unless the cluster daemons have been completely stopped on both nodes AND the sysadmin has verified the other node does not have it active.)
- Does "clustat" say "Member Status: Quorate" on both nodes? (If not, the node that is not quorate should not activate datavg1...[see above])
- What's the state of the cluster services in the clustat listing? (If the nodes are online but the services are stopped, then datavg1 should not be activated anywhere.)
- In the "cman_tool status" listings, are the values of "Config Version", "Cluster Id" and "Cluster Generation" the same in both nodes?
(If not, and both nodes are online as per clustat, then *you're in a split-brain situation*: both nodes are thinking "I'm OK, the other node is not.")
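The clustat checks above can be scripted. The `clustat` output below is invented for illustration (this thread never shows an actual listing), following the usual RHEL 5 layout; verify the field positions against your own system before relying on it.

```shell
#!/bin/sh
# Hedged sketch: parse a clustat listing for the two key health facts,
# quorum state and online member count. The sample output is invented
# for illustration; field layout follows RHEL 5 clustat conventions.
clustat_output='Cluster Status for dbcluster @ Fri May 20 06:00:00 2011
Member Status: Quorate

 Member Name       ID   Status
 ------ ----       ---- ------
 db01               1   Online, Local, rgmanager
 db02               2   Online, rgmanager'

# Quorum state comes from the "Member Status:" line.
quorate=$(printf '%s\n' "$clustat_output" | awk -F': ' '/^Member Status/ { print $2 }')

# Count member rows reported as Online.
online=$(printf '%s\n' "$clustat_output" | grep -c 'Online')

echo "quorum: $quorate, online members: $online"
```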
My recommendation:
1.) Undo all your manual activation steps on awopdb02. If you started the database manually, stop it. If you mounted the disks manually, unmount them. Deactivate the VG.
2.) If the cluster services are not running on one or both nodes, start them: qdiskd, cman and rgmanager. If HA LVM-style configuration is used, you shouldn't need clvmd; but starting it too won't hurt anything. The fact that datavg1 is not activated should not prevent starting the cluster daemons.
3.) Make sure both cluster nodes are quorate and communicating with each other (see the "clustat" and "cman_tool status" checks above).
4.) If your database service is configured to start up automatically, rgmanager should start it; if not, use the "clusvcadm -e <service name>" command to start it manually.
If your cluster is properly configured, this should take care of the VG activation and all the necessary application start-up actions.
If you really need to override the cluster suite's control on VG activation, you should understand how the HA LVM configuration works, and then read the vgchange(8) man page, paying attention to the --addtag and --deltag options.
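The safety rule behind the tag mechanism boils down to a single comparison: a node may activate the VG only if the VG is untagged, or tagged with that node's own name. A minimal sketch of that decision (the function name is illustrative, hostnames are the ones used in this thread):

```shell
#!/bin/sh
# Hedged sketch of the check behind HA LVM tag-based activation:
# a VG may be taken over only if it carries no tag, or its tag
# already matches this host. "may_activate" is an illustrative name.
may_activate() {
    vg_tag=$1      # current tag on the VG ("" if none)
    this_host=$2   # node that wants to activate the VG
    [ -z "$vg_tag" ] || [ "$vg_tag" = "$this_host" ]
}

# db01 asking about a VG tagged for db02: activation is refused.
may_activate "db02" "db01" && echo "safe" || echo "blocked: tag belongs to another node"
# db01 asking about an untagged VG: activation is allowed.
may_activate ""     "db01" && echo "safe" || echo "blocked"
```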
MK
05-23-2011 08:36 AM
Re: vgchange activation error
# A filter that tells LVM2 to only use a restricted set of devices.
# The filter consists of an array of regular expressions. These
# Don't have more than one filter line active at once: only one gets used.
filter = [ "a/.*/" ]
# filter = [ "r|/dev/cdrom|" ]
# filter = [ "a/loop/", "r/.*/" ]
# filter =[ "a|loop|", "r|/dev/hdc|", "a|/dev/ide|", "r|.*|" ]
# filter = [ "a|^/dev/hda8$|", "r/.*/" ]
# The results of the filtering are cached on disk to avoid
#
05-23-2011 12:13 PM
Re: vgchange activation error
I changed the volume_list in /etc/lvm/lvm.conf from:
volume_list = [ "VolGroup00", "@db01" ]
to:
volume_list = [ "VolGroup00", "@db01", "datavg1/lvol1" ]
05-24-2011 01:05 AM
Re: vgchange activation error
You've now effectively disabled the HA LVM protection: datavg1 can now be activated on this node even if it has a tag that indicates it may currently be active on another node.
If db02 is currently running the service and db01 is rebooted, this change allows db01 to activate datavg1 at boot time and perhaps perform an automatic filesystem check on datavg1/lvol1... while the filesystem is active on db02. This will *certainly* cause filesystem corruption, because db01's fsck will see db02's ongoing operations as "corruption" and will attempt to fix it.
At that point, db02 will see problems like "WTF??? I just changed this directory entry from X to Y, but now it's back at X again?" This will typically cause the filesystem to become read-only at db02.
Let me emphasise: in an HA LVM configuration, the shared VGs *must not* be activated before the cluster services are started and communicating with the other node(s). The shared VGs *must not* be activated, filesystem-checked, or mounted by the regular start-up procedure: they must be controlled entirely by the cluster mechanisms.
If the shared filesystem is mentioned in /etc/fstab at all (you could omit it completely), it *must* have mount option "noauto" and the filesystem check pass number at the 6th column of fstab set to 0. Otherwise your system will fail to boot if the HA LVM locking mechanism works, or may corrupt your shared filesystem if the locking mechanism fails.
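A line like the sample below (invented for illustration) can be checked mechanically for the two conditions just mentioned: the `noauto` option in column 4 and a fsck pass of 0 in column 6.

```shell
#!/bin/sh
# Hedged sketch: check an fstab line for a shared LV. To be safe in an
# HA LVM setup, the options (column 4) must include "noauto" and the
# fsck pass number (column 6) must be 0. The sample line is invented;
# on a real node you would read /etc/fstab instead.
fstab_line='/dev/mapper/datavg1-lvol01 /data ext3 defaults 1 2'

verdict=$(printf '%s\n' "$fstab_line" | awk '{
    bad = 0
    if ($4 !~ /noauto/) bad = 1   # would be mounted automatically at boot
    if ($6 != 0)        bad = 1   # fsck would run on it at boot
    print (bad ? "UNSAFE" : "ok")
}')
echo "$verdict: $fstab_line"
```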
If your cluster configuration requires that the shared VG is activated on one or the other node before the cluster daemons are started, then your cluster configuration is misdesigned.
The correct procedure for manually activating an HA LVM-configured shared VG is as follows:
(Note: this procedure is for emergency/maintenance use only. In normal use, the cluster should handle all this automatically - if it doesn't, your cluster may not be able to perform an automatic failover in a real failure situation.)
1.) Use "vgs -o +tags" to see if the VG currently has a tag on it.
2.) If the VG has no tag, or a tag that matches the name of the host you wish to activate the VG on, you can go directly to step 7.
3.) If the VG has a tag that matches the hostname of another node, *you must* first make sure that node does not have the VG currently activated.
4.) When you're sure the VG is not currently active on any node, you can use "vgchange --deltag" to remove the VG tag of the other node:
vgchange --deltag db02 datavg1
5.) At this point, say to yourself: "I am definitely certain this VG is not active on any cluster node, and I understand I will be held responsible for any damage to data if this is not true." You're telling the cluster that you know better than it does here.
6.) Then add a new tag that matches the hostname of the node you wish to activate the VG in:
vgchange --addtag db01 datavg1
7.) Activate the VG as normal.
vgchange -a y datavg1
8.) If applicable, run a filesystem check on the LV(s):
fsck -C0 /dev/mapper/datavg1-lvol01
9.) If applicable, mount the filesystem(s).
If the LV contains a raw database instead of a filesystem, steps 8 and 9 will not be applicable; instead, the database engine may be started at that point.
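Steps 4 through 9 can be sketched as a dry run that only prints the commands. The hostnames and VG/LV names come from this thread; the `/data` mountpoint is an assumption. Swap the body of `run` for `"$@"` to execute for real, and only after confirming on every node that the VG is inactive.

```shell
#!/bin/sh
# Hedged dry-run sketch of the manual HA LVM takeover steps above,
# using the thread's names (db01/db02, datavg1/lvol01); the /data
# mountpoint is invented for illustration.
OLD_TAG=db02  NEW_TAG=db01  VG=datavg1  LV=lvol01

# Dry run: echo only. Replace the body with: "$@"  to really execute.
run() { echo "WOULD RUN: $*"; }

run vgchange --deltag "$OLD_TAG" "$VG"      # step 4: drop the stale tag
run vgchange --addtag "$NEW_TAG" "$VG"      # step 6: claim the VG for this node
run vgchange -a y "$VG"                     # step 7: activate the VG
run fsck -C0 "/dev/mapper/${VG}-${LV}"      # step 8: filesystem check
run mount "/dev/mapper/${VG}-${LV}" /data   # step 9: mount (mountpoint assumed)
```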
MK