
Slawomir Gora
Honored Contributor

vg synchronize

I have the following situation:
1. A VG with two disks.
2. I lost the link to one disk - the VG is not active.
3. I activate the VG without quorum:
vgchange -a y -q n vgtest
4. vgdisplay vgtest shows me:
couldn't query physical volume
This is true, because the link to the disk is down -
but I can work - the VG is active.
5. Now the link to the disk comes back - it is up.
6. I tried vgsync to synchronize the volumes - but I got an I/O error.
7. When I deactivate and activate the VG again,
everything is OK, but it takes too much
time to synchronize. During the sync I can't
use the VG.

Question:
Is it possible to synchronize a volume group
that was activated without quorum (one disk
is unreachable)?
Elmar P. Kolkman
Honored Contributor

Re: vg synchronize

It is possible. Make sure the kernel sees the disks again by checking the output of 'ioscan -fnkC disk' to confirm the disk in question is CLAIMED. If not, run ioscan without the 'k' option and see if it is claimed then.

After that a vgsync should work. If not, do a vgcfgrestore, or even an lvreduce to remove the mirroring and then add the mirroring again.
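Elmar's fallback path could be sketched roughly as follows (untested here; the VG name vgtest, LV name lvol1, and device paths cXtYdZ are placeholders to replace with your own):

```shell
# Restore the LVM headers on the replaced disk from the saved VG configuration
vgcfgrestore -n /dev/vgtest /dev/rdsk/cXtYdZ

# If vgsync still fails: drop the stale mirror copy, then recreate it
lvreduce -m 0 /dev/vgtest/lvol1 /dev/dsk/cXtYdZ   # remove the mirror from that disk
lvextend -m 1 /dev/vgtest/lvol1 /dev/dsk/cXtYdZ   # re-mirror; this triggers a full copy
```

Note that the lvreduce/lvextend route rebuilds the mirror from scratch, so it is slower than a successful vgsync of only the stale extents.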

Hope you can do something useful with this answer...
Every problem has at least one solution. Only some solutions are harder to find.
Bill Hassell
Honored Contributor

Re: vg synchronize

The key is the words "I/O error". Something is wrong with one of the disks in your volume group. Very likely your log file /var/adm/syslog/syslog.log is full of error messages. You'll need to remove the disk from the volume group and replace it.
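A quick way to confirm this (a sketch; the device path is a placeholder) is to look for disk errors in syslog and then query the disk directly:

```shell
# Show recent LVM / I/O error messages from the system log
grep -i -e "lvm" -e "i/o error" /var/adm/syslog/syslog.log | tail -20

# A healthy disk should answer with its vendor, product and size
diskinfo /dev/rdsk/cXtYdZ
```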


Bill Hassell, sysadmin
Bernhard Mueller
Honored Contributor

Re: vg synchronize

Slawomir,

when you activate a VG using "-q n" and later on re-establish the link to a missing disk,
you should just enter
vgchange -a y vgname
This brings in the missing disks/links.
Then you enter
vgsync vgname
to sync the VG while it remains active.
The sync itself is always time-consuming.

Regards,
Bernhard
Rita C Workman
Honored Contributor

Re: vg synchronize

After you replaced the bad disk (of the mirrored pair), did you remember to run vgcfgrestore? Maybe these will help:

vgcfgrestore -n /dev/vgname /dev/rdsk/c-t-d-
vgchange -a y /dev/vgname
vgsync /dev/vgname

Or if your volume group is already active you might:

vgcfgrestore -R -n /dev/vgname /dev/rdsk/c-t-d-
vgsync /dev/vgname

Regards,
Rita
Slawomir Gora
Honored Contributor

Re: vg synchronize


I have tried ioscan -fnC disk,
but there is no change in the VG's behavior.
Bernhard Mueller
Honored Contributor

Re: vg synchronize

It is possible that you see the disk CLAIMED in ioscan and have no errors in the syslog file, but it may cause trouble anyway.

Try running:
dd if=/dev/rdsk/cXtYd0 of=/dev/null bs=1024k

My guess is that at some point
dd will bail out with an I/O error.

1. replace disk
2. use Rita's advice to proceed
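Bernhard's dd test, extended to walk every disk in the VG (a sketch; the device paths are placeholders for your two mirror halves), could look like:

```shell
# Surface-read each half of the mirror; a failing disk will make dd
# abort with an I/O error part-way through
for d in /dev/rdsk/c1t2d0 /dev/rdsk/c3t2d0; do
    echo "checking $d"
    dd if="$d" of=/dev/null bs=1024k || echo "$d failed the read test"
done
```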

Regards,
Bernhard
Slawomir Gora
Honored Contributor

Re: vg synchronize

Actually my problem is more difficult:
1. I have an MC/SG cluster (continental)
with two VA7110s - the distance
between the servers and the VA7110s is 600 m;
LUNs are mirrored between the VA7110s.
2. I simulate a catastrophe - one data center is powered off.
3. My application package is moved to the second server, which activates the VG, but one VA7110 is
powered off - the VG activation is done with the no-quorum option (one VA7110 is unreachable).
4. Now I power on the first data center,
but without deactivating and then
reactivating the VG, the volumes can't synchronize!
Bernhard Mueller
Honored Contributor

Re: vg synchronize

Slawomir,

I understand that the failover works, and the package runs in the second data center.

Then you power on everything in the first data center again. What do you do next?

In my opinion you would do the following on the node in the second data center where the package is running:
ioscan -fn
# all disks CLAIMED?
vgchange -a e vgXYZ
# re-activate all cluster vgs with all disks
vgsync vgXYZ

Only after the VGs are sync'd are you safe to switch back to data center one.
Did you do it this way, or anything different?
Slawomir Gora
Honored Contributor

Re: vg synchronize

Dear Bernhard,

After failing the package over to the second data center,
my package's VG has access to only
one VA7110 - in the VG status there are no available disks from the powered-off center (the mirror copies).

When I power on the first center, I have access from the second center to the LUNs in the first center;
everything works - ioscan, diskinfo - but the VG in the active package doesn't synchronize the mirror copies. When I do:
cmhaltpkg ...
cmrunpkg ...
synchronization works after the VG activation
(vgchange -a e -q n -s)

My problem is how to synchronize without
restarting the package.
Sanjay_6
Honored Contributor

Re: vg synchronize

Hi,

If you have activated the VG without quorum, you don't have to deactivate and reactivate the VG once you have fixed the quorum problem.

Once you have fixed the quorum problem, just activate the VG again manually without issuing the -q n option.

vgchange -a y -q n /dev/vg_name

fix the disk prob.

activate the vg again,

vgchange -a y /dev/vg_name

now do a vgsync.

vgsync /dev/vg_name

Hope this helps.

Regds
Sanjay_6
Honored Contributor

Re: vg synchronize

Hi,

Use "vgchange -a e /dev/vg_name" in the above example, then do a vgsync. Do this on the node where the VG was activated in non-quorum mode.

Hope this helps.

Regds
Pramod_4
Trusted Contributor

Re: vg synchronize

Once your disk is back online, a vgchange -a e will update the current VG status and allow you to do a vgsync.

Good Luck.
Hemanth Gurunath Basrur
Honored Contributor

Re: vg synchronize

Bernhard Mueller
Honored Contributor

Re: vg synchronize

Hello Slawomir,

Sanjay and Pramod have stressed what I was trying to tell you:
After the failover, DO NOT HALT THE PACKAGE.

Instead you should regain the links to the disks that were missing at the initial VG activation (by the package control script) by doing a
vgchange -a e
on all VGs in your package WHILE the PACKAGE is UP. Then:
vgsync vgxyz

ONLY AFTER the mirrors in the previously failed data center are back in sync can you safely bring down and restart the package.

This is my personal checklist whenever a failover in an MC/SG package occurs:
- make sure you know exactly what happened
- fix the fault
- make sure EVERYTHING is as it is supposed to be (no other fault)
- think twice before you do anything else
- think about it a third time
- try halting and starting on the same node
- try switching to an adoptive node

Regards,
Bernhard
Chris Watson
Super Advisor

Re: vg synchronize

I have the same setup as Slawomir, except mine are VA7410s.

Yesterday I deliberately powered off one of the VAs, together with the L3000 in the same rack (which was running our only package), thus simulating a building power failure (the failover node and the other VA7410 being in another building).

When the package came up on the other node (automatically), it came up clean (I had set 'no quorum' on vgchange). I thought 'this is okay'.

THEN I powered the VA and the L3000 back on (the L3000 is not further relevant to this problem), and a short time later an ioscan on my failover node, running the active package, showed the links were available. SO, I shut down the package and started it again. It was a painful 80 minutes until the package came back up.

The question is, if I read the vgchange man page correctly, that if I use the -s switch ('no resync') it will be ignored, because NOMWC applies in this case.

What I really want is for the VG to become active for use immediately on vgchange, with the stale-extent syncing going on in the background. I don't want certain scenarios putting an hour's delay into package startup!
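What Chris is after could in principle be sketched like this (untested; vgtest is a placeholder, and as he notes the -s option is only honoured where the LV's mirror consistency policy permits it, so NOMWC volumes may be forced to resync anyway):

```shell
# Activate exclusively WITHOUT synchronizing stale extents ...
vgchange -a e -s /dev/vgtest
# ... then push the stale-extent resync into the background,
# so the package can start using the VG immediately
vgsync /dev/vgtest &
```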

Further, the syncing that did occur in the foreground of package startup seemed to take almost as much time as creating a mirror copy would, even though at most about 10 MB of data changed during the power-off!

I note Bernhard's comment: DON'T STOP THE PACKAGE.

Finally, if I lose site-wide power, the VAs will go down at different times, due to their different UPS capacities. So when the nodes come back up, it seems I will have to go through all this again.

Basically, the problem is that I WILL have quorum if the package is restarted, and will thus get a foreground resync, and thus hours of waiting.

HELP!

PS: Can I get my cluster nodes to tell the VAs to shut down if the UPS sends a signal to the node?
