Operating System - HP-UX

Service Guard make_recovery lvmmtab

 
Marcus Höfner
Occasional Advisor

Service Guard make_recovery lvmmtab

Hi,

In a two-node cluster I have created a recovery tape. At the time of creation, all packages were running on the other node. I then made a recovery of the server. Everything runs fine, but when I want to start a package on the recovered node, it fails. In the package log I found the following message:
vgchange: Volume group "/dev/vg03" does not exist in the "/etc/lvmtab" file.
ERROR: Function activate_volume_group
ERROR: Failed to activate vg03
strings /etc/lvmtab shows that only the disks of vg00 are in the lvmtab.

I have searched the forum and made some tests:
A simple vgscan -a -v does not work. The other suggested solution is a vgexport/vgimport procedure, well known from MC/ServiceGuard package creation. It is not easy for me to accept that this should be the only solution, because the correct device files are available under /dev/vgxx, and the files in the directory /etc/lvmconf (vgxx.conf, vgxx.mapfile) are there as well.
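What I mean is the classic procedure with explicit device files, roughly like this (only a sketch; the VG name, the minor number, the disk devices and the host name othernode are examples and have to be adapted):

# on the node that has vg03 in its lvmtab: preview the export and write a map file
vgexport -p -v -m /tmp/vg03.map vg03
rcp /tmp/vg03.map othernode:/tmp/vg03.map

# on the other node: recreate the group file and import with explicit PV paths
mkdir /dev/vg03
mknod /dev/vg03/group c 64 0x030000
vgimport -v -m /tmp/vg03.map vg03 /dev/dsk/c4t0d0 /dev/dsk/c5t0d0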

Yes, I am lazy, but on the other hand it is easy to make a mistake during the vgexport/vgimport procedure.

Every hint is welcome

Thank you in advance
Marcus
Stephen Doud
Honored Contributor

Re: Service Guard make_recovery lvmmtab

Yes Marcus, you are lazy - but so am I! We lazy types are the most efficient, get the most done, and our bosses keep us around :)

I can't vouch for the content of /etc/lvmtab in the make_tape_recovery files, so the easiest way to re-load lvmtab is to make and save map files for the cluster VGs in /etc/lvmconf.
Use the vgexport -vs options to put the VGID at the top of the file.

After the Ignite rebuild and the mkdir /dev/vgname; mknod /dev/vgname/group commands, use the vgimport -s -m options to get LVM to scan the backplane for all disks and load those that have a matching VGID.
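For example, only a sketch (vg03, the minor number 0x030000 and the map file path are placeholders, take the values from your own cluster):

# while the cluster is healthy: write a map file that carries the VGID
vgexport -p -v -s -m /etc/lvmconf/vg03.mapfile vg03

# on the recovered node, after the Ignite rebuild
mkdir /dev/vg03
mknod /dev/vg03/group c 64 0x030000
# -s lets vgimport scan the disks for the VGID stored in the map file
vgimport -v -s -m /etc/lvmconf/vg03.mapfile vg03
# activation is then left to the package control script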

That should do it.
A. Clay Stephenson
Acclaimed Contributor

Re: Service Guard make_recovery lvmmtab

You are really expecting too much of Ignite, especially in a ServiceGuard environment. Consider the case where you are restoring to another machine; you may have different cards in different slots. All I ever want Ignite to do for me is to get vg00 back with all the patches and kernel tunings. From that point forward, if anything goes wrong, I want to be responsible for it and not trust anyone else's code to perform properly. Also, before I ever start a package on a recovered node, I want to make sure that all my network connections are fully tested. You should also consider packages which share raw disks or filesystems between nodes; I definitely want full control over the imports in that case.

I'm lazy too. I make sure that I never need an Ignite image and I literally never have used one except during "play" or DR testing.
Amazingly, I have never even had a package failover other than those I commanded myself --- and this represents over 10 years of MC/SG (oops, that's SG now) experience. During that same period, I have had tens of disk failures, NIC failures, and power supply failures -- but none of those required a shutdown and all were fixed "on the fly"; nor have I had an OS crash during that period.

Nevertheless, I still take weekly Ignite images (in addition to lifeboat disk images) so that I never need them.


If it ain't broke, I can fix that.
melvyn burnard
Honored Contributor

Re: Service Guard make_recovery lvmmtab

What version of Ignite are you using?
There is a known issue with older versions.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Marcus Höfner
Occasional Advisor

Re: Service Guard make_recovery lvmmtab

Thank you for the hints.
Here are my remarks:

@Stephen: I have recovered the original lvmtab from my backup. Everything works fine.

@A. Clay Stephenson: You are right. Ignite has its limits, and that is good. FYI, the recovery was a test my customer requested. You gave me a very important hint: the standby network interface was configured by Ignite. The result was that the standby functionality was out of order and cmcheckconf aborted.

Here is my conclusion for a disaster recovery of a cluster node:
If you create a recovery tape, make a copy of the files /etc/lvmtab and /etc/rc.config.d/netconf (do not put the copy inside /etc/rc.config.d). After the recovery you can use these files to restore the old configuration. (This statement is of course without guarantee.)
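For example (again without guarantee; /var/tmp/recovery-save is only an example, the directory just has to live on vg00 so that it ends up in the recovery archive):

# before creating the recovery tape
mkdir -p /var/tmp/recovery-save
cp -p /etc/lvmtab /var/tmp/recovery-save/
cp -p /etc/rc.config.d/netconf /var/tmp/recovery-save/

# after the recovery: put the saved configuration back
cp -p /var/tmp/recovery-save/lvmtab /etc/lvmtab
cp -p /var/tmp/recovery-save/netconf /etc/rc.config.d/netconf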

I found a good thread for my network configuration problem:
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1106455

I will keep the thread open for a while. Please feel free to give additional hints. I think I have tested the cluster well, but ...

Thank you in advance
Marcus