Ronald Schwartz_1
Frequent Advisor

Testing ServiceGuard

I need to put together a test plan for ServiceGuard. What I need to do is fail over from the primary node to the alternate and back again. I think I can run cmhaltnode -f primNode to cause the failover to the alternate, but how do I bring it back to the primary?
Thanks
Chris Watkins_1
Respected Contributor
Solution

Re: Testing ServiceGuard

I like more realistic testing, if you can get away with it.
(At most places, I suspect you can't. I usually can't.)
Pull both LAN cables, TOC the box, etc.

Anyhow... To get back to the original node:

# cmhaltpkg pkgname
# cmrunnode nodename
# cmmodpkg -e pkgname
Not without 2 backups and an Ignite image!
sinhass
Regular Advisor

Re: Testing ServiceGuard

After that, run cmrunnode primnode, then cmhaltnode -f altnode, then cmrunnode altnode.

-sinhass
A. Clay Stephenson
Acclaimed Contributor

Re: Testing ServiceGuard

After restarting the node via cmrunnode, you do a cmhaltpkg and then a cmrunpkg specifying the desired node. By the way, I would not pursue schemes of automatic failback, because packages usually fail for complicated reasons and most packages should only be moved at non-critical times. I would always use manual methods to move a failed package back to its primary node.
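
For a written test plan, the manual failback described above could be scripted roughly as follows. This is only a sketch: pkg1 and nodeA are placeholder names, the run wrapper and the DRY_RUN variable are illustrative additions so that the commands are printed rather than executed by default, and the final cmmodpkg -e step assumes you want package switching re-enabled once the package is back on its primary node.

```shell
#!/bin/sh
# Sketch of a manual failback. DRY_RUN defaults to 1, so the
# ServiceGuard commands are only printed, not executed.
DRY_RUN=${DRY_RUN:-1}

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# failback PKG PRIMARY_NODE  (both arguments are placeholder names)
failback() {
    pkg=$1
    primary=$2
    run cmrunnode "$primary" || return 1           # rejoin the primary node to the cluster
    run cmhaltpkg "$pkg" || return 1               # halt the package where it currently runs
    run cmrunpkg -n "$primary" "$pkg" || return 1  # start it on the primary node
    run cmmodpkg -e "$pkg"                         # assumption: re-enable package switching
}
```

Calling failback pkg1 nodeA prints the four commands in order. Setting DRY_RUN=0 would execute them instead, which only makes sense on a cluster node during a planned test window.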

cmhaltnode is a good command to start testing with, but you should be much more aggressive later in your test plan: yank the power cord(s), yank disks and network cables, remove SCSI cables. And no, I am not joking.
If it ain't broke, I can fix that.
Sridhar Bhaskarla
Honored Contributor

Re: Testing ServiceGuard

Hi,

If you want to test the ServiceGuard functionality, then you will need to do a lot more testing than a simple cmhaltnode -f. cmhaltnode -f brings down the package, brings it up on the other node (if configured), and shuts down the local cluster daemon. I would do the following to test the failover.

First, make sure the package runs fine on both nodes. Then you are almost 75% done. The rest is the simulation of system failures.

cmhaltpkg pkg_name
cmrunpkg -v -n other_node pkg_name
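
As a rough sketch, the two commands above can be wrapped so the package is moved to the other node and back, with a cmviewcl -v after each move to check where it ended up. The names pkg1, nodeA and nodeB are placeholders, and the run wrapper with its DRY_RUN default is an illustrative addition that only prints the commands unless you turn it off.

```shell
#!/bin/sh
# Sketch: confirm a package runs on both nodes by moving it there and back.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
DRY_RUN=${DRY_RUN:-1}

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# move_pkg PKG NODE: halt the package, start it on NODE, show cluster status.
move_pkg() {
    run cmhaltpkg "$1" || return 1
    run cmrunpkg -v -n "$2" "$1" || return 1
    run cmviewcl -v
}

# round_trip PKG HOME_NODE OTHER_NODE: move to OTHER_NODE, then back home.
round_trip() {
    move_pkg "$1" "$3" || return 1
    move_pkg "$1" "$2"
}
```

Running round_trip pkg1 nodeA nodeB in dry-run mode lists the six commands in the order they would be issued, which can be pasted into the test plan as is.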


If it runs, then I would test various scenarios like

1. Shut down the server while the cluster is running
2. TOC the server while the cluster is running
3. Pull out the user data network cables
4. Pull out the heartbeat network cables
5. Kill the application processes, if you are monitoring the processes through ServiceGuard
6. Create problems on the server yourself, like consuming the swap space, overflowing the kernel parameters, etc. If your package is intelligent, it should pick up all of these issues.
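
To keep track of these scenarios in a test plan, something like the following could print a fill-in worksheet. It is only a sketch: the worksheet layout, the function name print_worksheet and the package name argument are all made up for illustration, and the script does not touch the cluster at all.

```shell
#!/bin/sh
# Sketch: print a fill-in worksheet for the failure scenarios listed above.
# Nothing here runs a ServiceGuard command; it only prints text.

# print_worksheet PKG  (PKG is a placeholder package name)
print_worksheet() {
    pkg=$1
    n=0
    while IFS= read -r scenario; do
        n=$((n + 1))
        printf '%d. %s\n' "$n" "$scenario"
        # Reminder to verify the result by hand with cmviewcl -v.
        printf '   check with cmviewcl -v: did %s fail over? [ ]\n' "$pkg"
    done <<'EOF'
Shut down the server while the cluster is running
TOC the server while the cluster is running
Pull out the user data network cables
Pull out the heartbeat network cables
Kill the monitored application processes
Exhaust a system resource (swap space, kernel parameters)
EOF
}
```

print_worksheet pkg1 prints two lines per scenario: the numbered scenario and a check box to tick after looking at cmviewcl -v.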

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
Chris Watkins_1
Respected Contributor

Re: Testing ServiceGuard

Doh.
No points on this one please. No cmmodpkg is needed after the cmrunnode.

AUTO_RUN will already be enabled after the cmrunnode.

Not that the cmmodpkg will cause any problems.
It will just tell you that switching was already enabled.


sidenote... AARRRRGGGHHH!
I am having to reply sometimes three, four times today,
to get one to go through successfully. Just had to vent. Carry on :-)
Not without 2 backups and an Ignite image!
Uday_S_Ankolekar
Honored Contributor

Re: Testing ServiceGuard

Ronald Schwartz_1
Frequent Advisor

Re: Testing ServiceGuard

Thanks, you answered my question, although I was hoping that the package could be switched back without the halt. I am planning a more aggressive test later.
Thanks again everyone.