Operating System - HP-UX

Re: Serviceguard - package doesn't failover when primary box is lost.

 
SOLVED
Ray Humpage
Frequent Advisor

Serviceguard - package doesn't failover when primary box is lost.

I have a new two-node Serviceguard installation. I can manually stop the package on the production box and manually start it on the failover box, and that all works fine. But when I physically power off the production box, the package does not migrate to the other box. I think I have all of the correct switches enabled. Any ideas?
Patrick Wallek
Honored Contributor

Re: Serviceguard - package doesn't failover when primary box is lost.

What does your 'cmviewcl -v' output look like when the package is running?

It sounds as if switching is not enabled for the package.
Mel Burslan
Honored Contributor

Re: Serviceguard - package doesn't failover when primary box is lost.

I second Patrick's opinion.

cmmodpkg -e your_pkg_name

should take care of that problem if it is the case.
________________________________
UNIX because I majored in cryptology...
Ray Humpage
Frequent Advisor

Re: Serviceguard - package doesn't failover when primary box is lost.

I've attached my cmviewcl -v.

Auto_run shows as disabled - but it's set as yes in my config file.

When I run cmmodpkg on all the nodes it does switch to enabled, so I think I'll be OK as long as I remember to do that every time I do anything to the cluster.

Alan Meyer_4
Respected Contributor

Re: Serviceguard - package doesn't failover when primary box is lost.

The fact that auto_run is configured as on in the conf file but displays as off in cmviewcl leads me to believe that cmapplyconf was not run after the last edit of the conf file.
" I may not be certified, but I am certifiable... "
Mel Burslan
Honored Contributor

Re: Serviceguard - package doesn't failover when primary box is lost.

Ray,

To prevent a package from bouncing back and forth between two or more nodes, the default behavior is that the package has to be re-enabled manually after the first failover. At this point, if you run the

cmmodpkg -e cadencepkg

command, you will see AUTO_RUN change to enabled. If you are not worried about the package bouncing between two nodes and want to keep auto_run enabled all the time, you need to change the failback policy (FAILBACK_POLICY) from its current MANUAL value to AUTOMATIC in the cadencepkg.conf file, but I strongly discourage this. If a package has failed over from its original home node to an adoptive node, it should not be blindly enabled to switch back before the sysadmin has determined the reason for the failover and taken corrective action.

hope this helps.
________________________________
UNIX because I majored in cryptology...
Devesh Pant_1
Esteemed Contributor

Re: Serviceguard - package doesn't failover when primary box is lost.

Ray,
I think keeping it the way it is is best. Manually enabling failover prevents the package from bouncing between crashing nodes, should that happen.

thanks
DP
Sameer_Nirmal
Honored Contributor
Solution

Re: Serviceguard - package doesn't failover when primary box is lost.

The package does not migrate to the other box because AUTO_RUN is disabled. AUTO_RUN must be enabled in the cluster for package switching to happen when a node fails. It also tells the cluster to start the package automatically on a node when the cluster starts. It is better to keep the failback policy set to manual.

It seems that the parameter is set in the package ASCII file but has not been applied at the cluster level. Run cmapplyconf -P with the package configuration file to apply the package configuration to the cluster, which should enable AUTO_RUN.

(The attached document shows that package monitoring is turned off.)

If you manually enable package switching with cmmodpkg just before shutting down the node, the package should fail over to the other node.
If you want the cluster to do this automatically, set AUTO_RUN to enabled.
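
For anyone following along, the steps above could be sketched roughly as below. This is only a sketch under assumptions: the package name cadencepkg and the file name cadencepkg.conf are taken from earlier in the thread, the relative path to the conf file is a guess, and the run wrapper and DRY_RUN variable are helpers made up for this example. By default it only prints the commands so you can review them first.

```shell
#!/bin/sh
# Sketch: validate and apply a package configuration, then re-enable switching.
# PKG and PKG_CONF are assumptions; adjust them to your own cluster.
PKG="${PKG:-cadencepkg}"
PKG_CONF="${PKG_CONF:-cadencepkg.conf}"
# DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

apply_pkg_config() {
    # Check the edited package file before touching the cluster.
    run cmcheckconf -P "$PKG_CONF" || return 1
    # Apply the package file to the cluster (cmapplyconf asks for confirmation).
    run cmapplyconf -P "$PKG_CONF" || return 1
    # Enable package switching (AUTO_RUN) for the package.
    run cmmodpkg -e "$PKG" || return 1
    # Show the result so AUTO_RUN can be checked by eye.
    run cmviewcl -v -p "$PKG"
}

apply_pkg_config
```

Set DRY_RUN=0 only on a cluster node, and only once the printed commands look right for your setup.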
Stephen Doud
Honored Contributor

Re: Serviceguard - package doesn't failover when primary box is lost.

As others have stated, the package AUTO_RUN value is DISABLED, preventing the package from failing over to ANY other node.

AUTO_RUN is usually ENABLED initially (set to YES in the package configuration file). However, this value can be changed to disabled manually or by an unexpected package start/stop failure. It's always best to inspect both the AUTO_RUN and Node_switching values after unexpected package failures, to help decide whether these switches need to be re-enabled.
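
As a rough illustration of that inspection, the AUTO_RUN column can be pulled out of cmviewcl -v style text with a small filter. This sketch assumes the package header row contains the words PACKAGE and AUTO_RUN and that the package row follows it; the exact layout may differ between Serviceguard versions. The sample text and the node name node1 are invented for the example and are not the poster's output, and the function name auto_run_status is hypothetical. It covers AUTO_RUN only, not the Node_switching values.

```shell
#!/bin/sh
# Print "<package> <auto_run>" for each package row found in
# cmviewcl -v style text read from standard input.
auto_run_status() {
    awk '
        # A header row containing both PACKAGE and AUTO_RUN: remember the column positions.
        /PACKAGE/ && /AUTO_RUN/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "PACKAGE")  pcol = i
                if ($i == "AUTO_RUN") acol = i
            }
            want = 1
            next
        }
        # The first non-blank line after that header is taken as the package row.
        want && NF > 0 {
            print $pcol, $acol
            want = 0
        }
    '
}

# Invented sample in the assumed layout (not real output from this cluster).
sample='    PACKAGE      STATUS       STATE        AUTO_RUN     NODE
    cadencepkg   up           running      disabled     node1'

printf '%s\n' "$sample" | auto_run_status
```

On a cluster node the same filter could be fed from cmviewcl -v directly instead of the sample text.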

SIDEBAR:
cmmodpkg has 3 usage forms:

1) Enable/Disable AUTO_RUN.
Think of this as a master circuit breaker, enabling or disabling the package from being moved to ANY other node.
Form: cmmodpkg -e|-d package_name

2) Enable/Disable Node_Switching.
Think of this as a per-server circuit breaker, enabling or disabling a specific node to run the package.
Form: cmmodpkg -e|-d -n node_name package_name

3) Resetting restart values for package services
Form: cmmodpkg [-v] -R -s service_name package_name
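
To make the difference between forms 1 and 2 concrete, here is a small sketch that only builds the command line and prints it, without running anything. The helper name cmmodpkg_cmd is made up for this example, and node2 is a placeholder node name; cadencepkg is the package name used earlier in the thread.

```shell
#!/bin/sh
# Build (but do not run) the cmmodpkg command line for the two switching forms.
# Usage: cmmodpkg_cmd enable|disable package_name [node_name]
cmmodpkg_cmd() {
    case "${1:-}" in
        enable)  flag="-e" ;;
        disable) flag="-d" ;;
        *)
            echo "usage: cmmodpkg_cmd enable|disable package_name [node_name]" >&2
            return 1
            ;;
    esac
    if [ -n "${3:-}" ]; then
        # Form 2: per-node switching, affects only the named node.
        echo "cmmodpkg $flag -n $3 $2"
    else
        # Form 1: package-wide AUTO_RUN, the master circuit breaker.
        echo "cmmodpkg $flag $2"
    fi
}

cmmodpkg_cmd enable cadencepkg
cmmodpkg_cmd enable cadencepkg node2
```

Printing the command first makes it easy to confirm whether you are about to flip the package-wide switch or just one node's switch.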
Mohanasundaram_1
Honored Contributor

Re: Serviceguard - package doesn't failover when primary box is lost.

Hi Ray,

What kind of messages do you see in the syslog.log file when you switch off the primary box?

Are you monitoring any subnet? The package will switch on a subnet failure only if that subnet is monitored.

So, please post the syslog.log file, the package conf file and the package control file.

With regards,
mohan.
Attitude, Not aptitude, determines your altitude