
Re: Service Guard cluster startup

 
Larry Barclay
Occasional Contributor

Service Guard cluster startup

Is it possible to configure a 4-node SG cluster (SG 11.14) so that, with all 4 nodes starting from a powered-off state, the cluster would still form if only 3 of the 4 nodes were able to boot?
We had a case where building power was lost - UPS didn't work, all servers crashed. When power was returned only 3 of the 4 servers came up but the cluster could not form.
Sanjay_6
Honored Contributor

Re: Service Guard cluster startup

Hi Larry,

you have to forcibly start the cluster in that sort of situation.

cmruncl -f -n node1 -n node2 -n node3

Hope this helps.

Regds
Larry Barclay
Occasional Contributor

Re: Service Guard cluster startup

Thank you for your very quick response!

Do you have to wait for the cluster to fail to form before you can issue this command?
Sanjay_6
Honored Contributor

Re: Service Guard cluster startup

Hi,

If you have configured the cluster to start automatically during system startup, i.e. if the variable AUTOSTART_CMCLD is set to 1 in the file /etc/rc.config.d/cmcluster on all servers, then in your situation you have to wait for some time, say 10 minutes, before the automatic cluster startup fails. Most of the time what I do is log in on the servers, kill the processes "cmcld" and "cmclconfd" (I forget exactly which processes are there, but I kill all SG processes), and then manually force the cluster to start with fewer nodes than the original cluster, using the command syntax I gave before.
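
For reference, the autostart setting described above is a plain shell-style variable assignment. A minimal fragment, with the comments added here for illustration, might look like this:

```shell
# /etc/rc.config.d/cmcluster (fragment)
# 1 = attempt to form or join the cluster automatically at boot
# 0 = do not autostart; the cluster must be started by hand
AUTOSTART_CMCLD=1
```

Setting the value to 0 on every node would avoid the automatic start attempt altogether, at the cost of always needing a manual start after a reboot.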

Hope this helps.

Regds
melvyn burnard
Honored Contributor

Re: Service Guard cluster startup

The design criterion for SG is that 100% of nodes MUST be available to allow a cluster to start automatically.
The time they will keep trying to form the cluster is set by the AUTO_START_TIMEOUT parameter in the cluster ASCII file; the default is 10 minutes.
There is NO way to allow just three nodes to autostart if the cluster expects 4 nodes.
You will have to wait until the above timeout has expired BEFORE you can issue the manual override command:
cmruncl -v -n node1 -n node2 -n node3
You may want to read the Managing MC/ServiceGuard manual at
http://docs.hp.com/hpux/ha
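
To check how long the automatic formation attempt will last on a given cluster, the AUTO_START_TIMEOUT line can be read out of the cluster ASCII file. The sketch below assumes the value is written in microseconds (so the 10 minute default would appear as 600000000) and that the keyword and value are separated by whitespace; the function name is made up for this example.

```shell
#!/bin/sh
# Sketch: read cluster ASCII file text on stdin and print the
# AUTO_START_TIMEOUT value converted to minutes.
# ASSUMPTION: the value is in microseconds (60000000 per minute).
auto_start_timeout_minutes() {
    # Commented-out lines start with '#', so their first field never
    # matches the bare keyword and they are skipped.
    awk '$1 == "AUTO_START_TIMEOUT" { print $2 / 60000000 }'
}
```

It would be used as `auto_start_timeout_minutes < your_cluster_ascii_file`, where the file name is whatever your site uses.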
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Stephen Doud
Honored Contributor

Re: Service Guard cluster startup

When configured to autoform a cluster at boot time, the server executes a "cmrunnode" command. That cmrunnode will expire in 10 minutes unless it's killed. Kill it on all nodes running it, and then they can respond to a 'cmruncl' command for specific nodes.

Custom programming would be necessary to cause the right set of nodes to form a reduced cluster.
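
A rough idea of what such custom programming could look like is sketched below. It only works out which nodes to include and prints the cmruncl command; it does not run it. The function names are invented for this example, and node_is_up is a placeholder: how you decide that a node is really up (and not merely cut off from the network) is site-specific and is deliberately left unimplemented.

```shell
#!/bin/sh
# Sketch only: choose the nodes that are up and PRINT a cmruncl command.
# Nothing here starts or stops a cluster.

# ASSUMPTION: node_is_up is a hypothetical, site-specific check.
# This default reports every node as down, so no node is selected
# until you replace it with a real test.
node_is_up() {
    return 1
}

# Print the subset of the given nodes for which node_is_up succeeds.
select_up_nodes() {
    up=""
    for node in "$@"; do
        if node_is_up "$node"; then
            up="$up $node"
        fi
    done
    # Unquoted on purpose: drops the leading space.
    echo $up
}

# Print a cmruncl command line naming the given nodes.
build_cmruncl_cmd() {
    if [ "$#" -eq 0 ]; then
        echo "no nodes given" >&2
        return 1
    fi
    cmd="cmruncl -v"
    for node in "$@"; do
        cmd="$cmd -n $node"
    done
    echo "$cmd"
}
```

With a real node_is_up in place, `build_cmruncl_cmd $(select_up_nodes node1 node2 node3 node4)` would print the command for whichever nodes answered. Printing rather than executing keeps a human in the loop, which seems wise given that forming a cluster while a "missing" node is actually alive is exactly what the 100% rule is meant to prevent.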
-S.
John Poff
Honored Contributor

Re: Service Guard cluster startup

Hi,

Let's see, you COULD write a script that checks for the presence of the other servers at boot time, and then have it build the cluster and packages from scratch, and then bring up the cluster. That would be kind of fun. A self creating cluster! Old number four doesn't show up? No problem. We just won't invite him to our new cluster. :)

Probably your instances of losing a server are so rare that you really don't need to worry about it. Sometimes the hardware just fails and there isn't much you can do about it, except to call HP and keep your maintenance at the right level. We've run MC/SG for several years now, and doing the 'cmruncl -n ...' has been the way out of similar problems for the very few times it has happened.

JP

P.S. You HP guys probably either cringe or laugh when you read my posts. :)

Michael Steele_2
Honored Contributor

Re: Service Guard cluster startup

You know, that was interesting, Melvyn. I didn't know that about MC/SG.

Larry, there is a failover policy that sounds similar to this startup dilemma.

In a four-node cluster you can list the failover order in the package configuration file, or you can have the cluster pick the node with the lightest load (the fewest running packages) and fail over to it. The latter policy is called MIN_PACKAGE_NODE; the former is called CONFIGURED_NODE and is the default.
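
As an illustration, the relevant part of a package configuration file might look like the fragment below. The package and node names are made up, and only the lines that bear on failover order are shown.

```
# Package configuration file (fragment, hypothetical names)
PACKAGE_NAME        pkg1

# CONFIGURED_NODE (default): fail over to the next node in the
# NODE_NAME list below.
# MIN_PACKAGE_NODE: fail over to the node running the fewest packages.
FAILOVER_POLICY     CONFIGURED_NODE

# Order matters with CONFIGURED_NODE: first entry is the primary.
NODE_NAME           node1
NODE_NAME           node2
NODE_NAME           node3
NODE_NAME           node4
```

Note that this only controls where a package goes once a cluster is running; it does not change the requirement that all nodes be present for an automatic cluster start.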
Support Fatherhood - Stop Family Law