Cluster formation upon reboot

 
Shahril M
Frequent Advisor

Cluster formation upon reboot

I have a 3-node cluster, of 2 N-class and 1 L-class.

AUTOSTART_CMCLD=1 for all nodes

If I were to reboot all 3 nodes together, the L would come up first.

Would there be any problems in cluster formation? I have heard that all 3 nodes must be up in order for cluster to form automatically. Is this true?

TIA.
Shahril

17 REPLIES
John Poff
Honored Contributor

Re: Cluster formation upon reboot

Hi,

If your L box boots first, the cluster daemons will start up and start trying to form a cluster. They will keep trying for a certain length of time before they give up. I don't remember what the default length of time is, but it seems like it is about 30 minutes or so. We've run two node and three node clusters before and booted them all at the same time, and I remember that the first node up will wait around for the other nodes to show up.

If you run into problems with one node coming up much faster than the others, you could manually start the cluster on just one node if you needed to, and then the other nodes would eventually join when they came up.

JP
Jeff Schussele
Honored Contributor

Re: Cluster formation upon reboot

Hi Shahril,

No problems.
The cluster will initially form with just the L & then as the Ns come up they'll join the cluster.
How the pkgs are defined, in terms of where they can run, will determine which pkg(s) start on the L. Keep that in mind when you define the pkgs.

Rgds,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
John Poff
Honored Contributor

Re: Cluster formation upon reboot

Hi again,

It looks like the default time is ten minutes. I thought it was thirty minutes, but I guess everything seems longer when you are waiting for machines to boot! :)

Here is the URL to the doc I found and a snippet that talks about the time value for cluster startup:

http://docs.hp.com/hpux/onlinedocs/B3936-90053/B3936-90053.html

Autostart Delay


The amount of time a node waits before it stops trying to join a cluster during automatic cluster startup. In the ASCII cluster configuration file, this parameter is AUTO_START_TIMEOUT. All nodes wait this amount of time for other nodes to begin startup before the cluster completes the operation. The time should be selected based on the slowest boot time in the cluster. Enter a value equal to the boot time of the slowest booting node minus the boot time of the fastest booting node plus 600 seconds (ten minutes).

Default is 600,000,000 microseconds in the ASCII file (600 seconds in SAM).
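
The sizing rule quoted above can be worked through as a small calculation. This is only a sketch of the arithmetic in the doc snippet (slowest boot time minus fastest boot time plus 600 seconds, expressed in microseconds for the ASCII file). The function name, node names and boot times are made up for illustration; measure your own nodes.

```python
MICROSECONDS_PER_SECOND = 1_000_000
BASE_MARGIN_SECONDS = 600  # the ten-minute margin from the quoted doc


def auto_start_timeout_us(boot_seconds):
    """Suggest an AUTO_START_TIMEOUT value in microseconds.

    boot_seconds maps a node name to its measured boot time in seconds.
    Rule from the doc: slowest boot - fastest boot + 600 seconds.
    """
    slowest = max(boot_seconds.values())
    fastest = min(boot_seconds.values())
    return (slowest - fastest + BASE_MARGIN_SECONDS) * MICROSECONDS_PER_SECOND


# Hypothetical boot times in seconds (assumptions, not measurements).
example_boot_seconds = {"l_class": 480, "n_class_1": 1200, "n_class_2": 1260}

# Print the value in the form used in the ASCII cluster configuration file.
print("AUTO_START_TIMEOUT", auto_start_timeout_us(example_boot_seconds))
```

If all nodes boot in the same time, the result is the 600,000,000 microsecond default.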


JP

Shahril M
Frequent Advisor

Re: Cluster formation upon reboot

Hi,

The L came up first as expected, but could not form the cluster. Here's the text from its syslog:

cmcld: Cluster formation failed
cmcld: Reason: Ran out of time for automatically joining a cluster
cmcld: Unable to contact all nodes in the cluster, thus it is not
cmcld: possible to join the cluster at this time
cmsrvassistd[1492]: Lost connection to the cluster daemon

Why wouldn't the L form a single-node cluster? Is there a configuration setting that prevents that from happening?

Incidentally, the last N to come up still could not form the cluster. cmviewcl indicated that this N was "starting" and trying to form the cluster, but also showed that the L and the other N were "down". Is this also a reason why the second N could not form the cluster?


Rgds,
Shahril
Elif Gius
Valued Contributor

Re: Cluster formation upon reboot

A node needs to know whether it is allowed to form a new cluster. If you have three nodes and all are in the "starting" state, they do not know whether the daemons on the other nodes are up and running, and therefore you get the error:
+++++
cmcld: Unable to contact all nodes in the cluster, thus it is not
cmcld: possible to join the cluster at this time
cmsrvassistd[1492]: Lost connection to the cluster daemon
+++++
When starting a cluster, more than 50% of the cluster nodes must be reachable. If not, the node does not have a majority and cannot form a cluster.
Elif Gius
Valued Contributor

Re: Cluster formation upon reboot

At reboot, /sbin/init.d/cmcluster is executed.
All nodes must be present and either attempting to form a cluster or already in a cluster before a booting node can form or join one. Your log shows that cluster formation failed after the timeout expired:

cmcld: Reason: Ran out of time for automatically joining a cluster

If you want one node to be able to form a new cluster, use this command:
++++
# cmruncl -n -n ...
++++
This is useful if one of the nodes cannot join the cluster at the moment (e.g. it takes longer to boot).

melvyn burnard
Honored Contributor
Solution

Re: Cluster formation upon reboot

All nodes in a cluster MUST be available to autostart the cluster in this case.
The default timeout is 10 minutes, as defined by AUTO_START_TIMEOUT.

If you suspect one node might take longer than average to boot, this parameter needs to be increased, but the default is normally sufficient.

If one node attempts to start the cluster and, by the time the 10-minute timeout is reached, NOT all nodes are available to start the cluster (i.e. at the point where they run the /sbin/init.d/cmcluster script), then the node that hits the timeout will CEASE attempting to start the cluster. Once the rest of the nodes reach the point of attempting to start the cluster, they will also sit trying to form a cluster, but will not get 100% quorum.
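
The behaviour described here can be modelled roughly as follows. This is a simplified sketch, not how cmcld is implemented: it assumes each node starts its own timer when it reaches the cmcluster script, and gives up for good if the last node has not arrived before that timer expires. The function names, node names and times are hypothetical, and the exact behaviour at the boundary is an assumption.

```python
def nodes_that_give_up(ready_at, timeout_seconds=600):
    """Return nodes whose autostart timer expires before the last node is ready.

    ready_at maps a node name to the time (in seconds) at which that node
    reaches the cluster start script. Simplified model, see the assumptions
    in the text above.
    """
    last_ready = max(ready_at.values())
    return sorted(
        name for name, t in ready_at.items() if t + timeout_seconds < last_ready
    )


def autostart_succeeds(ready_at, timeout_seconds=600):
    """Autostart needs every node still trying when the last one arrives."""
    return not nodes_that_give_up(ready_at, timeout_seconds)


# Hypothetical timings: the L is ready well before the two Ns.
ready_at = {"L": 0, "N1": 700, "N2": 900}

print(nodes_that_give_up(ready_at))        # nodes that stop trying with the default timeout
print(autostart_succeeds(ready_at))        # default 600-second timeout
print(autostart_succeeds(ready_at, 1200))  # with a larger timeout
```

In this model the fast node is the one that gives up, which matches the syslog from the L, and raising the timeout above the boot-time spread is what lets all three nodes meet.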
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Christopher McCray_1
Honored Contributor

Re: Cluster formation upon reboot

Hello,

Read this doc:

http://www1.itrc.hp.com/service/cki/docDisplay.do?docLocale=en_US&docId=200000062686678

This should explain all you need or want to know on the subject. It does raise the question, though, of why you would want to do this. You have a 3-node cluster and wanted to boot them all simultaneously? You must have had a really odd situation, because the idea behind ServiceGuard is that you can keep your applications up in the cluster while doing maintenance on other nodes in that cluster.

Did the L come up first? If so, how much faster than the Ns? Do you have the L set up to disregard all the system checks before booting?

The major point is, unless you are faced with impending disaster, you shouldn't boot every node in the cluster at the same time.

Hope this helps

Chris
It wasn't me!!!!
Oleg Zieaev_1
Regular Advisor

Re: Cluster formation upon reboot

While your cluster is up and running,
increase the AUTO_START_TIMEOUT value in the cmclconf.ascii file (please note that the value is in microseconds).
Make sure it is set the same on all three nodes.
Run cmapplyconf.
Test it at your next outage window.

Hope this helps.
Professionals will prevail ...