Operating System - HP-UX

node didnt automatically join cluster

 
Richard Woolley
Frequent Advisor

Yesterday a node in the cluster was rebooted; however, it did not come back into the cluster automatically.

The only clue so far is this entry from syslog:

saturn cmclconfd[7152]: Permission denied for user root on node saturn (/etc/cmcluster/cmclnodelist)

A long listing of cmclnodelist shows:

-rwxrwxr-x 1 root sys 219 Feb 19 2002 /etc/cmcluster/cmclnodelist

"saturn root" is an entry in this file.

Any ideas where else I could look?
melvyn burnard
Honored Contributor

Re: node didnt automatically join cluster

What do the other cluster nodes show?
Has it joined the cluster yet?
What version of OS and SG are you running, and what patch is on for SG?

My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Richard Woolley
Frequent Advisor

Re: node didnt automatically join cluster

It joined the cluster immediately after cmrunnode was run manually, so joining wasn't the problem. It just did not do it automatically, even though there is a 1 set in /etc/rc.config.d/cmcluster.

What should I look for on the other nodes?
Richard Woolley
Frequent Advisor

Re: node didnt automatically join cluster

O/S version 11.00
S/G version 11.13
PHSS_25915 1.0 MC/ServiceGuard and SG-OPS Edition A.11.13
PHSS_26928 1.0 MC/ServiceGuard and SG-OPS Edition A.11.13

I have reviewed the docs and checked that what they describe is correct on the machine.

cmclnodelist exists, as mentioned earlier, and contains the node itself, so I cannot understand why that error message appears.

saturn root - is an entry in this file.

Kent Ostby
Honored Contributor

Re: node didnt automatically join cluster

Mark --

Can you post the complete entries in the cmclnodelist files?

Also, please post the output of "hostname".

Also, which machine did you run the "cmrunnode" command from?

If you ran it from a machine other than "saturn", please post that cmclnodelist as well.

Finally, please post what you are using for name resolution, as that can play a part in a "permission denied" situation.

"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"
Richard Woolley
Frequent Advisor

Re: node didnt automatically join cluster

I have attached the cmclnodelist.

The output of hostname is "saturn".

The cmrunnode command was run from the node itself.

Name resolution is determined by the /etc/hosts file.

cheers.
monasingh_1
Trusted Contributor

Re: node didnt automatically join cluster

Do you have AUTOSTART_CMCLD=1 in /etc/rc.config.d/cmcluster?

This may be the culprit.
Hope this works.
Jeff Schussele
Honored Contributor

Re: node didnt automatically join cluster

Hi Mark,

Monasingh has probably pegged it. If you don't have
AUTOSTART_CMCLD=1 in /etc/rc.config.d/cmcluster
then the node will NOT auto-start the cmcl daemons and hence won't join the cluster until cmrunnode is executed.
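
One way to confirm the setting is to pull the effective value out of the file rather than eyeballing it. This is only a minimal sketch, assuming the usual NAME=value shell-assignment layout of files under /etc/rc.config.d; the helper name autostart_value is made up for illustration and is not part of ServiceGuard.

```shell
# Hypothetical helper: print the value of the last uncommented
# AUTOSTART_CMCLD assignment in a config file.
# Assumes plain NAME=value lines; quotes around the value are not stripped.
autostart_value() {
    file="${1:-/etc/rc.config.d/cmcluster}"
    # Commented-out lines do not match the pattern. The last assignment is
    # kept, since a later assignment overrides an earlier one when the file
    # is sourced by a shell.
    sed -n 's/^[[:space:]]*AUTOSTART_CMCLD=\([^[:space:]#]*\).*/\1/p' "$file" | tail -n 1
}
```

Running autostart_value with no argument reads /etc/rc.config.d/cmcluster. Anything other than 1 (including empty output) would point at the autostart setting rather than at permissions.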

Rgds,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
Richard Woolley
Frequent Advisor

Re: node didnt automatically join cluster

AUTOSTART_CMCLD=1

is what is in the /etc/rc.config.d/cmcluster file :(
Kent Ostby
Honored Contributor

Re: node didnt automatically join cluster

Here are a few other questions.

#1) Were there other nodes in the cluster that were rebooting at the same time, or was this the only one?

Since cmrunnode joins an existing cluster, if all of your nodes are down, then cmrunnode may not cause the node to join unless multiple nodes are running cmrunnode at the same time.

#2) Given the "permission denied" error, it leads back to the question of hostname resolution.

So please post the output of:

grep saturn /etc/hosts
and
nslookup saturn

Also, from the OTHER nodes in the cluster, please do:

grep saturn /etc/cmcluster/cmclnodelist

to compare with the cmclnodelist on this system.
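
The comparison across nodes can be made a little stricter than a plain grep. The sketch below assumes each entry in cmclnodelist is a "hostname user" pair on its own line, as in the "saturn root" entry quoted earlier, with # starting a comment; the helper name in_nodelist is hypothetical. It does a literal, case-sensitive match only, so whether ServiceGuard itself compares names this way is an assumption.

```shell
# Hypothetical helper: exit 0 if the file contains a line whose first two
# whitespace-separated fields are exactly the given host and user.
in_nodelist() {
    host="$1"
    user="$2"
    file="${3:-/etc/cmcluster/cmclnodelist}"
    awk -v h="$host" -v u="$user" '
        /^[[:space:]]*#/   { next }        # skip comment lines
        $1 == h && $2 == u { found = 1 }   # literal, case-sensitive match
        END                { exit !found } # 0 when found, 1 otherwise
    ' "$file"
}
```

For example, `in_nodelist saturn root || echo "no exact entry"` run on each node. A short name in the file will not match a fully qualified name passed in, which is the kind of mismatch the name-resolution questions are trying to expose.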
"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"
Richard Woolley
Frequent Advisor

Re: node didnt automatically join cluster

1. Afraid not; this is a 5-node cluster and this was the only one booting at the time.
(Though I just found out from HP Predict that the machine crashed yesterday with "init spawning too rapidly", which is also the subject of another forum thread.)

here's the info requested

Stephen Doud
Honored Contributor

Re: node didnt automatically join cluster

Hi Mark,

I dealt with a similar failure-to-join-cluster case some months ago, and the resolution took patience to discover through experimentation. In that case, Tivoli was altering /etc/syslog.conf and, because it started just before /sbin/init.d/cmcluster, it held off cmclconfd long enough to prevent it from working.

In that situation, we closely inspected the boot-time messages in /etc/rc.log and the console, looking for abnormal termination or error messages.
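
A first pass over the boot log can be automated before reading it line by line. This is a rough sketch only: the helper name boot_errors is hypothetical, and the keyword list is a guess at what "abnormal termination or error messages" might look like, not a list of strings that ServiceGuard or the rc scripts are known to emit.

```shell
# Hypothetical helper: list lines (with line numbers) in a boot log that
# contain common trouble keywords. The keyword list is an assumption.
boot_errors() {
    file="${1:-/etc/rc.log}"
    grep -n -i -E 'error|fail|denied|abort' "$file"
}
```

With no argument it reads /etc/rc.log; the line numbers make it easier to see what was starting just before and after each hit.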

Since the error condition in this issue is "permission denied", we have to assume that at the time the cluster is performing cmrunnode, either the local host or the cluster manager node is not performing hostname resolution properly, or is not permitting root to operate at that time. What is "hosts" set to in /etc/nsswitch.conf?
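
To answer the nsswitch.conf question quickly, the lookup order can be read straight from the file. A minimal sketch, assuming the standard "hosts: source ..." line format; the helper name hosts_order is made up for illustration.

```shell
# Hypothetical helper: print the lookup sources from the first uncommented
# "hosts:" line of an nsswitch.conf-style file.
hosts_order() {
    file="${1:-/etc/nsswitch.conf}"
    # Strip the "hosts:" key and leading blanks, print the rest of the line.
    sed -n 's/^[[:space:]]*hosts:[[:space:]]*//p' "$file" | head -n 1
}
```

Empty output means there is no uncommented hosts line in the file, in which case the system falls back to whatever its default lookup order is.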

Does the condition repeat itself?
Are there any clues in syslog.log?
Does inetd start properly (syslog.log)?

I suggest you open a case with the response center to continue to narrow down the root cause. As I said before, the cause is not obvious.

-S.