Operating System - HP-UX

Node not joining to the cluster

 
Ganesan R
Honored Contributor

Hi,

We have a two-node cluster. One node went down because of a hardware issue. After we booted the node, it is unable to join the cluster. If I run #cmrunnode -v on the failed node, I get the error below.

cmrunnode : Unable to determine the nodes on the current cluster
cmrunnode : Either no cluster configuration file exists, or the file is corrupted, or /usr/lbin/cmclconfd is unable to run
Failed to lookup /cluster in configuration database.
Fail to load data from configuration database.

If I run the same command on the working node, it is able to join the cluster. I don't know why it does not join when I run the command on the failed node. I also copied the binary cluster configuration file from the working node to the failed node.

Can someone help with this?
Best wishes,

Ganesh.
8 REPLIES
melvyn burnard
Honored Contributor

Re: Node not joining to the cluster

What version of SG, and which SG patch, do you have installed?
Use:
what /usr/lbin/cmcld
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Ganesan R
Honored Contributor

Re: Node not joining to the cluster

The Serviceguard version is A.10.11, running on HP-UX 10.20.

# what /usr/lbin/cmcld
/usr/lbin/cmcld:
HP92453-02A.10.00 HP-UX SYMBOLIC DEBUGGER (END.O) $Revision: 74.03 $
Build date: Tue Oct 20 14:07:41 PDT 1998
A.10.11 $Revision: 80.465.1.5.1.25 $ $Date: 98/10/20 10:37:44 $
Daemon $Revision: 80.15.1.2 $
Cluster Monitor $Revision: 80.69.1.3 $
Command Srv $Revision: 80.5.1.1 $
Package Manager $Revision: 80.15.1.21 $
CommunicationSrv$Revision: 80.1 $
Network Sensor $Revision: 80.19.1.4 $
Remote Comm $Revision: 80.76.1.2 $
Local Comm $Revision: 80.20.1.1 $
Util $Revision: 80.34.1.1 $
Status DB $Revision: 80.4 $
API $Revision: 80.5 $
Config $Revision: 80.241.1.38 $
Config DB $Revision: 80.90.1.5 $
Service Sensor $Revision: 80.10.1.5 $
Sync $Revision: 80.1 $
Dlm $Revision: 80.5 $
Cluster LVM $Revision: 80.16.1.8 $

Best wishes,

Ganesh.
Aneesh Mohan
Honored Contributor

Re: Node not joining to the cluster

Hi Ganesan,

Can you check the permissions of the binary file first?

Please also check the following and reply with the results:

a) errors in the syslog

b) #cmgetconf

c) #cmviewconf

d) Serviceguard version

Thanks,
Aneesh


melvyn burnard
Honored Contributor

Re: Node not joining to the cluster

First of all, all support for Serviceguard on HP-UX 10.20 ended on June 30, 2003.
Secondly, you appear not to have a patched SG.
You should use swlist just to make 100% sure there is no SG patch installed (although the what string shows none) and then install PHSS_19473 on both nodes.
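
A minimal sketch of the swlist check, assuming patches show up as their own entries in the product-level listing (as they usually do on 10.20). The helper name has_patch is hypothetical; it only filters whatever listing is piped into it.

```shell
# Hypothetical helper: succeed if the given patch name appears as a whole
# field in a software listing read from stdin.
has_patch() {
    awk -v p="$1" '
        { for (i = 1; i <= NF; i++) if ($i == p) found = 1 }
        END { exit !found }
    '
}

# Possible usage on each node (assumes product-level swlist output):
#   swlist -l product | has_patch PHSS_19473 && echo "PHSS_19473 installed"
```

If the patch is not listed on either node, that matches what the what string already suggests.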
Deepak Kr
Respected Contributor

Re: Node not joining to the cluster

Hi,

Run the following and provide the output:

cmscancl -n node1 -n node2 -o /tmp/cmscancl.out



Also run cksum on the cmclconfig file on both nodes.
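
A rough sketch of that comparison, assuming both copies of the file are reachable as local paths (for example after copying the other node's file to /tmp). The function name same_cksum is made up for illustration.

```shell
# Hypothetical helper: compare CRC and byte count of two files.
# Returns 0 if they match, 1 if they differ, 2 if either cannot be read.
same_cksum() {
    # Reading from stdin makes cksum print "<crc> <bytes>" without the name,
    # so the two results can be compared as plain strings.
    a=$(cksum < "$1") || return 2
    b=$(cksum < "$2") || return 2
    [ "$a" = "$b" ]
}

# Possible usage (both paths are assumptions, adjust to your setup):
#   same_cksum /etc/cmcluster/cmclconfig /tmp/cmclconfig.othernode && echo same
```

Matching checksums only show the two copies are identical; they do not prove the file is valid.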

"There is always some scope for improvement"
Ganesan R
Honored Contributor

Re: Node not joining to the cluster

Hi Kumar,

The cksum of the cluster binary file is the same on both nodes.

Working node :
==============
#cksum cmclconfig
340177250 6940 cmclconfig

Alt node:
=========
#cksum cmclconfig
340177250 6940 cmclconfig

What I suspect is that the cmclconfd daemon is not running. It is supposed to be started by the inetd daemon, right?

I tried starting it manually and restarting the inetd daemon as well, but with no success.

How do we start it?
Best wishes,

Ganesh.
melvyn burnard
Honored Contributor

Re: Node not joining to the cluster

You don't!
Serviceguard fires it up when a request gets sent: inetd picks up the connection request, then opens the socket for the cmclconfd request.
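
Since inetd is what launches cmclconfd, one thing worth checking on the failed node is that the entries inetd needs are still in place. A minimal sketch follows; the service name hacl-cfg is an assumption based on typical Serviceguard installs, so compare against the working node rather than trusting it blindly. The function name is hypothetical.

```shell
# Hypothetical helper: report whether inetd is configured to launch cmclconfd.
# $1 = inetd configuration file, $2 = services file (defaults are the usual paths).
check_cmclconfd_inetd() {
    inetd_conf=${1:-/etc/inetd.conf}
    services=${2:-/etc/services}
    result=0
    # An uncommented line in inetd.conf must mention the cmclconfd binary.
    if grep -v '^[[:space:]]*#' "$inetd_conf" | grep 'cmclconfd' >/dev/null; then
        echo "inetd.conf: cmclconfd entry present"
    else
        echo "inetd.conf: no active cmclconfd entry"
        result=1
    fi
    # The service name used by that entry (assumed: hacl-cfg) must be defined.
    if grep -v '^[[:space:]]*#' "$services" | grep '^hacl-cfg[[:space:]]' >/dev/null; then
        echo "services: hacl-cfg entry present"
    else
        echo "services: no hacl-cfg entry"
        result=1
    fi
    return $result
}

# Possible usage on the failed node:
#   check_cmclconfd_inetd
```

Diffing the same two files against the working node is a simpler alternative if you can copy them across.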

You are on a VERY old SG version, as well as an unsupported OS version.

You could try halting the cluster and restarting it, but there are a few other things to try first.
On the node that does not join, if it is NOT running cmcld, check whether ANY cm* processes are running.
If they are, kill them and then try to get the node to rejoin the cluster.
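
A quick sketch for spotting leftover cm* processes, assuming ps -ef output is piped in. The filter name list_cm_procs is made up, and the pattern is deliberately loose, so it can also catch unrelated commands such as cmp. Look at the list before killing anything.

```shell
# Hypothetical helper: read `ps -ef` output on stdin and print lines whose
# command name starts with "cm", whether given bare or as part of a path.
list_cm_procs() {
    # " cm..." catches bare commands, "/cm..." catches full paths such as
    # /usr/lbin/cmcld; the second grep drops any grep process from the list.
    grep -E '[ /]cm[a-z]+' | grep -v grep
}

# Possible usage on the node that will not join:
#   ps -ef | list_cm_procs
```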

But I would strongly advise you to at least patch SG.
Deepak Kr
Respected Contributor

Re: Node not joining to the cluster

It is possible that cmclconfig is corrupted on this node. Did you run cksum on the cmclconfig file that was there before you copied over the one from the second node?

I fully agree with burnard here.

Is it possible for you to shut down the cluster?

You can get the running config using:

On the running node:
#cmgetconf -v -c clustername runningconfig.out

Please provide that.