cmcld message in syslog

 
kholikt
Super Advisor

cmcld message in syslog

Hi,

Any idea why I get the following messages quite frequently, usually every 3 to 4 days? Is this due to high CPU usage or something else?

May 15 23:47:04 node2 cmcld: Attempting to adjust cluster membership
May 15 23:47:04 node2 cmcld: Warning: cmcld process was unable to run for the last 8 seconds,
May 15 23:47:04 node2 cmcld: which is longer than the node timeout (8 seconds)
May 15 23:47:04 node2 inetd[15365]: registrar/tcp: Connection from node2.esc (10.30.11.204) at Sun May 15 23:47:04 2005
May 15 23:47:04 node2 cmcld: Obtaining Cluster Lock
May 15 23:47:05 node2 cmclconfd[2650]: Updated file /var/adm/cmcluster/frdump.cmcld.8 for node node2 (length = 512096).
May 15 23:47:05 node2 cmcld: Turning off safety time protection since the cluster
May 15 23:47:05 node2 cmcld: may now consist of a single node. If ServiceGuard
May 15 23:47:05 node2 cmcld: fails, this node will not automatically halt
May 15 23:47:08 node2 cmclconfd[2650]: Updated file /var/adm/cmcluster/frdump.cmcld.9 for node node2 (length = 6827).
May 15 23:47:09 node2 cmcld: Enabling safety time protection
May 15 23:47:09 node2 cmcld: Attempting to adjust cluster membership
May 15 23:47:09 node2 cmcld: Clearing Cluster Lock
May 15 23:47:09 node2 cmcld: Resumed updating safety time
May 15 23:47:10 node2 cmcld: 2 nodes have formed a new cluster, sequence #15
May 15 23:47:10 node2 cmcld: The new active cluster membership is: node2(id=2), node1(id=1)
abc
5 REPLIES
Steven E. Protter
Exalted Contributor

Re: cmcld message in syslog

Heartbeat seems to have been lost in your cluster.

The messages indicate that the second node, node2, has dropped out of the cluster.

Possible causes:
* Heartbeat lost due to network congestion
* Network problem with cables, switch or hub.
* Software or hardware problem on node2 causing it to crash or otherwise go offline.

I recommend looking for causes and log entries on node2.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Hoang Chi Cong_1
Honored Contributor

Re: cmcld message in syslog

Hello,

These messages mean that node2 in your cluster had some trouble.
"May 15 23:47:04 node2 cmcld: Attempting to adjust cluster membership
May 15 23:47:04 node2 cmcld: Warning: cmcld process was unable to run for the last 8 seconds,
May 15 23:47:04 node2 cmcld: which is longer than the node timeout (8 seconds)
May 15 23:47:04 node2 inetd[15365]: registrar/tcp: Connection from node2.esc (10.30.11.204) at Sun May 15 23:47:04 2005
May 15 23:47:04 node2 cmcld: Obtaining Cluster Lock
May 15 23:47:05 node2 cmclconfd[2650]: Updated file /var/adm/cmcluster/frdump.cmcld.8 for node node2 (length = 512096).
May 15 23:47:05 node2 cmcld: Turning off safety time protection since the cluster
May 15 23:47:05 node2 cmcld: may now consist of a single node. If ServiceGuard
May 15 23:47:05 node2 cmcld: fails, this node will not automatically halt
May 15 23:47:08 node2 cmclconfd[2650]: Updated "
These messages show that node2 dropped out of the cluster, and the last messages in your log show that node2 then rejoined it.
This error is not caused by high CPU usage.
As SEP recommends, please check the heartbeat connection on node2.
In general, the likely cause is a network problem, or node2 going down and coming back up.
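
To see how long node2 was out of the cluster each time, the syslog entries can be paired up: the first "Attempting to adjust cluster membership" line marks the start, and the "nodes have formed a new cluster" line marks the end. Below is a rough Python sketch of that idea. The regular expressions are based only on the lines quoted in this thread, so other ServiceGuard versions may word the messages differently. The function name `reformations` and the `year` parameter are made up for this example (syslog lines carry no year; 2005 comes from the inetd line in the log).

```python
import re
from datetime import datetime

# Patterns derived from the log lines quoted in this thread (assumption:
# other ServiceGuard releases may use different wording).
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(\S+)\s+cmcld: (.*)$")
FORMED_RE = re.compile(r"(\d+) nodes have formed a new cluster, sequence #(\d+)")

def reformations(lines, year=2005):
    """Return (start, node_count, sequence, seconds) per cluster re-formation.

    'year' is a hypothetical parameter: syslog timestamps omit the year.
    """
    start = None
    results = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a cmcld line
        stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        msg = m.group(3)
        # Remember only the first adjustment attempt of each episode.
        if start is None and msg.startswith("Attempting to adjust cluster membership"):
            start = stamp
        formed = FORMED_RE.search(msg)
        if formed and start is not None:
            seconds = (stamp - start).total_seconds()
            results.append((start, int(formed.group(1)), int(formed.group(2)), seconds))
            start = None
    return results

sample = [
    "May 15 23:47:04 node2 cmcld: Attempting to adjust cluster membership",
    "May 15 23:47:04 node2 cmcld: Obtaining Cluster Lock",
    "May 15 23:47:09 node2 cmcld: Attempting to adjust cluster membership",
    "May 15 23:47:10 node2 cmcld: 2 nodes have formed a new cluster, sequence #15",
]
for start, nodes, seq, seconds in reformations(sample):
    print(start, nodes, seq, seconds)
```

For the log posted above, this pairs 23:47:04 with 23:47:10, so the cluster re-formed about 6 seconds after the first adjustment attempt.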

Regards,
Hoang Chi Cong
Looking for a special chance.......
Sumit Ghoshal
Occasional Advisor

Re: cmcld message in syslog

Hi,

The messages above indicate a problem in the cluster. cmcld is the main MC/ServiceGuard process: it is the cluster daemon and runs on every node in the cluster. It initializes the cluster, monitors the health of the cluster, and recognizes when a failure occurs within it.
In your case, it seems the cluster broke and node2 dropped out of it. Check the output of: cmviewcl -v
melvyn burnard
Honored Contributor

Re: cmcld message in syslog

The messages indicate that your system was unable to run the cmcld daemon for a period of time.
This is usually an indication of what is called a mini-hang.
Something is making the system too busy to allow the daemon to run.
This could be due to patch levels, or it could be network issues keeping the system busy.
It is also more likely on single-processor systems, or on systems with only a few processors that run intensive applications.

If you wish to get to the bottom of this, I would recommend you log a support call with your local HP Response Centre. They should be able to supply you with a utility known as timer9 to help find what is causing this.
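
While waiting on support, it may help to list every time the warning has appeared, so the dates can be compared against backups, batch jobs or other heavy activity. The following is only a minimal Python sketch, not an HP tool: the pattern is copied from the warning quoted in this thread, and the function name `find_stalls` is invented for the example.

```python
import re

# Pattern taken from the warning quoted in this thread; other ServiceGuard
# versions may word it differently (assumption).
STALL_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+cmcld: "
    r"Warning: cmcld process was unable to run for the last (?P<secs>\d+) seconds"
)

def find_stalls(lines):
    """Return (timestamp, host, seconds) for each cmcld stall warning."""
    hits = []
    for line in lines:
        m = STALL_RE.match(line)
        if m:
            hits.append((m.group("stamp"), m.group("host"), int(m.group("secs"))))
    return hits

# Sample lines copied from the original post.
sample = [
    "May 15 23:47:04 node2 cmcld: Attempting to adjust cluster membership",
    "May 15 23:47:04 node2 cmcld: Warning: cmcld process was unable to run for the last 8 seconds,",
    "May 15 23:47:04 node2 cmcld: which is longer than the node timeout (8 seconds)",
]
for stamp, host, secs in find_stalls(sample):
    print(f"{stamp} {host}: cmcld stalled for {secs} seconds")
```

To run it against a real log, pass an open file instead of the sample list, for example `find_stalls(open("/var/adm/syslog/syslog.log"))`.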
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Bharat Katkar
Honored Contributor

Re: cmcld message in syslog

Hi,
Looking at the replies above, node2 seems to have a problem, but I would recommend having a look at both node1 and node2.
If node1 doesn't get a heartbeat from node2, the problem may be on node1 as well.

Check the heartbeat connections, the NICs on both servers, and the switch or hub.

Look for any hardware problems reported for the NICs in /var/adm/syslog/syslog.log.

A separate subnet is recommended for the heartbeat in order to avoid congestion.

Hope that helps.
Regards,
You need to know a lot to actually know how little you know