3 Node Cluster cmcld Process down

Ajin_1 · ‎09-04-2012

Hi experts,

We received a "cmcld process down" alert on one server.

We checked the process status using:

ps -ef |grep cmcld

root 5028 5026 0 Aug 21 ? 41:52 /usr/lbin/cmcld -j

On the other nodes it shows:

root 25811 25810 0 Apr 4 ? 444:22 /usr/lbin/cmcld -m -n node10101 -n node10301 -n node10201

root 26703 26702 0 Apr 4 ? 442:59 /usr/lbin/cmcld -m -n node10101 -n node10301 -n node10201

What does this mean?

/usr/lbin/cmcld -j

Thanks in advance

Thanks & Regards
Ajin.S
Proverbs 3:5,6 Trust in the Lord with all your heart and lean not on your own understanding; in all your ways acknowledge him, and he will make all your paths straight.

Matti_Kurkela · ‎09-04-2012

My guess is that the process with PID 5026 (the parent of the cmcld process in your ps output) is either "cmrunnode" or "cmruncl", and that it is running "cmcld -j" in an attempt to join an existing cluster. Maybe it cannot join for some reason?

In the other nodes, the cmcld processes are running in full cluster member mode, with a list of all the nodes the cluster is supposed to have.
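To compare nodes quickly, the distinction above can be turned into a small check. This is only a sketch: the function name classify_cmcld is made up for illustration, and the meaning it assigns to "-j" (trying to join) and "-m" (running as a member) is the reading given in this thread, not something taken from official documentation.

```shell
#!/bin/sh
# classify_cmcld CMDLINE
# Prints "joining", "member" or "unknown" for one cmcld command line.
# ASSUMPTION: "-j" = attempting to join, "-m" = full cluster member mode,
# as interpreted in this thread (hypothetical helper, not a Serviceguard tool).
classify_cmcld() {
    # Pad with spaces so a flag at the very start or end still matches.
    case " $1 " in
        *" -j "*) echo "joining" ;;
        *" -m "*) echo "member" ;;
        *)        echo "unknown" ;;
    esac
}

# The two command lines seen in this thread:
classify_cmcld "/usr/lbin/cmcld -j"
classify_cmcld "/usr/lbin/cmcld -m -n node10101 -n node10301 -n node10201"
```

On a live node, the same function could be fed from ps, for example: ps -ef | grep '[c]mcld' | while read -r line; do classify_cmcld "$line"; done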

Unless the cmcld process was properly shut down using the "cmhaltnode" command, the single node should have crashed and rebooted after cmcld died, as the kernel safety timer should have expired and triggered a TOC (Transfer of Control).
Did the single node reboot on Aug 21?

If it did not reboot, you probably should reboot it to be safe, and then check Serviceguard version and system patch level, and look for any indications of hardware failures: the cmcld process is normally very stable. It should not die without a reason.

What does the "cmviewcl" command report on the other nodes? If the single node is still down according to them, then there is a problem (possibly a network issue) that prevents the single node from hearing the cluster heartbeat from the other nodes, and/or prevents the other nodes from hearing any traffic from the single node.

As this is a 3-node cluster, the single node has lost cluster quorum and must assume that the other nodes may be happily running on the other side of the network break, using the cluster disks and IP addresses. So the single node MUST NOT use any cluster disks/IPs, until it either can communicate with at least one other cluster node again, or the sysadmin explicitly uses the cmruncl command to start the cluster in single-node mode (effectively telling Serviceguard that all the other nodes are down for sure).
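The quorum reasoning above comes down to simple arithmetic, which can be sketched as follows. The function name has_quorum is made up for illustration, and it models only a plain strict majority: an exact 50% split, which Serviceguard resolves with a cluster lock or quorum server, is simply treated as "no quorum" here.

```shell
#!/bin/sh
# has_quorum REACHABLE TOTAL
# Exit status 0 when REACHABLE nodes are a strict majority of TOTAL.
# ASSUMPTION: simplified model; the tie-breaker (cluster lock / quorum
# server) used for an exact 50% split is not modelled.
has_quorum() {
    [ $(( $1 * 2 )) -gt "$2" ]
}

# The isolated node sees only itself: 1 of 3.
if has_quorum 1 3; then echo "1 of 3: quorum"; else echo "1 of 3: no quorum"; fi

# The other two nodes still see each other: 2 of 3.
if has_quorum 2 3; then echo "2 of 3: quorum"; else echo "2 of 3: no quorum"; fi
```

This is why the isolated node has to stay away from the cluster disks and IP addresses, while the remaining pair can keep running the packages.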

MK

Ajin_1 · ‎09-06-2012

Hi MK,

Thanks for the prompt reply. I am posting answers to the questions you asked.

Did the single node reboot on Aug 21?

Yes, we have rebooted node2.

What does the "cmviewcl" command report on the other nodes?

All packages are running on the primary node.

Thanks in advance for your kind support.

Thanks & Regards
Ajin.S

Matti_Kurkela · ‎09-06-2012

>>What does the "cmviewcl" command report on the other nodes?

>All packages are running in Primary node.

Yes, but what does cmviewcl tell you about the status of the nodes? Do the other nodes think this node is halted or running? Or maybe starting? Or "unknown"?

MK
