MC/ServiceGuard Problem: cmcld

SOLVED
Mike Seerden
Advisor


Hi All,

I have two N-class systems running MC/ServiceGuard. One node is up and running, but I can't get the other node to join the cluster. Here is the output from running cmrunnode, followed by the relevant entries from syslog.log. Any help would be appreciated.

cmrunnode
cmrunnode : Waiting for cluster to form...........
cmrunnode : Node systemname unable to join cluster. Check syslog
on that node for information.

Output from syslog:

Aug 22 12:02:34 systemname CM-CMD[4119]: cmruncl
Aug 22 12:02:46 systemname CM-CMD[4126]: cmrunnode
Aug 22 12:02:46 systemname cmclconfd[4129]: Executing "/usr/lbin/cmcld" for node systemname.domain.name.com
Aug 22 12:02:47 systemname cmcld: Daemon Initialization - Maximum number of packages supported for this incarnation is 10.
Aug 22 12:02:47 systemname cmcld: Reserving 2248 Kbytes of memory and 74 threads
Aug 22 12:02:47 systemname cmcld: The maximum # of concurrent local connections to the daemon that will be supported is 25.
Aug 22 12:02:47 systemname cmcld: IP address 10.0.0.1 does not appear to be available
Aug 22 12:02:47 systemname cmclconfd[4129]: The ServiceGuard daemon, /usr/lbin/cmcld[4130], exited with a status of 1.
Aug 22 12:02:47 systemname cmlogd: Unable to initialize with ServiceGuard cluster daemon (cmcld): Connection reset by peer
Aug 22 12:02:47 systemname cmsrvassistd[4134]: Lost connection with ServiceGuard cluster daemon (cmcld): Connection reset by peer
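
The line that matters in this excerpt is the cmcld message about the IP address; the messages after it follow from cmcld exiting. As a minimal sketch (not a ServiceGuard tool), the Python below pulls such addresses out of syslog lines. The function name `unavailable_ips` is made up for illustration, and the pattern assumes the message wording is exactly as shown in the excerpt above.

```python
import re

# Matches the cmcld message seen in the excerpt above.
# Assumption: the wording is stable across ServiceGuard releases.
UNAVAILABLE_IP = re.compile(
    r"cmcld: IP address (\d{1,3}(?:\.\d{1,3}){3}) does not appear to be available"
)


def unavailable_ips(syslog_lines):
    """Return the addresses cmcld reported as unavailable, in order, without duplicates."""
    found = []
    for line in syslog_lines:
        match = UNAVAILABLE_IP.search(line)
        if match and match.group(1) not in found:
            found.append(match.group(1))
    return found


# Two lines copied from the excerpt above.
sample = [
    "Aug 22 12:02:47 systemname cmcld: Reserving 2248 Kbytes of memory and 74 threads",
    "Aug 22 12:02:47 systemname cmcld: IP address 10.0.0.1 does not appear to be available",
]
print(unavailable_ips(sample))
```

In practice the lines would come from the node's syslog.log rather than a hard-coded list.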

Thanks
4 REPLIES
CHRIS ANORUO
Honored Contributor

Re: MC/ServiceGuard Problem: cmcld

Mike, you have a network disconnection problem. Troubleshoot the cable connections.
When We Seek To Discover The Best In Others, We Somehow Bring Out The Best In Ourselves.
John Palmer
Honored Contributor
Solution

Re: MC/ServiceGuard Problem: cmcld

Mike,

From the syslog you supplied, cmcld is complaining about the loss of IP address 10.0.0.1. Is this configured as a heartbeat address on the server that fails to join the cluster?

If so, have you changed anything regarding the network configuration such that this address is no longer configured?

Check your ServiceGuard configuration file for HEARTBEAT entries and /etc/rc.config.d/netconf for your network settings. You can also run netstat -in to check which IP addresses are currently configured.
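
The comparison can be done by eye, but here is a rough Python sketch of the same check. It assumes the cluster ASCII file lists heartbeat addresses on lines starting with HEARTBEAT_IP, and that netstat -in prints a header row followed by rows with the address in the fourth column. The sample text, interface names, addresses other than 10.0.0.1, and all function names are hypothetical. Pass only the section of the ASCII file that belongs to the node being checked.

```python
def heartbeat_ips(cluster_ascii_text):
    """Collect addresses from HEARTBEAT_IP lines, ignoring '#' comments."""
    ips = []
    for line in cluster_ascii_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "HEARTBEAT_IP":
            ips.append(fields[1])
    return ips


def configured_ips(netstat_text):
    """Collect the fourth column of each row after the header (assumed layout)."""
    ips = set()
    for line in netstat_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) >= 4:
            ips.add(fields[3])
    return ips


def missing_heartbeat_ips(cluster_ascii_text, netstat_text):
    """Heartbeat addresses the cluster expects but the node does not have."""
    present = configured_ips(netstat_text)
    return [ip for ip in heartbeat_ips(cluster_ascii_text) if ip not in present]


# Hypothetical sample data, for illustration only.
ascii_section = """\
NODE_NAME systemname
  NETWORK_INTERFACE lan1
    HEARTBEAT_IP 10.0.0.1
"""
netstat_output = """\
Name Mtu  Network     Address      Ipkts Opkts
lan0 1500 192.168.1.0 192.168.1.10 1200  900
lan1 1500 10.0.0.0    10.0.0.9     300   280
lo0  4136 127.0.0.0   127.0.0.1    50    50
"""
print(missing_heartbeat_ips(ascii_section, netstat_output))
```

Any address this reports is one that the cluster configuration expects on the node but that no interface currently carries.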

Regards,
John

Re: MC/ServiceGuard Problem: cmcld

Sounds like you have a configuration problem.
There are two useful commands to help here.
1. Run cmquerycl -v -C /tmp/testascii -n <node1> -n <node2> and see what the cluster services actually see out there. Also compare the result with your cluster ASCII file.

2. Use the cmscancl command, which writes its output to /tmp/scancl.out, and then review this file for information on the networks and on what is in the cluster binary.

3. Check all of your network configurations for inconsistencies or changes.

HTH
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Mike Seerden
Advisor

Re: MC/ServiceGuard Problem: cmcld

Thank you all. You pointed me in the right direction. Somehow someone changed the IP address on one of the node's LAN cards. I am the only admin, but a few developers have some root capability. I will have to go through the logs to find out who did that.

All is working fine now.