Could not read messages from /usr/lbin/cmcld: Software caused connection abort

CA942032 · 02-10-2004

O/S version - 11.11
S/G version - 11.15

I'm trying to build my initial cluster config and am getting the above error in the syslog. I'm also getting this when attempting to run cmruncl:

# cmruncl -v
Error: Permission denied to 127.0.0.1
Warning: Local node is not currently configured in a cluster
cmruncl : Unable to determine the nodes on the current cluster
cmruncl : Either no cluster configuration file exists, or the file is corrupted

This is after the cmcheckconf and cmapplyconf have successfully completed. Any help is appreciated. Thanks.

Dietmar Konermann · 02-10-2004

Hi Doug,

This looks like a node authorization problem. If you use .rhosts, then you need to add all nodes' root users to all nodes' ~root/.rhosts files. The same applies to /etc/cmcluster/cmclnodelist if you use that method.
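A minimal sketch of that check, assuming the authorization file uses a two-column "host user" layout. The function name check_node_auth is hypothetical and is not part of Serviceguard.

```shell
# Hypothetical helper: report cluster nodes lacking a "<node> root" entry
# in an authorization file (~root/.rhosts or /etc/cmcluster/cmclnodelist).
# Usage: check_node_auth FILE NODE...
check_node_auth() {
    auth_file=$1
    shift
    missing=0
    for node in "$@"; do
        # Match lines whose first field is the node and second field is root.
        if ! awk -v n="$node" '$1 == n && $2 == "root" { found = 1 } END { exit !found }' "$auth_file"; then
            echo "missing: $node root"
            missing=1
        fi
    done
    return "$missing"
}
```

Run it on every node against the same node list, for example: check_node_auth /etc/cmcluster/cmclnodelist cmaxx1 cmaxx2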

See the Serviceguard manual for details.

Best regards...
Dietmar.

"Logic is the beginning of wisdom; not the end." -- Spock (Star Trek VI: The Undiscovered Country)

CA942032 · 02-10-2004

Thanks for the quick followup.

This is what my .rhosts file looks like on both nodes, unless I need to add something else.

# cat /.rhosts
cmaxx2 root
cmaxx1 root

cmaxx1 and cmaxx2 being the hostnames of the machines.

Jeff Schussele · 02-10-2004

Hi,

I always use the /etc/cmcluster/cmclnodelist file & make sure that the shortnames resolve correctly on all nodes.
Sounds like somehow or another the cluster binary thinks either localhost or 127.0.0.1 belongs in the cluster.
I'd rerun the check & apply scripts, making sure that localhost is *not* referenced in the cluster ascii file.
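One way to look for such references before rerunning the check and apply steps is sketched below. The function name find_loopback_refs is hypothetical, and the pattern only skips lines that begin with a comment marker, so a trailing comment that mentions localhost would still be reported.

```shell
# Hypothetical helper: list non-comment lines of a cluster ascii file that
# mention localhost or a 127.x.x.x loopback address.
# Usage: find_loopback_refs FILE   (exit status 0 if any were found)
find_loopback_refs() {
    # First grep: numbered lines naming localhost or a loopback address.
    # Second grep: drop lines that are entirely comments.
    grep -n -i -E '(^|[^[:alnum:]_.-])(localhost|127\.[0-9]+\.[0-9]+\.[0-9]+)' "$1" |
        grep -v -E '^[0-9]+:[[:space:]]*#'
}
```

Pass it the path of the ascii file used for cmcheckconf and cmapplyconf; no output means no loopback reference was found.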

Rgds,
Jeff

PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!

Sridhar Bhaskarla · 02-10-2004

Hi,

Make sure each node can rlogin to itself.

Interestingly, "Local node is not currently configured in a cluster" doesn't necessarily point to a .rhosts or cmclnodelist problem. How are you building the configuration? Did you already generate the cluster ascii file and apply the configuration? Did you get any errors when you ran

#cmquerycl -C /etc/cmcluster/cmclconfig.ascii -n node1 -n node2

-Sri

You may be disappointed if you fail, but you are doomed if you don't try

CA942032 · 02-10-2004

I'm building the ascii file identical to your example, and also my notes, with the exception of course that I'm specifying the node names. I didn't receive any errors during the check or apply, but to be sure that I didn't, I've tried to rebuild this again with the same results.

The only thing I can think of is that the lock disk is on an EMC array, and not an HP-branded disk array. I don't see why this would cause any issues, because all other functionality to that disk is fine.

I've verified that all node names can be resolved, and that I can rlogin to myself on both nodes.

Any and all help is greatly appreciated. Thanks.

Sridhar Bhaskarla · 02-10-2004

Well - in that case make sure you have entries in /etc/inetd.conf for hacl-cfg (two: one with udp and the other with tcp). Also make sure you are not disallowing the local host in /var/adm/inetd.sec. If you see any hacl-cfg entry in that file, comment it out temporarily and see if it works. You will need to run 'inetd -c' after modifying the inetd.conf and inetd.sec files.
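The inetd.conf part of that check could be scripted roughly as follows. This is a sketch that assumes the standard inetd.conf column order (service, socket type, protocol, ...); the function name check_hacl_cfg is hypothetical.

```shell
# Hypothetical helper: confirm an inetd.conf-style file has active
# (uncommented) hacl-cfg entries for both udp and tcp.
# Usage: check_hacl_cfg [FILE]   (FILE defaults to /etc/inetd.conf)
check_hacl_cfg() {
    conf=${1:-/etc/inetd.conf}
    status=0
    for proto in udp tcp; do
        # Field 1 is the service name, field 3 the protocol.
        if awk -v p="$proto" '$1 == "hacl-cfg" && $3 == p { found = 1 } END { exit !found }' "$conf"; then
            echo "hacl-cfg/$proto: present"
        else
            echo "hacl-cfg/$proto: MISSING"
            status=1
        fi
    done
    return "$status"
}
```

A commented-out entry counts as missing, since its first field no longer equals hacl-cfg.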

-Sri

You may be disappointed if you fail, but you are doomed if you don't try

CA942032 · 02-10-2004

My /etc/inetd.conf was set correctly:
hacl-cfg dgram udp wait root /usr/lbin/cmclconfd cmclconfd -p
hacl-cfg stream tcp nowait root /usr/lbin/cmclconfd cmclconfd -c

However, I do not have a /var/adm/inetd.sec, which to me means that all ports are wide open.

Sridhar Bhaskarla · 02-10-2004

Since you already compiled the configuration, I would suggest you run 'cmscancl -v -o /tmp/cluster.out' and attach the output.

-Sri

PS: Please do not assign points until your problem is fixed. Particularly a 7 indicates that your problem is almost solved.

You may be disappointed if you fail, but you are doomed if you don't try

CA942032 · 02-10-2004

Very interesting results here. When I tried to run the cmscancl command, it would consistently fail on node 1 but succeed on node 2. Overall the command would fail because it couldn't stat node 1. I then compared the hostname on both nodes, and the second node was using only the node name. As it turns out, an admin here had set the hostname on my primary node to a fully qualified domain name. Even though I did not have an /etc/resolv.conf in place, /etc/nsswitch.conf was set to use FILES only, and /etc/hosts had alias definitions for all nodes, apparently SG wants to use DNS no matter what. Once I changed the hostname back to just the node name, cmscancl completed on all nodes, and now the cluster runs.
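A quick way to spot this condition on each node is sketched below. It simply treats any hostname containing a dot as fully qualified; the function name check_short_hostname is hypothetical.

```shell
# Hypothetical helper: flag a hostname that is fully qualified (contains a dot)
# and print the short form it would reduce to.
# Usage: check_short_hostname [NAME]   (NAME defaults to the output of `hostname`)
check_short_hostname() {
    name=${1:-$(hostname)}
    case $name in
        *.*)
            # Everything before the first dot is the short node name.
            echo "FQDN: $name (short name: ${name%%.*})"
            return 1
            ;;
        *)
            echo "short: $name"
            return 0
            ;;
    esac
}
```

Run with no argument on each node, it reports whether that node's own hostname is a short name or an FQDN.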

Thank you Sri for your patience and for leading me down the right trail to get this resolved.

Stephen Doud · 02-11-2004

Too late for me to add value other than to refer you to this document should you see other "permission denied" problems:

DOCUMENT ID: UMCSGKBRC00008185
TITLE: Cluster Configuration Commands Fail with "permission denied"

In there, this:

CAUSE 7: Hostname resolution services (whether local /etc/hosts or
DNS) may be supplying a mix of fully qualified domain names (FQDN)
with simple hostnames.
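One possible reading of that cause, for the local /etc/hosts case, is that some nodes resolve to an FQDN as their canonical (first) name while others resolve to a short name. The sketch below checks for that under this assumption; the function name check_hosts_mix is hypothetical and the knowledge-base document may describe other variants.

```shell
# Hypothetical helper: for each node, print the canonical (first) name on the
# hosts-file line that lists it, and fail if some are FQDNs and others short.
# Usage: check_hosts_mix FILE NODE...
check_hosts_mix() {
    hosts_file=$1
    shift
    fqdn=0
    short=0
    for node in "$@"; do
        # Strip comments, then take the first line naming the node in any name column.
        canon=$(awk -v n="$node" '{ sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == n) { print $2; exit } }' "$hosts_file")
        case $canon in
            "") echo "$node: not found" ;;
            *.*) echo "$node: $canon (FQDN)"; fqdn=1 ;;
            *) echo "$node: $canon (short)"; short=1 ;;
        esac
    done
    # Non-zero status only when both forms were seen.
    [ "$fqdn" -eq 0 ] || [ "$short" -eq 0 ]
}
```

For example: check_hosts_mix /etc/hosts cmaxx1 cmaxx2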

-StephenD.
