
broke my cluster??

 
Mark Harshman_1
Regular Advisor

broke my cluster??

I'm running HP-UX 11i on an rp5450 cluster. I loaded some patch bundles on one server, and now I can barely communicate with the cluster. I had loaded the same patch bundles (June 2006) on a similar cluster with no issues. The messages I am getting are:

cmrunnode : Unable to determine the nodes on the current cluster
cmrunnode : Either no cluster configuration file exists, or the file is corrupted, or cmclconfd is unable to run
Fail to load data from configuration database.
Unable to open communications to configuration daemon: Not owner
Unable to connect to configuration database.


Not sure what I broke. When I run "cmviewcl" I only get partial output...

thanks
Never underestimate the power of stupid people in large groups
Patrick Wallek
Honored Contributor

Re: broke my cluster??

The first thing you should do is go through the patches you loaded and see which ones affect MC/SG. When you find those, go through the README files for those patches and see what special actions are required when they are loaded. I don't remember the details at the moment, but some MC/SG patches came with special instructions describing steps you had to take after loading them.

The other thing to do would be to use swremove to remove the patches that affect MC/SG.

Re: broke my cluster??

Mark,

Is this Serviceguard 11.16 by any chance?

If so, you may have installed a patch which changes the way Serviceguard does authentication. That change will have been described in the special instructions for the patch.

You should carefully review the following 2 documents and understand their implications:

http://docs.hp.com/en/6283/SGsecurityfiles.pdf

http://docs.hp.com/en/5874/securingserviceguard_nov2005.pdf

HTH

Duncan

I am an HPE Employee
Stephen Doud
Honored Contributor

Re: broke my cluster??

What version of Serviceguard are you working with?

What does syslog.log have for additional error messages or related messages?

We have seen this when a Serviceguard patch has been installed that includes a dependency on 'identd' but that daemon has been commented out of /etc/inetd.conf on one/both nodes.

Also, we have seen it when 'auth' is substituted where identd should be.
# grep 113 /etc/services

If it's wrong, change it and run 'inetd -c'
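
A minimal sketch of the two checks described above, assuming the file locations named in this thread. The function names `port113_entries` and `identd_inetd_entries` are hypothetical helpers for illustration, not HP-UX or Serviceguard commands; each one only reads the file it is given.

```shell
#!/bin/sh
# Sketch only: list the active (uncommented) identd-related entries.
# The file path is a parameter so the checks can be tried on a copy first.

# Print uncommented lines of a services file that refer to port 113/tcp.
port113_entries() {
    # $1 = path to a services file (normally /etc/services)
    grep -v '^[[:space:]]*#' "$1" | grep -E '[[:space:]]113/tcp'
}

# Print uncommented inetd.conf lines whose service name is ident or auth.
# No output means the entry is missing or commented out.
identd_inetd_entries() {
    # $1 = path to an inetd.conf file (normally /etc/inetd.conf)
    grep -E '^(ident|auth)[[:space:]]' "$1"
}

# Example use on a live system:
#   port113_entries /etc/services
#   identd_inetd_entries /etc/inetd.conf
```

If either function prints nothing, that file is the place to look before running 'inetd -c'.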


Mark Harshman_1
Regular Advisor

Re: broke my cluster??

I am actually running Serviceguard 11.14. My /etc/services file is fine. I installed these same bundles on another 11.14 cluster with no issues. Below are the messages from my syslog. Note that 127.0.0.1 is in my hosts file as localhost. Thanks

Jun 18 15:24:46 ippiux7 cmclconfd[13529]: Unable to connect to server 127.0.0.1 on port 113 (Connection refused).
Jun 18 15:24:46 ippiux7 cmclconfd[13529]: Unable to properly gather the remote user for fd 4. Please make sure the remote node is running identd.
Stephen Doud
Honored Contributor

Re: broke my cluster??

Port 113 is the identd port. Check /etc/inetd.conf to see whether the ident entry is commented out; if so, uncomment it and run 'inetd -c'.
Mark Harshman_1
Regular Advisor

Re: broke my cluster??

For some reason it seems I am having an issue with user name authentication. These are the log entries from the two nodes.

Jun 19 08:45:04 ippiux7 inetd[767]: ident/tcp: Died on signal 13
Jun 19 08:46:06 ippiux7 CM-CMD[11492]: cmrunnode
Jun 19 08:46:07 ippiux7 cmclconfd[11505]: The identd authenticated user name (@


also

Jun 19 08:45:58 ippiux8 cmclconfd[19155]: The identd authenticated user name () did not match with the sender user name (root) on ippiux7. Exiting.
Jun 19 08:46:06 ippiux8 cmclconfd[19157]: The identd authenticated user name () did not match with the sender user name (root) on ippiux7. Exiting.

Stephen Doud
Honored Contributor
Solution

Re: broke my cluster??

Those last messages were a good clue:
"cmclconfd[26007]: The identd authenticated user name (0) did not
match with the sender user name (root)"

If you have configured /var/adm/inetd.sec, include the following line in it:

ident allow 127.0.0.1

This symptom was caused by changing the server to a Trusted System.
identd normally executes with user 'bin', but when the server is
Trusted, it must execute as root.

To verify whether Trusted Systems is activated, use this command:

# /usr/sbin/authck -p

If not enabled, the following will result:
authck: cannot open Protected Password hierarchy.

If Trusted Systems is enabled, details about the hierarchy are listed.
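
As a rough sketch, the two outcomes described above could be told apart in a script like this. The function name `trusted_status` is hypothetical; the "not enabled" message text is the one quoted in this post, and treating any other non-empty output as "trusted" is an assumption of the sketch, not documented authck behavior.

```shell
#!/bin/sh
# Sketch only: classify output captured from "authck -p".
trusted_status() {
    # $1 = output captured from: /usr/sbin/authck -p 2>&1
    case "$1" in
        # Message quoted in this thread for a system that is not Trusted.
        *"cannot open Protected Password hierarchy"*) echo "not trusted" ;;
        # Nothing captured: do not guess.
        "") echo "unknown" ;;
        # Assumption: any other output is the hierarchy listing.
        *) echo "trusted" ;;
    esac
}

# Example use on a live system:
#   trusted_status "$(/usr/sbin/authck -p 2>&1)"
```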


There are two remedies to this issue.

1) Modify the ident entry in /etc/inetd.conf to operate as root.

For 11.11, change this:
ident stream tcp wait bin /usr/lbin/identd identd
to this:
ident stream tcp wait root /usr/lbin/identd identd


For 11.23, change this:
auth stream tcp wait bin /usr/lbin/identd identd
to this:
auth stream tcp wait root /usr/lbin/identd identd


After saving the change, remember to run 'inetd -c' so that inetd
re-reads its configuration.


2) Install a patch to permit inetd-called services to operate in a Trusted
environment without changing the owner of the service.

For 11.11: PHCO_30912 s700_800 11.00 libsec cumulative patch
For 11.23: PHCO_32794 s700_800 11.23 libsec cumulative patch
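
Remedy 1 above can be sketched as a small filter. This is an illustration under assumptions, not a supported procedure: the function name `identd_as_root` is hypothetical, it expects the entry layout shown in this post (service name in field 1, user in field 5), and it writes the modified file to standard output instead of editing /etc/inetd.conf in place.

```shell
#!/bin/sh
# Sketch only: print an inetd.conf-style file with the identd entry
# switched from user "bin" to user "root". The input file is not modified.
identd_as_root() {
    # $1 = path to an inetd.conf file (normally /etc/inetd.conf)
    awk '
        # Active entry: field 1 is the service name (ident on 11.11,
        # auth on 11.23), field 5 is the user, field 6 the server path.
        ($1 == "ident" || $1 == "auth") && $5 == "bin" && $6 ~ /identd$/ {
            $5 = "root"   # awk rejoins this line with single spaces
        }
        { print }         # every other line passes through unchanged
    ' "$1"
}

# Example use on a live system (review the result before installing it):
#   identd_as_root /etc/inetd.conf > /tmp/inetd.conf.new
```

Commented-out entries are left alone because their first field starts with '#'. After the reviewed file is put in place, 'inetd -c' is still needed, as noted above.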
Mark Harshman_1
Regular Advisor

Re: broke my cluster??

I am running Trusted Systems. I changed the inetd.conf entry as mentioned and re-read the configuration. I already have patch PHCO_30913 installed, which I believe is the 11.11 libsec cumulative patch. I am still having the same issue. I can bypass the problem by adding the "-i" parameter to the inetd entries for hacl, but would prefer not to.
Stephen Doud
Honored Contributor

Re: broke my cluster??

Install PHNE_26305 - s700_800 11.11 sendmail(1m) 8.9.3 patch