Operating System - HP-UX

Wilder Mellotto
Frequent Advisor

ServiceGuard startup problem

Hi,

I have a 3-node cluster that has been running for about 3 months without any problems. The 3 nodes have been powered off and on again more than once without trouble.

Now, when I try to form the cluster, these errors appear:

Error: Error performing security validation. Please verify that identd is running properly.
Internal error: Unable to open communications to configuration daemon: Software caused connection abort
Error: Unable to connect to configuration database.
Internal error: Unable to open communications to configuration daemon: No such file or directory


Then I tried to check my configuration:

[osiris:root]/etc/cmcluster > cmcheckconf -v -C /etc/cmcluster/peixoto

Checking cluster file: /etc/cmcluster/peixoto
Note : a NODE_TIMEOUT value of 2000000 was found in line 67. For a
significant portion of installations, a higher setting is more appropriate.
Refer to the comments in the cluster configuration ascii file or Serviceguard
manual for more information on this parameter.
Checking nodes ... Done
Checking existing configuration ... Done
Gathering configuration information ... Done
Gathering configuration information ... Done
Gathering configuration information ..
Gathering storage information ..
Found 55 devices on node osiris
Analysis of 55 devices should take approximately 6 seconds
0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%
Found 14 volume groups on node osiris
Analysis of 14 volume groups should take approximately 1 seconds
0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%
.....
Gathering Network Configuration ........ Done

Error: Error performing security validation. Please verify that identd is running properly.
Error: Unable to connect to node isis: Software caused connection abort
Error: Error performing security validation. Please verify that identd is running properly.
Error: Unable to connect to node anubis: Software caused connection abort
Warning: Not probing node anubis as it is currently unreachable.
This may cause network partitions to be reported.
Warning: Not probing node isis as it is currently unreachable.
This may cause network partitions to be reported.
cmcheckconf : Unable to reconcile configuration file /etc/cmcluster/peixoto
with discovered configuration information.

Please, can you help me?
Marcel Boogert_1
Trusted Contributor

Re: ServiceGuard startup problem

Hi there,

Check this document and see if it applies:
http://docs.hp.com/en/5991-1101/ch05s09.html

Regards, MB.
Chan 007
Honored Contributor

Re: ServiceGuard startup problem

HI,

Try .rhosts for root on all 3 servers.

Chan
Wilder Mellotto
Frequent Advisor

Re: ServiceGuard startup problem

Some more information:

server: OSIRIS
-----------------------------------
INTERFACE_NAME[0]=lan1
IP_ADDRESS[0]=95.1.1.1
SUBNET_MASK[0]=255.0.0.0
BROADCAST_ADDRESS[0]=95.255.255.255
INTERFACE_STATE[0]=up
DHCP_ENABLE[0]=0

INTERFACE_NAME[1]=lan2
IP_ADDRESS[1]=96.1.1.1
SUBNET_MASK[1]=255.0.0.0
BROADCAST_ADDRESS[1]=96.255.255.255
INTERFACE_STATE[1]=up
DHCP_ENABLE[1]=0
===================================







server: ISIS
------------------------------------
INTERFACE_NAME[0]=lan0
IP_ADDRESS[0]=96.1.1.2
SUBNET_MASK[0]=255.0.0.0
BROADCAST_ADDRESS[0]=96.255.255.255
INTERFACE_STATE[0]=up
DHCP_ENABLE[0]=0

INTERFACE_NAME[1]=lan1
IP_ADDRESS[1]=95.1.1.2
SUBNET_MASK[1]=255.0.0.0
BROADCAST_ADDRESS[1]=95.255.255.255
INTERFACE_STATE[1]=up
DHCP_ENABLE[1]=0
===================================







server: ANUBIS
-----------------------------------
INTERFACE_NAME[0]=lan1
IP_ADDRESS[0]=96.1.1.25
SUBNET_MASK[0]=255.0.0.0
BROADCAST_ADDRESS[0]=96.255.255.255
INTERFACE_STATE[0]=up
DHCP_ENABLE[0]=0

INTERFACE_NAME[1]=lan2
IP_ADDRESS[1]=95.1.1.25
SUBNET_MASK[1]=255.0.0.0
BROADCAST_ADDRESS[1]=95.255.255.255
INTERFACE_STATE[1]=up
DHCP_ENABLE[1]=0


I created $HOME/.rhosts on all 3 servers with this:

anubis root
isis root
osiris root


My /etc/hosts is the same on all 3 servers. When I run nslookup on an IP address or hostname, all servers resolve OK.

I created /etc/cmcluster/cmclnodelist on all 3 servers with this:

anubis root
isis root
osiris root


linkloop between servers works.
rlogin works.
Greg Vaidman
Respected Contributor

Re: ServiceGuard startup problem

Did you recently upgrade Serviceguard (or apply a patch)?

Or have you recently turned off identd in /etc/inetd.conf?

There was a change going from 11.14 to 11.15 (or 11.15 to 11.16, I don't remember exactly): identd is required with the new version. You can either enable identd on all your cluster nodes, or turn off this requirement with an option in your cluster config file.
Sameer_Nirmal
Honored Contributor

Re: ServiceGuard startup problem

Check for the "auth" or "identd" entries in the
/etc/services and /etc/inetd.conf files.
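
The check above can be sketched as a small script. This is only an illustration: has_entry and check_ident_files are hypothetical helper names, and it assumes the ident service is listed under the name "auth" or "ident" at the start of an uncommented line, which may differ on your system.

```shell
#!/bin/sh
# has_entry (hypothetical helper): succeeds if file $2 has an
# uncommented line whose first field is the service name $1.
has_entry() {
    grep -v '^[[:space:]]*#' "$2" | grep -q "^[[:space:]]*$1[[:space:]]"
}

# check_ident_files (hypothetical helper): reports, for each file,
# whether an active auth/ident entry was found.
# Arguments are optional and default to the usual locations.
check_ident_files() {
    services=${1:-/etc/services}
    inetd_conf=${2:-/etc/inetd.conf}
    for f in "$services" "$inetd_conf"; do
        if has_entry auth "$f" || has_entry ident "$f"; then
            echo "$f: ident/auth entry present"
        else
            echo "$f: ident/auth entry missing or commented out"
        fi
    done
}
```

Running check_ident_files with no arguments on each cluster node gives a quick view of whether the entry was commented out on any of them.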

Wilder Mellotto
Frequent Advisor

Re: ServiceGuard startup problem

No patches were applied.

I did not change /etc/services or /etc/inetd.conf.

My cluster was up until yesterday. When I ran cmhaltcl -f, I got an error stopping the cluster with a VG still active (lock_vg and lock_pv). Then I manually ran:
vgchange -a n vg25
shutdown -hy 0 (all servers)
Andrew Edenburn
New Member

Re: ServiceGuard startup problem

I ran into this error last week. Here is a workaround that will allow you to run

1. vi /etc/inetd.conf
2. Modify the line:
hacl-cfg stream tcp nowait root /usr/lbin/cmclconfd cmclconfd -c

to be:

hacl-cfg stream tcp nowait root /usr/lbin/cmclconfd cmclconfd -c -i

This allows Serviceguard to skip the identd check.

You will then be able to see the cluster and its packages. I have seen this on RP7410s and N4000s so far.
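
For anyone who prefers not to hand-edit the file, the change above could be scripted roughly as follows. This is a sketch only: add_ident_bypass is a hypothetical helper name, and it assumes the hacl-cfg stream line ends in exactly "cmclconfd -c" as quoted above. It prints the modified file to standard output rather than editing in place, so the result can be reviewed first.

```shell
#!/bin/sh
# add_ident_bypass (hypothetical helper): print the given inetd.conf
# with " -i" appended to any hacl-cfg line ending in "cmclconfd -c".
# Lines that already end in "-c -i", and other hacl-cfg lines,
# are left unchanged.
add_ident_bypass() {
    # $1 = path to inetd.conf (or to a copy of it)
    sed '/^hacl-cfg[[:space:]]/ s/cmclconfd -c[[:space:]]*$/cmclconfd -c -i/' "$1"
}
```

One way to use it: copy /etc/inetd.conf to a backup, run add_ident_bypass on the backup with the output redirected to a new file, compare the two with diff, and only then install the new file on each node and make inetd reread its configuration with inetd -c.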
IT_2007
Honored Contributor

Re: ServiceGuard startup problem

Check /etc/inetd.conf for the HA entries. The hacl-cfg line should look like this:

hacl-cfg stream tcp nowait root /usr/lbin/cmclconfd cmclconfd -c -i
nanan
Trusted Contributor

Re: ServiceGuard startup problem

After changing inetd.conf on all of the systems,
execute "inetd -c", then try again.

inetd.conf :
hacl-cfg stream tcp nowait root /usr/lbin/cmclconfd cmclconfd -c -i
rariasn
Honored Contributor

Re: ServiceGuard startup problem

More files,

#grep -i hacl /etc/services
hacl-hb 5300/tcp # High Availability (HA) Cluster heartbeat
hacl-gs 5301/tcp # HA Cluster General Services
hacl-cfg 5302/tcp # HA Cluster TCP configuration
hacl-cfg 5302/udp # HA Cluster UDP configuration
hacl-probe 5303/tcp # HA Cluster TCP probe
hacl-probe 5303/udp # HA Cluster UDP probe
hacl-local 5304/tcp # HA Cluster Commands
hacl-test 5305/tcp # HA Cluster Test
hacl-dlm 5408/tcp # HA Cluster distributed lock manager

rgs,

ran
Wilder Mellotto
Frequent Advisor

Re: ServiceGuard startup problem

Thanks for all responses.

I found my problem. I don't know why, but my cluster is now working after a storage reboot. I power-cycled my 3 servers and my storage, and now everything is OK.

My other problem was my lock disk. One disk had stopped responding on one path:
c50t1d1 - primary
c60t1d1 - alternate

diskinfo, dd and all other commands issued against c50 worked, but against c60 they did not.

I opened a call with EMC to diagnose my storage. Nothing was found, but my disk is now responding to all commands after the storage reboot.
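
A per-path read test like the one described can be sketched as below. The helper names can_read and check_paths are hypothetical, and the device file names in the usage note are assumptions based on the path names above. Note that a read against a hung path may block rather than fail, so this is a rough check, not a diagnosis.

```shell
#!/bin/sh
# can_read (hypothetical helper): succeeds if dd can read one
# 1 KB block from the given path.
can_read() {
    dd if="$1" of=/dev/null bs=1024 count=1 2>/dev/null
}

# check_paths (hypothetical helper): print OK or FAILED for each
# path given on the command line.
check_paths() {
    for p in "$@"; do
        if can_read "$p"; then
            echo "$p: OK"
        else
            echo "$p: FAILED"
        fi
    done
}
```

For example, something like check_paths /dev/rdsk/c50t1d1 /dev/rdsk/c60t1d1 would test the primary and alternate paths in turn, assuming those are the raw device files for the lock disk.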

Thanks again.