jamshed
Frequent Advisor

cmrunnnode

After editing and applying the cluster ASCII file for the new node, the following message appeared:

# cmrunnode -v hp02
cmrunnode : Validating network configuration...
Gathering configuration information ........ Done
Not probing node hp02 as it is currently unreachable.
cmrunnode : Network validation complete
Unable to determine operating system version of TTMobile. Either cmclconfd was unable to run or the access is denied to this node in the security files (cmclnodelist or .rhosts) on the nodes that you are trying to start cluster on.
T G Manikandan
Honored Contributor

Re: cmrunnnode

Make sure the second node's /.rhosts file contains an entry for the first node, like:

root

generic_1
Respected Contributor

Re: cmrunnnode

Check that cmclnodelist is correct.

Compare what is in /etc/hosts with what is in DNS to make sure both supply the same fully qualified domain names and simple hostnames. You can verify what is being returned with netstat -i.

You may want to add root to /etc/cmcluster/cmclnodelist to beef up security.

Also check the .rhosts file on the nodes, as mentioned above.

It would not hurt to check over your resolv.conf file.
You can test it all out with a cmquerycl -v -C /etc/cmcluster/clusterfile.ascii -n nodea -n yournodeb

Happy Clustering
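
The /etc/hosts part of the check above can be sketched roughly as follows. This is only an illustration: the sample hosts text, the example.com domain, and the function names are made up for the example, and it looks at /etc/hosts-style text only, not at DNS.

```python
# Hypothetical sample; replace with the real contents of /etc/hosts.
SAMPLE_HOSTS = """\
192.168.1.1  hp01.example.com hp01
192.168.1.2  hp02    # no fully qualified name on this line
"""


def parse_hosts(text):
    """Map each name found in hosts-style text to (ip, names on that line)."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            entries.setdefault(name, (ip, names))
    return entries


def nodes_missing_fqdn(text, nodes):
    """Return nodes that are absent or listed without a 'node.domain' alias."""
    entries = parse_hosts(text)
    missing = []
    for node in nodes:
        entry = entries.get(node)
        if entry is None or not any(n.startswith(node + ".") for n in entry[1]):
            missing.append(node)
    return missing


print(nodes_missing_fqdn(SAMPLE_HOSTS, ["hp01", "hp02"]))
```

Running the same check against the hosts file of each node would show whether both nodes list the fully qualified and simple names together.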
jamshed
Frequent Advisor

Re: cmrunnnode

After verifying the cluster configuration,
the following message appeared:

Error: Detected a partition of IP subnet 192.168.1.0.
Partition 1
hp01 lan0
Partition 2
hp02 lan0
Failed to evaluate network
G. Vrijhoeven
Honored Contributor

Re: cmrunnnode

Hi,

Could it be that you changed the subnet mask on one of your interfaces in the netconf file?
Can you remsh to another server?


Gideon
Carsten Krege
Honored Contributor

Re: cmrunnnode

This message indicates that there is a problem with IP connectivity. You should use ping on each node to verify that the IP addresses in subnet 192.168.1.0 can talk to each other. Also check that your routing is set up properly (netstat -rn).

Carsten
-------------------------------------------------------------------------------------------------
In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move. -- HhGttG
jamshed
Frequent Advisor

Re: cmrunnnode

Here is the output from both nodes.

# netstat -rn
Routing tables
Destination      Gateway        Flags  Refs  Interface  Pmtu
127.0.0.1        127.0.0.1      UH     0     lo0        4136
1.1.17.109       1.1.17.109     UH     0     lan1       4136
192.168.1.2      192.168.1.2    UH     0     lan0       4136
192.168.1.0      192.168.1.2    U      2     lan0       1500
1.1.17.0         1.1.17.109     U      2     lan1       1500
192.168.100.0    1.1.17.40      UG     0     lan1       0
127.0.0.0        127.0.0.1      U      0     lo0        0
default          1.1.17.250     UG     0     lan1       0
# remsh hp01
Please wait...checking for disk quotas
You have mail.

Value of TERM has been set to "dtterm".
WARNING: YOU ARE SUPERUSER !!

# netstat -rn
Routing tables
Destination      Gateway        Flags  Refs  Interface  Pmtu
127.0.0.1        127.0.0.1      UH     0     lo0        4136
1.1.17.100       1.1.17.100     UH     0     lan1:1     4136
1.1.17.108       1.1.17.108     UH     0     lan1       4136
192.168.1.1      192.168.1.1    UH     0     lan0       4136
192.168.1.0      192.168.1.1    U      2     lan0       1500
1.1.17.0         1.1.17.108     U      3     lan1       1500
1.1.17.0         1.1.17.100     U      3     lan1:1     1500
1.1.18.0         1.1.17.15      UG     0     lan1       0
192.168.100.0    1.1.17.40      UG     0     lan1       0
127.0.0.0        127.0.0.1      U      0     lo0        0
default          1.1.17.250     UG     0     lan1       0
#
G. Vrijhoeven
Honored Contributor

Re: cmrunnnode

Hi again,

Could you please post the output of:
# cmcheckconf -v -C /etc/cmcluster/cmclconfig.ascii
(or whichever ASCII file you are using)
Maybe this will give us some more info.

Gideon
jamshed
Frequent Advisor

Re: cmrunnnode

# cmcheckconf -k -v -C /etc/cmcluster/cluster.asc

Checking cluster file: /etc/cmcluster/cluster.asc
Checking nodes ... Done
Checking existing configuration ... Done
Gathering configuration information ... Done
Gathering configuration information ... Done
Gathering configuration information ............ Done

Warning: The disk at /dev/dsk/c5t0d0 on node hp02 does not have an ID, or a disk label.
Warning: Disks which do not have IDs cannot be included in the topology description.
Use pvcreate(1M) to initialize a disk for LVM or,
use vxdiskadm(1M) to initialize a disk for VxVM.
Internal error: Number of volume groups sent (5) doesn't match (4) reply from node hp02
Error: Unable to determine lvm configuration: failed to receive lvm query reply from node hp02
Error: Detected a partition of IP subnet 192.168.1.0.
Partition 1
hp01 lan0
Partition 2
hp02 lan0
Failed to evaluate network
cmcheckconf : Unable to reconcile configuration file /etc/cmcluster/cluster.asc with discovered configuration information.
G. Vrijhoeven
Honored Contributor

Re: cmrunnnode

Hi,

Looks like hp01 is not able to access hp02.

Are you able to remsh from hp01 to hp02?
Is your subnet mask the same on both servers?
Sorry for repeating myself, but I think that is where we will find the solution.

Gideon
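
One way to make the subnet mask question concrete is to compute the network each lan0 address falls into and compare the results. The sketch below is only an illustration: the addresses are the lan0 addresses shown in the routing tables above, while the mask values are placeholders to be replaced with what "ifconfig lan0" reports on each node. The function names are made up for the example.

```python
import ipaddress


def network_of(ip, mask):
    """Return the network that an address/mask pair belongs to."""
    return ipaddress.ip_interface(f"{ip}/{mask}").network


def same_subnet(ip_a, mask_a, ip_b, mask_b):
    """True only if both interfaces compute the identical network."""
    return network_of(ip_a, mask_a) == network_of(ip_b, mask_b)


# Placeholder masks (assumptions); substitute the values from each node.
print(same_subnet("192.168.1.1", "255.255.255.0",
                  "192.168.1.2", "255.255.255.0"))
print(same_subnet("192.168.1.1", "255.255.0.0",
                  "192.168.1.2", "255.255.255.0"))
```

If the two nodes disagree about the network, that would be one possible reason for them to see lan0 as two partitions, though it is not the only one.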
jamshed
Frequent Advisor

Re: cmrunnnode

# remsh hp01
Please wait...checking for disk quotas
You have mail.



Value of TERM has been set to "ansi".
WARNING: YOU ARE SUPERUSER !!

#
Carsten Krege
Honored Contributor

Re: cmrunnnode

Pretty weird. To me it looks like some messages go through and others don't. You should check a few things:

- that you are using a recent SG version; there are known problems on SG A.11.09 at obsolete patch levels.

- on hp01: # ping -n 20 192.168.1.2
on hp02: # ping -n 20 192.168.1.1

check for packet loss.

- as suggested before, check for identical subnet mask in "ifconfig lan0" output on both nodes

- check /etc/inetd.conf for correct entries for the cmclconfd lines

- check /var/adm/inetd.sec that messages to port 5302 (hacl-cfg) can pass

- make sure that there is no firewall blocking SG traffic

- check for duplicate IP addresses, especially for 192.168.1.2! E.g. check the nettl logs, or shut down the server and check whether the IP address still responds.

Carsten
Stephen Doud
Honored Contributor

Re: cmrunnnode

Let's look at each error, and address them in order (because perhaps solving the first error may also correct subsequent errors):

Internal error: Number of volume groups sent (5) doesn't match (4) reply from node hp02
Error: Unable to determine lvm configuration: failed to receive lvm query reply from node hp02
Error: Detected a partition of IP subnet 192.168.1.0.
Partition 1
hp01 lan0
Partition 2
hp02 lan0
Failed to evaluate network
cmcheckconf : Unable to reconcile configuration file /etc/cmcluster/cluster.asc with discovered configuration information.

----------
The "internal error" issue is addressed with a patch for SG version A.11.13
and A.11.14: PHSS_28394 (11.13) and PHSS_27246 (11.14).

"To avoid this behavior, HP recommends not using the '-k' option for cmcheckconf(1M) and cmapplyconf(1M) if all volume groups specified in the cluster ascii file are not imported on all cluster nodes."

What version of Serviceguard is loaded on your cluster?
To verify the version, try this command:
# what /usr/lbin/cmcld | grep Date

You may get something like the following back:
A.11.14 Date: 01/15/2003; PATCH: PHSS_27725

Note the patch level is also listed in this example.
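
If you need to collect this from several nodes, a line of that form can be split into version, date and patch with a small script. This is a rough sketch that assumes the output looks exactly like the example line above; other Serviceguard releases may format it differently, and the function name is made up for the illustration.

```python
import re

# Pattern modelled on the single example line above (an assumption).
WHAT_LINE = re.compile(
    r"(?P<version>[A-Z]\.\d+\.\d+)\s+"
    r"Date:\s*(?P<date>\d{2}/\d{2}/\d{4});\s*"
    r"PATCH:\s*(?P<patch>PHSS_\d+)"
)


def parse_what_line(line):
    """Return a dict with version, date and patch, or None if no match."""
    match = WHAT_LINE.search(line)
    return match.groupdict() if match else None


info = parse_what_line("A.11.14 Date: 01/15/2003; PATCH: PHSS_27725")
print(info)
```

Comparing the parsed patch value from each node makes it easy to spot a node that is running at a different patch level.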

Now to the second error message:
Error: Unable to determine lvm configuration: failed to receive lvm query reply from node hp02

This error message is actually part of the previous problem (per the patch details).

Now the third error message:
Error: Detected a partition of IP subnet 192.168.1.0

The release notes for both A.11.13 and A.11.14 discuss this situation:

"What is the problem? The cmquerycl, cmcheckconf, and cmapplyconf commands may fail with complaints about a network partition. This problem is especially likely when using FibreChannel networks.

What is the workaround? Retry the command. If the problem persists, validate the state of cluster networking by using the lanscan and netstat commands. After networking problems have been resolved, use the cmquerycl, cmcheckconf or cmapplyconf command again."


You also attached a "netconf" from "hp02".

In it, the same subnet mask is given for two very different IP addresses:

INTERFACE_NAME[0]=lan0
IP_ADDRESS[0]=192.168.1.2
SUBNET_MASK[0]=255.255.255.0

INTERFACE_NAME[1]=lan1
IP_ADDRESS[1]=1.1.17.109
SUBNET_MASK[1]=255.255.255.0

With such different IP addresses assigned, perhaps your subnet masks should be different?

Using a website such as the one that follows, you can calculate a typical subnet mask.
For the IP of 1.1.17.109, the default (classful) subnet mask is 255.0.0.0.

http://www.telusplanet.net/public/sparkman/netcalc.htm
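
The same default mask can be worked out without a website. The sketch below applies the traditional class A/B/C rule based on the first octet; it is only an illustration of that rule (the function name is made up), and a site that subnets its address space may legitimately use a longer mask than the class default.

```python
def classful_mask(ip):
    """Return the default class A/B/C mask for a dotted IPv4 address."""
    first_octet = int(ip.split(".")[0])
    if not 0 <= first_octet <= 255:
        raise ValueError(f"not a valid IPv4 address: {ip}")
    if first_octet < 128:   # class A: 0-127
        return "255.0.0.0"
    if first_octet < 192:   # class B: 128-191
        return "255.255.0.0"
    if first_octet < 224:   # class C: 192-223
        return "255.255.255.0"
    raise ValueError(f"{ip} is class D/E and has no default mask")


for address in ("1.1.17.109", "192.168.1.2"):
    print(address, classful_mask(address))
```

For the two addresses in the netconf file this gives 255.0.0.0 for 1.1.17.109 and 255.255.255.0 for 192.168.1.2.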

-sd-