Operating System - Linux

Problem with network config

 
Pramod_4
Trusted Contributor

Problem with network config

I am trying to create a two-node cluster on RHEL AS 3 with MC/SG 11.15. I get the following error from cmcheckconf:
-----------------------------------------------
# cmcheckconf -v -C TEST_cluster.ascii

Checking cluster file: TEST_cluster.ascii
Note : a NODE_TIMEOUT value of 2000000 was found in line 111. For a
significant portion of installations, a higher setting is more appropriate.
Refer to the comments in the cluster configuration ascii file or ServiceGuard
manual for more information on this parameter.
Checking nodes ... Done
Checking existing configuration ... Done
Warning: Can not find configuration for cluster DDE
Gathering configuration information ... Done
Gathering configuration information ..
Gathering storage information ..
Found 1 devices on node carol
Found 1 devices on node marsha
Analysis of 2 devices should take approximately 1 seconds
0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%
.
Gathering Network Configuration ........ Done

Error: Unable to determine network configuration: failed to receive net probe reply from node carol
Failed to probe network
Warning: Network interface eth0 on node marsha couldn't talk to itself.
Error: Detected a partition of IP subnet 10.42.2.0.
Partition 1
carol eth0
Partition 2
marsha eth0
Failed to evaluate network
cmcheckconf : Unable to reconcile configuration file TEST_cluster.ascii
with discovered configuration information.
#
----------------------------------------------

Any ideas?


Thanks,
Pramod

11 REPLIES 11
Serviceguard for Linux
Honored Contributor

Re: Problem with network config

One of the most common causes is a firewall. First turn the firewall off entirely and run cmcheckconf again. If that doesn't help, post TEST_cluster.ascii. More questions may follow after that, but they'll be based on solid information.
Pramod_4
Trusted Contributor

Re: Problem with network config

In fact, the cmcheckconf output shown above was captured after turning off iptables and unloading the iptables module from the running kernel.

TEST_cluster.ascii is attached.

Any other thoughts?
Steven E. Protter
Exalted Contributor

Re: Problem with network config

Are all the ip addresses set without conflict on the two nodes?

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Pramod_4
Trusted Contributor

Re: Problem with network config

There is no IP address conflict. I can SSH to both nodes without any problem, and the nodes can SSH to each other as well.

Serviceguard for Linux
Honored Contributor

Re: Problem with network config

One point and something else to try to get more info.

First, it is odd (but not necessarily a problem) that the device identified as the CLUSTER_LOCK_LUN is different on the two nodes. Most clusters would have the same device name on both. Are these really the same physical LUN?

Try "cmquerycl -n node1 -C filename" on node1
then try it on node2. This helps determine the source of communication problems.
Pramod_4
Trusted Contributor

Re: Problem with network config

As you can see in the cmcheckconf output, it didn't complain about the disk, only about the network.

Yes, they are the same physical EMC device, referenced by different device files on the two nodes.

I ran cmcheckconf on both nodes and the error is exactly the same.

Any other thoughts?

melvyn burnard
Honored Contributor

Re: Problem with network config

As per the following:
Error: Unable to determine network configuration: failed to receive net probe reply from node carol
Failed to probe network
Warning: Network interface eth0 on node marsha couldn't talk to itself.
Error: Detected a partition of IP subnet 10.42.2.0.
Partition 1
carol eth0
Partition 2
marsha eth0

This is usually an indication of a physical networking issue, not a Serviceguard issue.
It could have any one of a number of causes.
Is there a router between the two nodes?
Is the subnet mask the same?
Is the Broadcast address the same?
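
For anyone wanting to check the mask and broadcast questions above quickly, here is a minimal Python sketch using the standard ipaddress module. The host addresses and prefix lengths are hypothetical (the thread only gives the subnet 10.42.2.0), so substitute whatever ifconfig reports for eth0 on each node. It only compares the configured values; it cannot tell you anything about cabling or routers.

```python
import ipaddress


def describe(cidr):
    """Return address, netmask, network and broadcast for an 'ip/prefix' string."""
    iface = ipaddress.ip_interface(cidr)
    net = iface.network
    return {
        "address": str(iface.ip),
        "netmask": str(net.netmask),
        "network": str(net.network_address),
        "broadcast": str(net.broadcast_address),
    }


def compare(cidr_a, cidr_b):
    """List the settings (netmask, network, broadcast) that differ between two interfaces."""
    a, b = describe(cidr_a), describe(cidr_b)
    return [key for key in ("netmask", "network", "broadcast") if a[key] != b[key]]


if __name__ == "__main__":
    # Hypothetical eth0 settings; replace with the real values from each node.
    nodes = {"carol": "10.42.2.11/24", "marsha": "10.42.2.12/24"}
    differences = compare(nodes["carol"], nodes["marsha"])
    if differences:
        print("Mismatched settings:", ", ".join(differences))
    else:
        print("Netmask, network and broadcast address match on both nodes.")
```

An empty list from compare() means the two interfaces agree on all three values. A mismatch, for example one node configured with a different mask, would be worth fixing before running cmcheckconf again.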

My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Pramod_4
Trusted Contributor

Re: Problem with network config

Melvyn,

The NICs on both nodes are on the same subnet and have the same broadcast address.

I have even tried a crossover connection between the two nodes, but the result is the same.

The NIC is a Broadcom BCM5703 and the driver is tg3.

Any other thoughts?


Serviceguard for Linux
Honored Contributor

Re: Problem with network config

I'd suggest that you "triple check" that the firewall is off. Temporarily turn on something like telnet and see if it works.

I didn't see any specific response to my request that you run cmquerycl on the individual nodes. Did you do that? It can help identify whether a node has "permission" to talk to itself over the network. Don't rely on just ssh working.
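
If telnet isn't handy, a rough equivalent of the "turn on something like telnet and see if it works" test can be sketched in Python. This is only an assumption-laden illustration: the port numbers (22 for ssh, 23 for telnet) are the standard service ports, not the ports Serviceguard itself uses, and the function simply reports whether a TCP connection can be opened. Running it on each node against the node's own hostname as well as the other node gives a crude check of whether a node can talk to itself over the network.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Node names are from the thread; the ports are the usual ssh and telnet
    # ports, chosen here only as examples of services to probe.
    for host in ("carol", "marsha"):
        for port in (22, 23):
            state = "open" if can_connect(host, port) else "blocked or closed"
            print(f"{host}:{port} {state}")
```

A port that ssh can reach but another service cannot would suggest filtering is still active somewhere, which is why ssh working on its own doesn't prove much.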