Fenny_1
Super Advisor

Cmcheckconf giving some network errors

Hi,
I am trying to verify my cluster configuration with cmcheckconf. It finished successfully when I ran it almost a month ago, but now it is giving me many errors.

ppora01:/#cmcheckconf -v -C /etc/cmcluster/cluster.ascii

Checking cluster file: /etc/cmcluster/cluster.ascii
Checking nodes ... Done
Checking existing configuration ... Done
Gathering configuration information ... Done
Gathering configuration information ... Done
Gathering configuration information ..
Gathering storage information ..
Found 294 devices on node tppovw01
Found 208 devices on node tppovw02
Found 358 devices on node tppora01
Found 430 devices on node tppora03
Found 330 devices on node tppora02
Found 330 devices on node tppora04
Found 142 devices on node tppora05
Found 218 devices on node tppora06
Analysis of 2310 devices should take approximately 29 seconds
0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%
Found 30 volume groups on node tppovw01
Found 22 volume groups on node tppovw02
Found 23 volume groups on node tppora01
Found 20 volume groups on node tppora03
Found 17 volume groups on node tppora02
Found 17 volume groups on node tppora04
Found 21 volume groups on node tppora05
Found 27 volume groups on node tppora06
Analysis of 177 volume groups should take approximately 1 seconds
0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%
.....
Gathering Network Configuration .............. Done

Error: tppora01 lan3 can communicate with tppora05 lan0 over subnet 10.1.7.0
on the IP level, but not on the DLPI level.
There is possibly a network component between the two interfaces
that does not allow any data link level traffic through, which violates
a Serviceguard requirement.
Error: tppora02 lan3 can communicate with tppora05 lan0 over subnet 10.1.7.0
on the IP level, but not on the DLPI level.
There is possibly a network component between the two interfaces
that does not allow any data link level traffic through, which violates
a Serviceguard requirement.
Error: tppora03 lan3 can communicate with tppora05 lan0 over subnet 10.1.7.0
on the IP level, but not on the DLPI level.
There is possibly a network component between the two interfaces
that does not allow any data link level traffic through, which violates
a Serviceguard requirement.
Error: tppora04 lan3 can communicate with tppora05 lan0 over subnet 10.1.7.0
on the IP level, but not on the DLPI level.
There is possibly a network component between the two interfaces
that does not allow any data link level traffic through, which violates
a Serviceguard requirement.
Error: tppora06 lan0 can communicate with tppora05 lan0 over subnet 10.1.7.0
on the IP level, but not on the DLPI level.
There is possibly a network component between the two interfaces
that does not allow any data link level traffic through, which violates
a Serviceguard requirement.
Error: tppovw01 lan0 can communicate with tppora05 lan0 over subnet 10.1.7.0
on the IP level, but not on the DLPI level.
There is possibly a network component between the two interfaces
that does not allow any data link level traffic through, which violates
a Serviceguard requirement.
Error: tppovw02 lan0 can communicate with tppora05 lan0 over subnet 10.1.7.0
on the IP level, but not on the DLPI level.
There is possibly a network component between the two interfaces
that does not allow any data link level traffic through, which violates
a Serviceguard requirement.
Error: Non-uniform connections detected,
tppora01 lan3 10.1.7.10 successfully received from tppora05 lan0 10.1.7.71
but tppora05 lan0 10.1.7.71 did not receive from tppora01 lan3 10.1.7.10.
This could be due to heavy network traffic, or heavy load on tppora01.
Error: Non-uniform connections detected,
tppora02 lan3 10.1.7.11 successfully received from tppora05 lan0 10.1.7.71
but tppora05 lan0 10.1.7.71 did not receive from tppora02 lan3 10.1.7.11.
This could be due to heavy network traffic, or heavy load on tppora02.
Error: Non-uniform connections detected,
tppora03 lan3 10.1.7.12 successfully received from tppora05 lan0 10.1.7.71
but tppora05 lan0 10.1.7.71 did not receive from tppora03 lan3 10.1.7.12.
This could be due to heavy network traffic, or heavy load on tppora03.
Error: Non-uniform connections detected,
tppora04 lan3 10.1.7.13 successfully received from tppora05 lan0 10.1.7.71
but tppora05 lan0 10.1.7.71 did not receive from tppora04 lan3 10.1.7.13.
This could be due to heavy network traffic, or heavy load on tppora04.
Error: Non-uniform connections detected,
tppora06 lan0 10.1.7.72 successfully received from tppora05 lan0 10.1.7.71
but tppora05 lan0 10.1.7.71 did not receive from tppora06 lan0 10.1.7.72.
This could be due to heavy network traffic, or heavy load on tppora06.
Error: Non-uniform connections detected,
tppovw01 lan0 10.1.7.20 successfully received from tppora05 lan0 10.1.7.71
but tppora05 lan0 10.1.7.71 did not receive from tppovw01 lan0 10.1.7.20.
This could be due to heavy network traffic, or heavy load on tppovw01.
Error: Non-uniform connections detected,
tppovw02 lan0 10.1.7.25 successfully received from tppora05 lan0 10.1.7.71
but tppora05 lan0 10.1.7.71 did not receive from tppovw02 lan0 10.1.7.25.
This could be due to heavy network traffic, or heavy load on tppovw02.
Failed to evaluate network
Error: Unable to probe cluster configuration. Unable to connect to quorum server at tppqrm01 from node tppora03: Connection timed out.
Check the syslog file for more information.
Error: Unable to probe cluster configuration. Unable to connect to quorum server at tppqrm01 from node tppora04: Connection timed out.
Check the syslog file for more information.
Error: Unable to probe cluster configuration. Unable to connect to quorum server at tppqrm01 from node tppovw02: Connection timed out.
Check the syslog file for more information.
cmcheckconf : Unable to reconcile configuration file /etc/cmcluster/cluster.ascii
with discovered configuration information.


Can anyone guide me on how to fix these network issues? All nodes can ping each other on all subnets.
Waiting for an urgent response.
Steven E. Protter
Exalted Contributor

Re: Cmcheckconf giving some network errors

Shalom,

There is possibly a network component between the two interfaces
that does not allow any data link level traffic through, which violates
a Serviceguard requirement.

It would appear there is either an inconsistency in the network configuration or in your configuration choices, or possibly an issue with the network infrastructure.

The message appears to be meaningful, which is like winning the lottery in Unix.

Have you made the required changes to the inetd.conf file to accommodate SG 11.16 and above?

What were the results of the cmquerycl step?

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Rita C Workman
Honored Contributor

Re: Cmcheckconf giving some network errors

...here's a couple thoughts I have.

On those connections to the quorum server errors:
Ensure you have QS up and properly running on the QS Server.
Then ensure that your networking person hasn't blocked port 1238, which the other nodes need in order to communicate with it. Try going on your QS Server and running
netstat -a | grep 1238
Make sure it shows it's established. If it doesn't (maybe only shows listen) then you have a communication issue.
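One caveat with a plain grep: it also matches ephemeral ports that merely contain the digits 1238 (61238, for example). A minimal sketch of a stricter check follows, assuming HP-UX style `netstat -an` output in which the port is appended to the address after a dot; the function name `count_qs_established` is made up for illustration.

```shell
# Count ESTABLISHED TCP connections whose local or remote port is exactly 1238.
# Reads netstat lines on stdin: proto recv-q send-q local remote state.
# Assumes the port follows the address after a dot, as in "host.1238".
count_qs_established() {
    awk '$6 == "ESTABLISHED" && ($4 ~ /\.1238$/ || $5 ~ /\.1238$/) { n++ }
         END { print n + 0 }'
}

# Usage, on a cluster node or on the QS Server:
#   netstat -an | count_qs_established
```

A result of 0 on a node that should be talking to the quorum server would point at the communication issue described above.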

On those other network errors...it might be again the network person has ports blocked.
Just a thought.....

Regards,
Rita
melvyn burnard
Honored Contributor

Re: Cmcheckconf giving some network errors

This indicates that you have a network device on your LAN that is allowing TCP traffic through but denying DLPI traffic. DLPI traffic has to be allowed; talk to your network team.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Fenny_1
Super Advisor

Re: Cmcheckconf giving some network errors

Below is the response I received from cmquerycl:

tppora01:/#cmquerycl -C /etc/cmcluster/cluster.ascii
Warning: -C option ignored because there is no -n or -c option.


Cluster Name Node Name
TP_PROD_ClUSTER
tppora01
tppora02
tppora03
tppora04
tppora05
tppora06
tppovw01
tppovw02

and for netstat

ppora01:/#netstat -a | grep 1238
tcp 0 0 tppora01.64779 tppqrm01.1238 ESTABLISHED
tcp 0 0 tppora01.55969 tppqrm01.1238 ESTABLISHED
tcp 0 0 tppora01.61238 tppora03.hacl-gs ESTABLISHED

How can I check whether the DLPI ports are blocked? Which port number does DLPI use?

Waiting for your replies?
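
DLPI works at the data link layer (layer 2), so there is no TCP or UDP port number to look for. What can be checked is which interface the failures have in common. The following is only a sketch: it tallies the peer interface named in each "IP level, but not on the DLPI level" error from cmcheckconf. The function names `dlpi_failures` and `dlpi_suspects` are invented for illustration, and the field positions assume the error wording shown earlier in this thread.

```shell
# Print "node iface peer_node peer_iface" for every DLPI-level failure
# found in cmcheckconf output read from stdin.
dlpi_failures() {
    awk '$1 == "Error:" && $4 == "can" && $5 == "communicate" && $6 == "with" {
             print $2, $3, $7, $8
         }'
}

# Count how often each peer interface appears in those failures,
# most frequent first.
dlpi_suspects() {
    dlpi_failures |
        awk '{ c[$3 " " $4]++ } END { for (k in c) print c[k], k }' |
        sort -rn
}

# Usage:
#   cmcheckconf -v -C /etc/cmcluster/cluster.ascii 2>&1 | dlpi_suspects
#
# A manual layer-2 test on HP-UX would then use lanscan and linkloop.
# The PPA number and the station address below are placeholders:
#   lanscan                              # on the suspect node: read the station address
#   linkloop -i <PPA> <station_address>  # on another node: test the link to it
```

For the output posted above, every failure names tppora05 lan0 as the peer, so that interface and the switch port it is connected to are the first things to examine.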
Sameer_Nirmal
Honored Contributor

Re: Cmcheckconf giving some network errors

Can you post the output of the command
# cmquerycl -n -n .. -C

Include every node that is part of the cluster in the above command, each with its own -n.

Run the ifconfig command on each of the network interfaces named in those messages, on the respective nodes.
Prashanth.D.S
Honored Contributor

Re: Cmcheckconf giving some network errors

Hi,

As recommended by Sameer, check the netmask configuration using the ifconfig command.

If it is not right, then correct it using

#ifconfig unplumb

Then verify...

Verify the file /etc/rc.config.d/netconf

Best Regards,
Prashanth
Stephen Doud
Honored Contributor

Re: Cmcheckconf giving some network errors

Create a new cluster configuration file:
# cmquerycl -C cluster.asc -n -n ...

Next, reconstitute a cluster configuration file from the cluster binary file:
# cmgetconf > cluster.asc.orig

Compare the two to see if any NICs are missing. If any are, investigate how their configuration has changed. For instance, if they are standby LAN NICs, verify the netmask is set to 00 instead of ff
(use the ifconfig lanX unplumb/plumb example shown above)
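
The comparison can be scripted. This is a minimal sketch, assuming both files use the usual NODE_NAME and NETWORK_INTERFACE keywords of a Serviceguard cluster ASCII file; the function names `list_nics` and `missing_nics` are invented for illustration.

```shell
# Print sorted "node interface" pairs from a cluster ASCII file.
# Comment lines are skipped naturally because their first field is "#".
list_nics() {
    awk '$1 == "NODE_NAME"         { node = $2 }
         $1 == "NETWORK_INTERFACE" { print node, $2 }' "$1" | sort
}

# Show NICs that are present in the first file but missing from the second.
missing_nics() {
    list_nics "$1" > "/tmp/nics_a.$$"
    list_nics "$2" > "/tmp/nics_b.$$"
    comm -23 "/tmp/nics_a.$$" "/tmp/nics_b.$$"
    rm -f "/tmp/nics_a.$$" "/tmp/nics_b.$$"
}

# Usage, with the two files created above:
#   missing_nics cluster.asc.orig cluster.asc
```

Any line printed is a NIC that the running cluster knows about but that cmquerycl no longer discovers.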