
Re: Should heartbeat subnet fail over?

 
SOLVED
Rudy Williams
Regular Advisor

Should heartbeat subnet fail over?

Hello--

I have three NICs in both nodes in a 2 node cluster (MC/SG 11.15) on HP-UX 11.11. lan0 on both nodes is connected via a cross-over cable for heartbeat (via HEARTBEAT_IP). That subnet is not monitored.

lan1 is configured as a heartbeat as well but is used for primary data traffic; this subnet is monitored. lan2 is the standby card.

If lan0 fails, cmcld tries to use lan2.

Jan 29 14:53:21 emroltp cmcld: lan0 failed
Jan 29 14:53:21 emroltp cmcld: Subnet 192.168.244.20 switched from lan0 to lan2
Jan 29 14:53:21 emroltp cmcld: lan0 switched to lan2

Is this normal? I thought that the subnet configured on lan1 was used for heartbeat, not that the subnet would switch to the standby NIC.

I don't want the subnet to switch, just SG to use the other subnet for heartbeat.
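For anyone who wants to pull these events out of a longer syslog, here is a minimal sketch. It assumes the cmcld message wording is exactly as in the three lines quoted above; the function name is made up for illustration.

```shell
#!/bin/sh
# Print "subnet from_lan to_lan" for every cmcld subnet switch message on stdin.
# Assumes the wording quoted above:
#   "cmcld: Subnet <addr> switched from <lanX> to <lanY>"
# subnet_switches is a hypothetical helper name, not a Serviceguard command.
subnet_switches() {
    sed -n 's/.*cmcld: Subnet \([0-9.]*\) switched from \(lan[0-9]*\) to \(lan[0-9]*\).*/\1 \2 \3/p'
}

# Usage (the usual HP-UX syslog location; adjust if yours differs):
#   subnet_switches < /var/adm/syslog/syslog.log
```

Only the "Subnet ... switched from ... to ..." lines are matched, so the "lan0 failed" and "lan0 switched to lan2" lines are ignored.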

cjw
11 REPLIES
Sridhar Bhaskarla
Honored Contributor

Re: Should heartbeat subnet fail over?

Hi,

If there is no physical connectivity (bridging) between lan0 and lan2, Serviceguard shouldn't consider lan2 a standby interface for lan0.

Can you post your cluster configuration file (with dummy IPs) and the outputs of 'netstat -i' and 'lanscan'?

This may be a stupid question to ask but can you try

linkloop -n 2 -i 0 MAC_Address_of_lan2 (low level ping from lan0 to lan2)
linkloop -n 2 -i 2 MAC_Address_of_lan0 (low level ping from lan2 to lan0)

and post the results?

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
melvyn burnard
Honored Contributor

Re: Should heartbeat subnet fail over?

Run a cmscancl and post the contents of the /tmp/scancl.out file.
That way we can see what the binary thinks is the standby lan.

You could also do a cmquerycl as a test:
cmquerycl -v -C /tmp/mytest.ascii -n node1 -n node2
See what the ascii file reports back as your physical network connectivity.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Mic V.
Esteemed Contributor

Re: Should heartbeat subnet fail over?

I agree with what's posted above; this doesn't seem normal or expected.

Did you determine that lan0 is not monitored by reading the cluster's SG documentation, by perusing the [ASCII] cluster config file, or by looking at the running cluster config? I'm curious to see the config as printed from the running cluster. Possibly what's actually running (from the binary file) is not the same as we think it is.

Mic
What kind of a name is 'Wolverine'?
Stephen Doud
Honored Contributor
Solution

Re: Should heartbeat subnet fail over?

cmviewconf reveals the content of the cluster binary file - which is the basis for the operations of your cluster.
The cmviewconf output shows the "bridged network" relationships that the cmquerycl and cmapplyconf commands discovered when the cluster binary was built.

From the syslog messages you gave, it appears that lan2 was discovered to be a standby LAN for lan0 when the cluster binary was created. The cmviewconf output will bear this out. Note that the "bridged net ID:" references for both lan0 and lan2 have the same number - indicating they share the same physical network.

Since you indicated that lan0 is on a cross-over cable - it must have been put that way AFTER the cmapplyconf.

Since this is not what you want, you will have to recompile the cluster binary - but only after you verify the network connectivity you expect to see. As a troubleshooting aid, use cmquerycl to build a new cluster configuration file, and see how the new one matches up to the old one - particularly the comments listed after each node's section. These comments describe what cmquerycl found for network connections/bridged networks.
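The old-versus-new comparison can be narrowed to just the standby comments. The following is a rough sketch, assuming the comment wording matches what cmquerycl printed elsewhere in this thread ("Possible standby Network Interfaces for ..." and "Warning: There are no standby network interfaces for ..."). The function names and file names are illustrative only.

```shell
#!/bin/sh
# Print only the comment lines in which cmquerycl reports standby findings.
# Assumes the comment wording seen in this thread; other releases may differ.
standby_comments() {
    grep -E '^# *(Possible standby Network Interfaces|Warning: There are no standby network interfaces)' "$1"
}

# Compare the standby findings of two cluster ASCII files.
# Returns 0 if they agree; prints a diff and returns 1 if they differ.
compare_standby() {
    old_tmp=${TMPDIR:-/tmp}/standby_old.$$
    new_tmp=${TMPDIR:-/tmp}/standby_new.$$
    standby_comments "$1" > "$old_tmp"
    standby_comments "$2" > "$new_tmp"
    if diff "$old_tmp" "$new_tmp"; then rc=0; else rc=1; fi
    rm -f "$old_tmp" "$new_tmp"
    return "$rc"
}

# Usage (hypothetical file names):
#   compare_standby /etc/cmcluster/cmclconfig.ascii /tmp/mytest.ascii
```

A non-empty diff here means the file the binary was built from and the freshly probed file disagree about which interfaces can back each other up.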

-StephenD.
Rudy Williams
Regular Advisor

Re: Should heartbeat subnet fail over?

The output of cmviewconf shows that lan0, lan1 and lan2 have the same bridged net ID. This jibes with reality--before the cluster was formed we discussed the use of a cross-over cable for heartbeat. However, when we began testing, we found no one ever put one in place. All the NICs were on the same bridged network.

We now have both lan0 NICs connected with a cross-over cable.

Now that cabling is fixed: Do I have to run cmscancl again to fix this?

cjw
Sridhar Bhaskarla
Honored Contributor

Re: Should heartbeat subnet fail over?

Hi,

As Stephen indicated, you will need to re-apply the cluster configuration to update the binary: 'cmapplyconf -C /etc/cmcluster/cmclconfig.ascii'. You may have to bring down the cluster before doing so.

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
melvyn burnard
Honored Contributor

Re: Should heartbeat subnet fail over?

Well, as the network config is not the same, you will have to rebuild the binary by running cmapplyconf again.
It MAY allow you to apply this without halting the cluster.
Try it; if not, it will tell you.
Once you have re-applied the binary, use cmviewconf, or run cmscancl again, to check the result is what you want.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Rudy Williams
Regular Advisor

Re: Should heartbeat subnet fail over?

I verified that lan0 (heartbeat with crossover cable) can not send/receive Ethernet frames from lan1 and lan2 on other nodes. So we have no bridging. Good.

I then recompiled the binary and ran cmviewconf again. A diff between the first cmviewconf.out and the second shows no differences.

Then I ran cmquerycl again and found the following in the resulting file:

# Possible standby Network Interfaces for lan0,lan1: lan2.

NODE_NAME emroltp
NETWORK_INTERFACE lan0
HEARTBEAT_IP 192.168.244.21
NETWORK_INTERFACE lan2
NETWORK_INTERFACE lan1
HEARTBEAT_IP 89.0.23.130

It appears that SG still thinks that all 3 NICs are on a common bridged network.

I only want lan2 to be a standby NIC for lan1.
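That expectation can be checked mechanically against a cmquerycl output file. This is only a sketch: it assumes the comment has the form shown above ("# Possible standby Network Interfaces for lan0,lan1: lan2."), with primaries separated by commas, and the function name is invented for illustration.

```shell
#!/bin/sh
# List the primary interfaces for which cmquerycl reports the given standby.
#   $1 = cluster ASCII file produced by cmquerycl
#   $2 = standby interface name, for example lan2
# Assumes comment lines of the form
#   "# Possible standby Network Interfaces for lan0,lan1: lan2."
# primaries_for_standby is a hypothetical helper, not a Serviceguard command.
primaries_for_standby() {
    sed -n 's/^# *Possible standby Network Interfaces for \([^:]*\): *\(.*\)\.$/\1 \2/p' "$1" |
    while read -r primaries standbys; do
        case ",$standbys," in
            *",$2,"*) echo "$primaries" | tr ',' '\n' ;;
        esac
    done | sort -u
}

# Usage (hypothetical file name):
#   primaries_for_standby /tmp/mytest.ascii lan2
```

If the output lists lan0 as well as lan1, the probe still sees lan0 and lan2 on a common bridged network.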

Would stopping and starting the cluster fix this?

cjw
melvyn burnard
Honored Contributor

Re: Should heartbeat subnet fail over?

Is cmquerycl -v -C ascifile -n node1 -n node2 the syntax you used?
cmquerycl will probe all the networks on the specified nodes and attempt to verify which lan can talk to which lan, and hence decide which can act as a backup.

Run cmquerycl and create a new test.ascii, then post it here.
Also, as I suggested, run a cmscancl and post that here as well.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Rudy Williams
Regular Advisor

Re: Should heartbeat subnet fail over?

The output of cmquerycl -C test.ascii -n node1 -n node2 is attached, and so is the cmscancl output.
melvyn burnard
Honored Contributor

Re: Should heartbeat subnet fail over?

Well, that new ascii file shows they are NOT connected:
NODE_NAME emrfovr
NETWORK_INTERFACE lan0
STATIONARY_IP 192.168.244.22
NETWORK_INTERFACE lan1
HEARTBEAT_IP 10.0.23.134
NETWORK_INTERFACE lan2

# Warning: There are no standby network interfaces for lan0.
# Possible standby Network Interfaces for lan1: lan2.

NODE_NAME emroltp
NETWORK_INTERFACE lan0
STATIONARY_IP 192.168.244.21
NETWORK_INTERFACE lan1
HEARTBEAT_IP 10.0.23.130
NETWORK_INTERFACE lan2


# Warning: There are no standby network interfaces for lan0.
# Possible standby Network Interfaces for lan1: lan2.


Your cmscancl output shows the binary still believes they ARE connected.
You will need to re-apply the binary to get this fixed.
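A cautious way to script the re-apply is sketched below. It only prints the commands unless DRY_RUN is set to 0. The ASCII file path is the one mentioned earlier in the thread and is an assumption for your cluster; the helper names are invented. As noted above, cmapplyconf may refuse while the cluster is running, in which case the cluster has to be halted first.

```shell
#!/bin/sh
# Re-apply sketch. Prints each command by default; set DRY_RUN=0 to execute.
# ASCII_FILE is an assumed path; point it at the file you actually edited.
ASCII_FILE=${ASCII_FILE:-/etc/cmcluster/cmclconfig.ascii}
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

reapply_cluster_config() {
    run cmcheckconf -C "$ASCII_FILE" || return 1   # validate the ASCII file first
    run cmapplyconf -C "$ASCII_FILE" || return 1   # rebuild and distribute the binary
    run cmviewconf                                 # then inspect the bridged net IDs
}

# Usage:
#   reapply_cluster_config              # dry run, prints three commands
#   DRY_RUN=0; reapply_cluster_config   # actually runs them
```

After a real run, lan0 should show a bridged net ID different from lan1 and lan2 in the cmviewconf output if the binary picked up the new cabling.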
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!