Muhammad Ahmad
Frequent Advisor

On Lan Failure, Package Switching Fail

Dear All,

We have configured a 2-node Serviceguard cluster with the configuration described below, and we are facing trouble in one of our tests regarding LAN connectivity.

We have 4 LAN interfaces (2 primary + 2 secondary); the details are given below with the cluster configuration.

The Lan Failure test details are:

When lan0 fails, traffic shifts to lan1 -> That's fine.
After that, when lan1 fails, traffic shifts to lan2 -> That's also fine.
After the failure of lan3, traffic shifts to lan4 and the package remains running on the same node. -> That's a problem; here the package should shift to the available node in the cluster.

Traffic may shift from lan0 to lan1, lan2, lan3 or lan4, but when only a single LAN is left, and it is the only option for the heartbeat, the package should be switched to the available node in the cluster. It didn't. Why?

uname -a
HP-UX myserver B.11.23


The cluster configuration, including the LAN configuration details, is given below:



CLUSTER_NAME mycluster
FIRST_CLUSTER_LOCK_VG /dev/myvg

NODE_NAME nodeA
NETWORK_INTERFACE lan0
HEARTBEAT_IP 192.25.3.43
NETWORK_INTERFACE lan1
STATIONARY_IP 192.25.7.27
NETWORK_INTERFACE lan2
NETWORK_INTERFACE lan3
FIRST_CLUSTER_LOCK_PV /dev/dsk/c10t0d1

NODE_NAME nodeB
NETWORK_INTERFACE lan0
HEARTBEAT_IP 192.25.3.44
NETWORK_INTERFACE lan1
STATIONARY_IP 192.25.7.28
NETWORK_INTERFACE lan2
NETWORK_INTERFACE lan3
FIRST_CLUSTER_LOCK_PV /dev/dsk/c10t0d1

HEARTBEAT_INTERVAL 1000000
NODE_TIMEOUT 8000000
AUTO_START_TIMEOUT 600000000
NETWORK_POLLING_INTERVAL 2000000
NETWORK_FAILURE_DETECTION INOUT
MAX_CONFIGURED_PACKAGES 150

VOLUME_GROUP /dev/myvg
VOLUME_GROUP /dev/myvg2
VOLUME_GROUP /dev/myvg3
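
For readers following along, here is a minimal sketch of how the roles of the interfaces in the listing above can be read off. It assumes the usual Serviceguard convention that a NETWORK_INTERFACE entry with no HEARTBEAT_IP or STATIONARY_IP under it is a standby interface. The function name interface_roles is hypothetical and is not a Serviceguard tool; only the nodeA stanza is copied in.

```python
# nodeA stanza copied from the cluster ASCII file posted above.
CLUSTER_ASCII = """\
NODE_NAME nodeA
NETWORK_INTERFACE lan0
HEARTBEAT_IP 192.25.3.43
NETWORK_INTERFACE lan1
STATIONARY_IP 192.25.7.27
NETWORK_INTERFACE lan2
NETWORK_INTERFACE lan3
FIRST_CLUSTER_LOCK_PV /dev/dsk/c10t0d1
"""


def interface_roles(text):
    """Return {node: {interface: (role, ip)}}.

    Hypothetical helper for illustration only. An interface with no IP
    line under it is reported as "standby" (assumed convention).
    """
    roles = {}
    node = iface = None
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        key, value = parts
        if key == "NODE_NAME":
            node, iface = value, None
            roles[node] = {}
        elif key == "NETWORK_INTERFACE" and node is not None:
            iface = value
            # No IP seen yet: standby until an IP line follows.
            roles[node][iface] = ("standby", None)
        elif key in ("HEARTBEAT_IP", "STATIONARY_IP") and iface is not None:
            role = "heartbeat" if key == "HEARTBEAT_IP" else "stationary"
            roles[node][iface] = (role, value)
    return roles


roles = interface_roles(CLUSTER_ASCII)
for name, (role, ip) in roles["nodeA"].items():
    print(name, role, ip or "-")
```

Read this way, lan0 is the only heartbeat interface, lan1 carries a stationary IP, lan2 and lan3 are standbys, and lan4 does not appear in the configuration at all.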

Any comments or suggestions will be highly appreciated.

Regards,


Muhammad Ahmad.
Steven E. Protter
Exalted Contributor

Re: On Lan Failure, Package Switching Fail

Shalom Muhammad,

Is this everything? It would appear that what is happening is what you configured. If you have a backup LAN configured and the only failure is a LAN card, the package should not shift nodes, because in essence the IP address associated with lan3 moved to lan4.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com

Re: On Lan Failure, Package Switching Fail

Muhammad,

>> HEARTBEAT_IP 192.25.3.43
>> STATIONARY_IP 192.25.7.27
>> HEARTBEAT_IP 192.25.3.44
>> STATIONARY_IP 192.25.7.28


Are these IPs all on the same subnet? They shouldn't be... but your cluster behaviour implies they are - what subnet mask are you using on these interfaces?
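
A quick way to answer the subnet question is to compute the network of each address. The sketch below uses Python's standard ipaddress module and assumes a 255.255.255.0 mask on every interface; that mask is an assumption at this point in the thread, and subnet_of is a hypothetical helper name.

```python
import ipaddress

# Assumption: every interface uses this mask. Change it to test other cases.
NETMASK = "255.255.255.0"


def subnet_of(ip, netmask=NETMASK):
    """Return the network that an address belongs to under the given mask."""
    return ipaddress.ip_interface(f"{ip}/{netmask}").network


heartbeat = subnet_of("192.25.3.43")   # HEARTBEAT_IP on nodeA
stationary = subnet_of("192.25.7.27")  # STATIONARY_IP on nodeA
print(heartbeat, stationary, "same subnet:", heartbeat == stationary)
```

With a 255.255.255.0 mask the two addresses land in different subnets. With a wider mask such as 255.255.0.0 they would share one, which is why the mask matters here.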

HTH

Duncan

I am an HPE Employee
Eric SAUBIGNAC
Honored Contributor

Re: On Lan Failure, Package Switching Fail

Bonsoir Muhammad,

First I will say that I don't really understand your problem :



On a first reading I would say that if there is only one single LAN left on a monitored subnet, the package is not supposed to switch to an adoptive node.

But I would like to better understand your configuration and I have some questions ...

- You speak of lan0 to lan4. Your configuration only describes lan0 to lan3?

- On each node, you have two NICs with IPs that could be in the same subnet. So what is the subnet mask of each IP?

- For each package, which subnet(s) are monitored? (Search for each occurrence of the word SUBNET in the package configuration file.)

- I don't understand which NIC is supposed to be backed up by which one. In your description you said that "When lan0 fail, traffic shifts to lan1". Yes, but lan1 is not supposed to be a backup interface since it has its own IP. So could you be more precise and tell us exactly which NIC is a backup for which one?

Regards

Eric

Re: On Lan Failure, Package Switching Fail

Hello Muhammad,

Please send us a copy of the following information:

* /etc/rc.config.d/netconf
* the output from netstat -rn
* output from lanscan
* /etc/hosts
* complete cluster ascii file
* complete Package ascii file
* complete package control file

Once we have this we will be able to see exactly how this cluster is configured; making suggestions will be much easier.

Thanks,

Tim
Asif Sharif
Honored Contributor

Re: On Lan Failure, Package Switching Fail

Asalam-O-Alaiakum,

Dear Ahmad,


Your scenario is a little confusing, but what I understood is that your package is not shifting because of "SUBNET": it may not be set in your package configuration file. Please check with this command.

#grep SUBNET /etc/cmcluster/PKGNAME/PKGNAME.conf

see the doc for Package Configuration Planning.

http://docs.hp.com/en/B5158-90024/ch04s09.html#cihcfjje

Subnet

Enter the IP subnets that are to be monitored for the package. In the ASCII package configuration file, this parameter is called SUBNET.

Please send the details which Tim asked.

Regards,
Asif Sharif

Stephen Doud
Honored Contributor

Re: On Lan Failure, Package Switching Fail

Aside from the fact that two NICs appear to be on the same subnet (not healthy from what I understand), lan4 is not in your cluster configuration file, so Serviceguard will not use it.

Serviceguard will perform a package failover when a total network outage occurs if all of the below are true:
a) the SUBNET parameter is configured in the package configuration file
b) package AUTO_RUN and Node_switching are enabled for that package
c) all NICs that operate as standby NICs for the primary NIC have failed
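
The three conditions above can be written down as a small check. This is only a sketch of the rule as stated in this post, not Serviceguard's actual logic. The function name is hypothetical, and the assumption that lan2 and lan3 are the standbys for lan0 is taken from the test described in this thread, not from the cluster file.

```python
def package_should_fail_over(subnet_monitored, auto_run, node_switching,
                             primary, standbys, failed_nics):
    """Return True when conditions a), b) and c) from the post all hold.

    Hypothetical helper: a plain restatement of the rule, nothing more.
    """
    # c) the primary NIC and every standby NIC behind it have failed
    total_outage = primary in failed_nics and all(
        nic in failed_nics for nic in standbys
    )
    # a) SUBNET is configured, b) AUTO_RUN and node switching are enabled
    return bool(subnet_monitored and auto_run and node_switching and total_outage)


# Assumed layout: lan0 is primary, lan2 and lan3 are its standbys.
print(package_should_fail_over(True, True, True,
                               "lan0", ["lan2", "lan3"],
                               {"lan0", "lan2", "lan3"}))
print(package_should_fail_over(False, True, True,
                               "lan0", ["lan2", "lan3"],
                               {"lan0", "lan2", "lan3"}))
```

The second call shows the case where SUBNET is missing from the package configuration: even a total outage on the subnet does not trigger a failover.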
Muhammad Ahmad
Frequent Advisor

Re: On Lan Failure, Package Switching Fail

Hi everybody,

Thanks a lot for your co-operation.

Asif, Eric, I checked for "SUBNET" in the package configuration file; it was not there. After adding it I ran the same test cases, but got the same result.

Test case: When I unplug lan0, traffic shifts to lan2; after unplugging the lan2 cable, traffic shifts to lan3; and when we unplug lan3, our nodeB gets a TOC.

What we want is for the package to shift to the other node, not a TOC.


Please help.

Regards,


Muhammad Ahmad.
Asif Sharif
Honored Contributor
Solution

Re: On Lan Failure, Package Switching Fail

Assalam-O-Alaiakum,

Ahmad, what is the SUBNET you defined in the package configuration file?

Regards,
Asif Sharif
Eric SAUBIGNAC
Honored Contributor

Re: On Lan Failure, Package Switching Fail

Bonjour,

I must first stress that we really need your configuration files. Without them, we can only make hypotheses. So please post those Timothy asked for.

Among the hypotheses: you lose all the NICs dedicated to heartbeat (lan0, lan2, lan3), so there is no more heartbeat between nodeA and nodeB. That is a situation where a race begins: the first node to acquire the CLUSTER_LOCK_PV survives and the other node TOCs. So if nodeB TOCs, I think it is normal. nodeA could just as well have TOCed instead of nodeB.

But normally the package should have shifted to nodeA. Did it?

I would make several modifications to your configuration:

- Assuming 192.25.7.0 is the public network and 192.25.3.0 is the dedicated heartbeat network, I would remove one NIC, for instance lan3, from the heartbeat network and connect it to the public network. What is your primary goal? To protect users or the heartbeat network? I guess you want to have the public network protected.

- You can have more than one heartbeat network. Simply change STATIONARY_IP 192.25.7.X to HEARTBEAT_IP 192.25.7.X. Two heartbeat networks is fine.

- Ensure that the network monitored by the package is the public network, not the heartbeat network. There is no particular point in monitoring the heartbeat.

And please, POST your configuration, and don't forget network configuration.

Regards, Eric
Muhammad Ahmad
Frequent Advisor

Re: On Lan Failure, Package Switching Fail

Hi Asif and Eric,

Thanks a lot for your quick responses,

Here are the answers to your questions.

normally the package should have shift to nodeA. Did it ?
Yes.

- assuming 192.25.7.0 is public network and 192.25.3.0 is dedicated heartbeat network

No, 192.25.7.0 is the dedicated heartbeat network and 192.25.3.0 is the public network.

The subnet details are:

Public Network:

subnet: 192.25.3.0
subnet mask: 255.255.255.0

Heartbeat:

subnet: 192.25.7.0
subnet mask: 255.255.255.0

------ Output of netstat -in NodeA ------

IPv4:
Name    Mtu   Network      Address       Ipkts   Ierrs  Opkts   Oerrs  Coll
lan0:1  1500  192.25.3.0   192.25.3.46   7104    0      42      0      0
lan3*   1500  none         none          0       0      0       0      0
lan2*   1500  none         none          0       0      0       0      0
lan1    1500  192.25.7.0   192.25.7.27   11035   0      5806    0      0
lan0:2  1500  192.25.3.0   192.25.3.45   11149   0      215     0      0
lan0    1500  192.25.3.0   192.25.3.43   333528  0      279513  0      0
lo0     4136  127.0.0.0    127.0.0.1     226623  0      226623  0      0

Re: On Lan Failure, Package Switching Fail

Muhammad,

If 192.25.7.0 is a heartbeat network, why do you have it set as a 'STATIONARY_IP' in your cluster configuration? It should be HEARTBEAT_IP.

HTH

Duncan

I am an HPE Employee
Asif Sharif
Honored Contributor

Re: On Lan Failure, Package Switching Fail

I strongly agree with Eric regarding the TOC; it's normal behaviour. You can modify your cluster configuration as follows:

NETWORK_INTERFACE lan0
HEARTBEAT_IP 192.25.3.43
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.25.7.27

NETWORK_INTERFACE lan0
HEARTBEAT_IP 192.25.3.44
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.25.7.28

and after that

cmcheckconf
cmapplyconf
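
The edit suggested above is a one-word change per node. As a minimal sketch, it can be expressed as a text transformation on the cluster ASCII file contents. The function name promote_to_heartbeat is hypothetical; nothing is written to disk, and the cmcheckconf and cmapplyconf steps above still have to be run by hand on the edited file.

```python
import ipaddress


def promote_to_heartbeat(config_text, subnet):
    """Turn STATIONARY_IP lines inside `subnet` into HEARTBEAT_IP lines.

    Hypothetical helper for illustration; all other lines pass through unchanged.
    """
    network = ipaddress.ip_network(subnet)
    result = []
    for line in config_text.splitlines():
        parts = line.split()
        if (len(parts) == 2 and parts[0] == "STATIONARY_IP"
                and ipaddress.ip_address(parts[1]) in network):
            line = line.replace("STATIONARY_IP", "HEARTBEAT_IP", 1)
        result.append(line)
    return "\n".join(result)


# nodeA stanza from the original cluster file in this thread.
ORIGINAL = """\
NODE_NAME nodeA
NETWORK_INTERFACE lan0
HEARTBEAT_IP 192.25.3.43
NETWORK_INTERFACE lan1
STATIONARY_IP 192.25.7.27
NETWORK_INTERFACE lan2
NETWORK_INTERFACE lan3"""

updated = promote_to_heartbeat(ORIGINAL, "192.25.7.0/24")
print(updated)
```

After the change, lan1 carries a HEARTBEAT_IP on 192.25.7.0, so the cluster has two heartbeat networks while lan2 and lan3 stay in the file as standbys.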

Regards,
Asif Sharif
Muhammad Ahmad
Frequent Advisor

Re: On Lan Failure, Package Switching Fail


Thanks a lot for the quick responses.
I am trying it now and will let you know shortly.
Thanking you once again,
Regards,

Muhammad Ahmad
Eric SAUBIGNAC
Honored Contributor

Re: On Lan Failure, Package Switching Fail

Well, well, well.

I understand better why you are doing the test with lan0, lan2 and lan3. But, as Duncan said, there is a mistake in the definition of the heartbeat interface.

So I suggest you keep HEARTBEAT_IP on lan0, but you MUST change lan1 to HEARTBEAT_IP. (You could also change lan0 to STATIONARY_IP, but it is better to have 2 heartbeats than just one.) In this case, if you remove lan0, lan2 and then lan3, the package will move (if the subnet of lan0 is monitored by the package) and the node will stay alive since there is still a heartbeat.

Public and heartbeat networks are NOT in the same subnet, that's fine.
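
To make the difference between the two configurations concrete, here is a toy model of the test (lan0, lan2 and lan3 all unplugged). It is a sketch under stated assumptions, not Serviceguard's algorithm: it assumes lan2 and lan3 are standbys for lan0's subnet, that the package monitors 192.25.3.0 through SUBNET, and that package switching is enabled. The function and variable names are hypothetical.

```python
def outcome(heartbeat_subnets, monitored_subnet, nics_by_subnet, failed):
    """Toy model: what happens on a node after the NICs in `failed` go down."""
    def subnet_up(subnet):
        # A subnet stays reachable while at least one of its NICs still works.
        return any(nic not in failed for nic in nics_by_subnet[subnet])

    if not any(subnet_up(s) for s in heartbeat_subnets):
        # No heartbeat left: nodes race for the cluster lock, the loser TOCs.
        return "cluster lock race, one node TOCs"
    if not subnet_up(monitored_subnet):
        return "package fails over, node stays up"
    return "package stays on the node"


# Assumed layout: lan2 and lan3 back up lan0; lan1 has no standby.
NICS = {"192.25.3.0": ["lan0", "lan2", "lan3"], "192.25.7.0": ["lan1"]}
FAILED = {"lan0", "lan2", "lan3"}

original = outcome(["192.25.3.0"], "192.25.3.0", NICS, FAILED)
fixed = outcome(["192.25.3.0", "192.25.7.0"], "192.25.3.0", NICS, FAILED)
print("original config:", original)
print("fixed config:   ", fixed)
```

With heartbeat on lan0's subnet only, losing all three NICs removes the heartbeat and leads to the lock race. With a second heartbeat on lan1, the nodes keep talking and the monitored subnet failure moves the package instead.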

Regards

Eric
Muhammad Ahmad
Frequent Advisor

Re: On Lan Failure, Package Switching Fail

Assalam O Alaiqum & Hi,

Thanks a lot to all,
It works!!!

Excellent Support.
Special Thanks to Mr. Asif Sharif & Eric.


Regards,
Muhammad Ahmad
Frequent Advisor

Re: On Lan Failure, Package Switching Fail

Thank you,