Operating System - HP-UX

Re: Problems with a 2 node cluster

 
Gregg Palmer_2
Occasional Advisor

Problems with a 2 node cluster

Hi, I am trying to set up a two node cluster. I have two packages set up, one running on each box. I can switch them back and forth without a problem.
I have dual NIC cards in each system, and they switch fine between them when one cable is pulled. If I pull both ethernet cables though, the other box goes into a TOC and reboots. The box I pulled the cables from gets confused for a bit, and both packages go into an unowned state. Shouldn't they both failover to the other box, or am I missing something here?
Thanks for your comments.
greggster
Mel Burslan
Honored Contributor
Solution

Re: Problems with a 2 node cluster

It depends on what your failover policy and quorum determination method are.

If you are using a lock disk and an automatic failover policy, whichever server does not TOC should grab ownership of the lock disk and take over the packages. If that is not happening, check your package failover configuration and make sure failover is enabled.

After box1 TOCs and reboots, can you manually start the packages on box2, or does that fail as well? If manual startup is also a problem, you may have deeper issues with your cluster configuration and the physical layout of your resources.
________________________________
UNIX because I majored in cryptology...
Gregg Palmer_2
Occasional Advisor

Re: Problems with a 2 node cluster

The failback policy is set to MANUAL.
After the TOC, the box is able to come back up and, if I reconnect ethernet to the original system (boxA), I am able to manually join the cluster using cmrunnode, and then move packages back and forth again.
I would've thought that boxA would've failed/TOC'd, as its ethernet was pulled. I am using a lock disk, which is enabled on both boxes via vgchange -a y /dev/vglock.
thanks again,
greggster
A. Clay Stephenson
Acclaimed Contributor

Re: Problems with a 2 node cluster

By the way, you are actually simulating the kind of failure that is never supposed to happen by yanking both cables. Good MC/SG design assumes that at most one switch, NIC, or cable fails and there is at least one more fully independent LAN connection available. What could be expected is the complete failure of a node and that is done not by yanking network cables but by yanking power cable(s).
If it ain't broke, I can fix that.
Gregg Palmer_2
Occasional Advisor

Re: Problems with a 2 node cluster

Good point--but these folks only have one router/hub, so I am stuck with the scenario, for now. I will mention your point though.

Gregg
Gregg Palmer_2
Occasional Advisor

Re: Problems with a 2 node cluster

I am still not certain why the other box is TOCing when this box gets its ethernet pulled...
A. Clay Stephenson
Acclaimed Contributor

Re: Problems with a 2 node cluster

The TOC'ing is part of the split-brain avoidance strategy. All heartbeats have been lost (thanks to your yanking), the node that acquired the cluster lock stays up, and the other node needs to quickly get the heck out of Dodge -- hence the TOC.
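To make the tie-break concrete, here is a minimal Python sketch of the behaviour described above. It is a toy model, not Serviceguard code: the function name `lock_disk_tiebreak` and its parameters are made up for illustration, and the race for the lock is simply modelled as a random pick unless you name the winner.

```python
import random

def lock_disk_tiebreak(nodes, first_to_lock=None, seed=None):
    """Toy model (hypothetical helper, not a Serviceguard API).

    All heartbeats are lost, so every node races for the cluster lock disk.
    The disk is reached over the storage path, not the LAN, so a node with
    no network connection can still win the race.
    """
    if first_to_lock is None:
        # Which node wins is timing dependent; model it as a random choice.
        first_to_lock = random.Random(seed).choice(nodes)
    if first_to_lock not in nodes:
        raise ValueError(f"unknown node: {first_to_lock}")
    # Exactly one node holds the lock and stays up; every other node TOCs.
    return {n: "stays up" if n == first_to_lock else "TOC" for n in nodes}

# boxA has both cables pulled but happens to reach the lock disk first.
outcome = lock_disk_tiebreak(["boxA", "boxB"], first_to_lock="boxA")
print(outcome)
```

With boxA winning the race, the healthy node boxB is the one that TOCs, which matches what Gregg saw.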
If it ain't broke, I can fix that.
Gregg Palmer_2
Occasional Advisor

Re: Problems with a 2 node cluster

Thanks much.

Gregg
melvyn burnard
Honored Contributor

Re: Problems with a 2 node cluster

If you have no connectivity between nodes, i.e. you have pulled ALL heartbeat connections, then each node will try to connect to and own the cluster lock mechanism. If this is a disc, and the node with NO connections gets it first, it is allowed to stay up and the other node is denied the lock, hence it TOCs.
This is normal SG operation.
As already mentioned, you have multiple points of failure (MPOF), which SG is not designed to handle.
A better design would be to have a further heartbeat LAN between the nodes.
You could also install Quorum Server software onto a third node outside the cluster, and use that instead of a cluster lock disc. In the above scenario, because the isolated node would be unable to reach the QS via the network, it would be forced to TOC and the packages would move to the surviving node.
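A rough Python sketch of why the Quorum Server changes the outcome, under simplifying assumptions: the helper `quorum_server_tiebreak` is hypothetical, the first node that can reach the QS is treated as the one granted quorum, and any node that cannot reach it is assumed to TOC. This illustrates the idea only and is not how the product is implemented.

```python
def quorum_server_tiebreak(can_reach_qs):
    """Toy model (hypothetical helper, not a Serviceguard API).

    can_reach_qs maps node name -> True if that node can still reach the
    quorum server over the network. Only a reachable node can be granted
    quorum; every other node is assumed to TOC.
    """
    survivor = None
    result = {}
    for node, reachable in can_reach_qs.items():
        if reachable and survivor is None:
            survivor = node  # first node to get through is granted quorum
            result[node] = "stays up"
        else:
            result[node] = "TOC"
    return result

# boxA has both cables pulled, so it cannot reach the quorum server.
qs_outcome = quorum_server_tiebreak({"boxA": False, "boxB": True})
print(qs_outcome)
```

Here the isolated node can never win, because arbitration travels over the same network it has lost, so boxB survives and can adopt the packages.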
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Gregg Palmer_2
Occasional Advisor

Re: Problems with a 2 node cluster

Thanks for your note, it was very helpful...
Gregg
Andrew Merritt_2
Honored Contributor

Re: Problems with a 2 node cluster

Hi Gregg,
If the replies above have been helpful, please assign points to them so that others with similar questions can find the answers.

It also rewards those who give good information (some care more than others about the points, but it is dispiriting to take time to give a good answer to a problem and not see it recognised).

(No points for this response, please)

Andrew
Gregg Palmer_2
Occasional Advisor

Re: Problems with a 2 node cluster

Thanks to all who replied!

Gregg