
SG cluster heartbeat is missed

 
Ajin_1
Valued Contributor


Dear Sir,
What happens to a running package when a heartbeat is missed in a two-node cluster? What is the scenario for the running package? Please help.

Regards

Ajin.S
Steven E. Protter
Exalted Contributor

Re: SG cluster heartbeat is missed

Shalom,

If heartbeat goes down, the two nodes race to take control of the lock device. In a two-node cluster this device is usually a disk.

One node gets the disk and continues running; the other node does a TOC (Transfer of Control), which is a really hard reboot.

Any package running on the losing node will cease to function.

If the packages running on the losing node are configured to fail over to the winning node, they will fail over.

SEP
Stephen Doud
Honored Contributor

Re: SG cluster heartbeat is missed

If the cluster has two or more networks, you can configure both networks to carry heartbeat, which reduces the chance of losing heartbeat. Edit the cluster configuration file, change all STATIONARY_IP references to HEARTBEAT_IP, and apply the file with cmapplyconf.
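
To illustrate the edit described above, here is a minimal Python sketch that rewrites STATIONARY_IP entries to HEARTBEAT_IP in the text of a cluster configuration file. The function name `promote_to_heartbeat` and the sample file contents (node name, interfaces, addresses) are made up for illustration; a real cluster ASCII file has more parameters, and the result still has to be applied with cmapplyconf as noted above.

```python
import re

def promote_to_heartbeat(config_text: str) -> str:
    """Replace STATIONARY_IP keywords with HEARTBEAT_IP.

    Only keywords at the start of a line (after optional indentation)
    are changed, so commented-out lines are left alone.
    """
    pattern = re.compile(r"^([ \t]*)STATIONARY_IP\b", re.MULTILINE)
    return pattern.sub(r"\1HEARTBEAT_IP", config_text)

# Hypothetical excerpt of a cluster configuration file.
sample = """\
NODE_NAME node1
  NETWORK_INTERFACE lan0
    HEARTBEAT_IP 10.0.0.1
  NETWORK_INTERFACE lan1
    STATIONARY_IP 192.168.1.1
# STATIONARY_IP entries do not carry heartbeat
"""

print(promote_to_heartbeat(sample))
```

Doing the same change by hand in an editor is just as good; the point is only that every network that should carry heartbeat ends up as HEARTBEAT_IP.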

Heartbeat is essential to keeping the cluster running as is. If a node fails to exchange heartbeat with another node for the NODE_TIMEOUT period (see the cluster configuration ASCII file), each node's next step is governed by the cluster arbitration protocol. If the heartbeat breakdown splits the cluster into two groups with an equal number of nodes, each group contacts the cluster lock or quorum server. The first one to reach it continues operating the cluster, and the other is forced to TOC (memory dump and reboot).
If the heartbeat breakdown leaves a majority and a minority of nodes (example: 3 of 5 nodes can still pass heartbeat), the nodes in the minority will be forced to TOC while the nodes in the majority continue to operate the cluster.
Any packages that were terminated when a node TOCs should be adopted by the running nodes (if their AUTORUN and node_switching flags are set to 'enabled').

If only 2 of 5 nodes can still pass heartbeat, all nodes will TOC, because a majority of healthy nodes does not exist.
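
The arbitration rules described above can be summarized in a short Python sketch. This is only an illustration of the decision as explained in this reply, not Serviceguard code; the function name `arbitrate` and its parameters are hypothetical.

```python
def arbitrate(group_size: int, cluster_size: int, won_lock: bool = False) -> str:
    """Decide what a group of nodes that can still exchange heartbeat does.

    group_size   -- nodes in this group that still see each other
    cluster_size -- nodes in the cluster before the heartbeat breakdown
    won_lock     -- True if this group reached the cluster lock or
                    quorum server first (only matters for an even split)
    """
    if 2 * group_size > cluster_size:
        # Clear majority: keep running the cluster.
        return "continue"
    if 2 * group_size == cluster_size:
        # Even split: the cluster lock or quorum server breaks the tie.
        return "continue" if won_lock else "TOC"
    # Minority: forced to TOC.
    return "TOC"

print(arbitrate(1, 2, won_lock=True))   # two-node cluster, this node got the lock
print(arbitrate(1, 2, won_lock=False))  # two-node cluster, the other node got it
print(arbitrate(3, 5))                  # 3 of 5 nodes still pass heartbeat
print(arbitrate(2, 5))                  # only 2 of 5 nodes still pass heartbeat
```

In the last case the two connected nodes are a minority, and so is every isolated node on its own, which is why all five nodes end up with a TOC.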