
Ignore heartbeat failure

Kris Knigga
Advisor

Ignore heartbeat failure

I have a three-node and a two-node ServiceGuard cluster. My network folks are going to be doing a massive switch maintenance and I'm practically guaranteed to lose my heartbeat. They tell me they can't promise my redundant switches won't be rebooted at the same time.

Is there a way to tell ServiceGuard to hold tight and not TOC anything? Would disabling package switching on all packages be enough?
7 REPLIES
Steven E. Protter
Exalted Contributor
Solution

Re: Ignore heartbeat failure

Shalom,

Yes there is a way.

Halt the cluster before this badness happens.

I would propose something else though.

Sit there on console and watch this happen.

This is a great failover test.

If I were not able to watch this process, I would cmhaltnode the nodes or cmhaltcl the entire cluster.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Kris Knigga
Advisor

Re: Ignore heartbeat failure

Wouldn't halting the cluster shut down all packages?
Raj D.
Honored Contributor

Re: Ignore heartbeat failure


Kris,


Depending on your SG heartbeat configuration, you can decide whether the cluster can stay up during the maintenance. If you have more than one LAN configured with HEARTBEAT_IP, the loss of one heartbeat LAN causes the heartbeat to fail over to the other LAN. Check the HEARTBEAT_IP configuration in the cluster.ascii file.
If you do not have additional HEARTBEAT_IP entries and the heartbeat is lost, the node will assume the safety timer has expired, which may cause a TOC.
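
As a rough way to do that check, here is a minimal sketch that counts the HEARTBEAT_IP lines under each NODE_NAME in a cluster ASCII file. The function name count_heartbeats is made up for illustration, and it assumes the usual layout where HEARTBEAT_IP lines follow the NODE_NAME line of the node they belong to. If you do not have a current copy of the file, cmgetconf can regenerate one from the running cluster.

```shell
# count_heartbeats FILE
# Hypothetical helper: prints "<node> <number of HEARTBEAT_IP entries>"
# for every NODE_NAME found in a Serviceguard cluster ASCII file.
# Commented-out template lines (starting with "#") are not counted,
# because their first field is never exactly NODE_NAME or HEARTBEAT_IP.
count_heartbeats() {
    awk '
        $1 == "NODE_NAME" {
            node = $2
            order[++n] = node      # remember nodes in file order
            count[node] = 0
        }
        $1 == "HEARTBEAT_IP" && node != "" { count[node]++ }
        END {
            for (i = 1; i <= n; i++) print order[i], count[order[i]]
        }
    ' "$1"
}
```

Call it as count_heartbeats cluster.ascii. A node that reports only one heartbeat entry has no second LAN to fall back on, and even two entries will not help if both switches are rebooted at the same time.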



>Wouldn't halting the cluster shut down all packages?

- Yes, halting the cluster will shut down all the packages and the nodes as well.
- However, doing it manually is better: you will have visibility into what is happening to which packages, whether they are shutting down properly, and whether the VGs are being deactivated properly.






Cheers,
Raj.
" If u think u can , If u think u cannot , - You are always Right . "
Michael Steele_2
Honored Contributor

Re: Ignore heartbeat failure

Hi

a) switch to a one node cluster
b) disable the failover procedure

Do you need the commands? See cmhaltnode, cmviewcl, and some other related commands first. Paste in what you find and I'll verify it.
Support Fatherhood - Stop Family Law
Kapil Jha
Honored Contributor

Re: Ignore heartbeat failure

Either way, if your network team is not sure they can keep the network alive, it does not matter whether your cluster is running or not once the network goes down.

So it is better to schedule downtime on your side.

You can move all the packages to one node and halt the other node, so there is no need for a heartbeat :)
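
A minimal sketch of that sequence, written as a dry run that only prints the commands so they can be reviewed first. The function name plan_single_node and the node and package names are placeholders, not anything from this cluster. It assumes the packages listed are the ones currently running on the node to be halted, and that they are configured to run on the node being kept.

```shell
# plan_single_node KEEP_NODE HALT_NODE PKG...
# Hypothetical helper: prints (does NOT run) the Serviceguard commands
# to move each listed package to KEEP_NODE and then halt HALT_NODE.
plan_single_node() {
    keep=$1
    halt=$2
    shift 2
    for pkg in "$@"; do
        echo "cmhaltpkg $pkg"            # stop the package where it runs now
        echo "cmrunpkg -n $keep $pkg"    # start it on the node being kept
    done
    echo "cmhaltnode $halt"              # remove the other node from the cluster
    echo "cmviewcl -v"                   # confirm the resulting state
}
```

For example, plan_single_node nodeA nodeB pkg1 pkg2 prints the six commands in order. On the three-node cluster the same idea would be repeated for the second node to halt.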

BR,
Kapil+
I am in this small bowl, I wane see the real world......
Torsten.
Acclaimed Contributor

Re: Ignore heartbeat failure

Hi Kris,

I agree with Michael Steele.

To prevent any case of a split brain situation, stop the other nodes and run your package on a single node cluster only during this maintenance.

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!   
Kris Knigga
Advisor

Re: Ignore heartbeat failure

Thank you all. I think the best course of action will be just to shut down all of the packages and then the cluster.
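
For anyone finding this later, a minimal sketch of that order of operations, again as a dry run that only prints the commands. The function name plan_full_shutdown and the package names are placeholders. Check the man pages on your Serviceguard version before running anything for real.

```shell
# plan_full_shutdown PKG...
# Hypothetical helper: prints (does NOT run) the commands to record the
# cluster state, halt each listed package, and then halt the cluster.
plan_full_shutdown() {
    echo "cmviewcl -v"          # note which package runs on which node
    for pkg in "$@"; do
        echo "cmhaltpkg $pkg"   # halt packages one by one and watch each
    done
    echo "cmhaltcl"             # halt the cluster once no packages are running
}
```

Run it as plan_full_shutdown pkg1 pkg2 and execute the printed lines by hand. After the switch maintenance is finished, cmruncl starts the cluster again.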