Operating System - HP-UX

serviceguard node shutdown on multiple lan failure

 
Robert Meredith
Occasional Contributor

Is there a way to stop Serviceguard from shutting down a node if all the LAN cards (or networks) fail? We had a situation where all the network connections went down, and as a result the active node went down. We would like the node to simply sit still and wait for the networks to come back.

Cheers,

ROB
4 REPLIES
Vincenzo Restuccia
Honored Contributor

Re: serviceguard node shutdown on multiple lan failure

One characteristic of SG is that without a communication LAN the package goes down. I don't believe there are ways to delay it.
melvyn burnard
Honored Contributor

Re: serviceguard node shutdown on multiple lan failure

This is not what SG was designed to do. What if the LANs never come up again?
The idea is that a package will switch to a node that hopefully still has working networking :-}
Why would you want a node with no comms to stay up?
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
A. Clay Stephenson
Acclaimed Contributor

Re: serviceguard node shutdown on multiple lan failure

There is no way to do what you want; the best you can do is increase the NODE_TIMEOUT value in the cluster configuration file. You will probably have to increase the HEARTBEAT_INTERVAL as well, unless this is a 2-node cluster, in which case you might look into a serial heartbeat link. The downside is that increasing the NODE_TIMEOUT value will also increase failover times, because the cluster waits longer before declaring a node failed. It looks as though you should look into a more robust network.
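
As a rough sketch of the change described above, the relevant lines in the cluster configuration (ASCII) file might look like the following. Both parameters are specified in microseconds. The numbers are illustrative assumptions only, not recommended settings: the NODE_TIMEOUT figure is the one half-remembered later in this thread, and the HEARTBEAT_INTERVAL figure is made up for the example.

```
# Excerpt from a cluster configuration file (illustrative values only).
# Both timing parameters are given in microseconds.

# Assumed value: send a heartbeat every 2 seconds.
HEARTBEAT_INTERVAL      2000000

# Assumed value: wait 20 seconds without heartbeats before
# the node is declared failed.
NODE_TIMEOUT            20000000
```

The edited file would then normally be checked and applied with the Serviceguard cmcheckconf and cmapplyconf commands. A longer NODE_TIMEOUT lets the cluster ride out a short network outage, but it delays failover by the same amount when a node really has died.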
If it ain't broke, I can fix that.
Peggy Fong
Respected Contributor

Re: serviceguard node shutdown on multiple lan failure

Robert,
The previous post is the one that should help you. We used to have a poor network and would get 8 or so different Serviceguard nodes failing because of lost networks. We increased the NODE_TIMEOUT parameter (sorry, I don't have the value; I think it was changed to 20000000, which is 20 seconds since the parameter is in microseconds, but I'm not sure). It did solve our problem. We also had to increase the value, per HP, for some other reason. I know I'm not much help but wanted to confirm that Stephenson's response should help you out.
Pfong