Operating System - HP-UX

Re: service-guard best practice

 
SOLVED
Adam Noble
Super Advisor

service-guard best practice

Hi,

We are making some significant LAN changes at the weekend, and our servers will lose network connectivity for around 4 hours. It will also affect the heartbeats on our clusters. What do people think is the best way to manage this? I do not want service-guard to get itself into a state or start attempting to fail over services. Is there a way of taking our servers out of cluster control without actually having to take the applications down? I suppose I could simply stop the packages from being able to fail over, but is this sufficient?
A. Clay Stephenson
Acclaimed Contributor
Solution

Re: service-guard best practice

This is one of those "it depends" situations. Note that SG can handle the failure of any one thing without problem, so if you can confine your network destruction to one thing at a time, then you can safely continue operation. Note: "one thing" includes the loss of an entire Ethernet switch, which would probably affect all the nodes. I have done this kind of work and left all packages running, but one must be extremely careful. This assumes that you have multiple heartbeat networks, multiple switches, and multiple NICs. By far the safest approach is to halt each package and then halt the cluster, but obviously there is downtime associated with the safer approach.
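For the "halt each package, then halt the cluster" route, here is a minimal dry-run sketch of the order of operations. It only prints the commands so they can be reviewed first. `cmhaltpkg` and `cmhaltcl` are the standard Serviceguard commands; the function name `plan_full_halt` and the package names `pkg_db` and `pkg_app` are made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print the commands for a full planned shutdown.
# Nothing is executed against the cluster; package names are hypothetical.
plan_full_halt() {
    for pkg in "$@"; do
        echo "cmhaltpkg $pkg"   # halt each package first
    done
    echo "cmhaltcl"             # then halt the cluster on all nodes
}

plan_full_halt pkg_db pkg_app
```

After the maintenance, the reverse would be roughly `cmruncl` followed by `cmrunpkg` for each package and `cmmodpkg -e` to re-enable switching, with `cmviewcl -v` to confirm the state at each step.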
If it ain't broke, I can fix that.
Deoncia Grayson_1
Honored Contributor

Re: service-guard best practice

If your application is controlled by the cluster, then you won't be able to bring the cluster down without halting your application. The best practice is to stop the package from attempting to fail over while the outage is occurring. You can also take the secondary node out of the cluster while it is being affected, then fail over and do likewise for the other node. That way the outage will occur only during the failover and won't cause any unnecessary outages. Worst-case scenario: your cluster panics, attempts to fail over, can't, and shoots itself in the head...
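As a rough sketch of "stop the package from attempting to fail over": package switching is disabled with `cmmodpkg -d` and re-enabled with `cmmodpkg -e`. The helper below only prints the commands; its name `plan_switching` and the package names are assumptions for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print cmmodpkg commands to disable or re-enable
# package switching. Package names are hypothetical.
plan_switching() {
    action=$1
    shift
    case "$action" in
        disable) flag=-d ;;   # cmmodpkg -d: disable package switching
        enable)  flag=-e ;;   # cmmodpkg -e: enable package switching
        *) echo "usage: plan_switching disable|enable pkg..." >&2
           return 1 ;;
    esac
    for pkg in "$@"; do
        echo "cmmodpkg $flag $pkg"
    done
}

plan_switching disable pkg_db pkg_app
```

Bear in mind this only controls package failover. It does not, by itself, make the cluster ignore lost heartbeats, so check `cmviewcl -v` before and after the change.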
If no one ever took risks, Michelangelo would have painted the Sistine floor. -Neil Simon
Wouter Jagers
Honored Contributor

Re: service-guard best practice

Most SG clusters are set up using (at least) two separate network-switches. This way, network maintenance should not pose a problem: SG will fail over to the other switch and carry on happily, as long as maintenance is done on one switch at a time.

Should the LAN maintenance be really, really drastic, so that your nodes can't communicate with each other on any channel, I assume they wouldn't be able to communicate with their users either. In that case I think you'd better take the cluster down altogether.

It would probably be useful to provide a little more information about your SG setup and about the upcoming LAN changes, so that our dear SG-gurus can advise you even better ;-)

Cheers,
Wout
an engineer's aim in a discussion is not to persuade, but to clarify.
Wouter Jagers
Honored Contributor

Re: service-guard best practice

Sorry if that looked weird, when I started the above post there were no replies yet.. I was out-typed :-)
an engineer's aim in a discussion is not to persuade, but to clarify.
Sundar_7
Honored Contributor

Re: service-guard best practice

Apart from what has been mentioned above, I would like to add one more thing - SUBNET MONITORING.

If your package has been configured with SUBNET MONITORING, then even if you disable the package from failover, it WILL halt the package if the monitored subnet goes down.

Unfortunately, there is no way to turn off the subnet monitoring online. You will need to bring the package down to disable this.
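To see whether a package has subnet monitoring configured, one option is to export its configuration with `cmgetconf -p` and look for the subnet entries. The sketch below assumes the parameter is spelled `SUBNET` (legacy packages) or `monitored_subnet` (modular packages); the helper name, package name, and file path are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list uncommented subnet-monitoring entries in an
# exported package configuration file.
list_monitored_subnets() {
    # Matches legacy "SUBNET" or modular "monitored_subnet" at line start;
    # commented-out lines and parameters such as ip_subnet are skipped.
    grep -iE '^[[:space:]]*(SUBNET|monitored_subnet)[[:space:]]' "$1"
}

# Typical use on a cluster node (package name and path are assumptions):
#   cmgetconf -p pkg_db /tmp/pkg_db.conf
#   list_monitored_subnets /tmp/pkg_db.conf
```

If this prints nothing, the package has no monitored subnet in that file, and grep returns a non-zero status.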

Of course, a good HA design should be such that there is no SPOF (single point of failure).
Learn What to do ,How to do and more importantly When to do ?