MC/Service Guard Questions

Krishna Prasad
Trusted Contributor

MC/Service Guard Questions

We have worked with Service Guard and know the basics of how it works. Have there been any major advancements in the ServiceGuard software over the last 5 - 10 years?

Here is a little explanation of a possible environment we are considering.

System 1 primary DB server running SAP with an Oracle DB. Database size 200 GB and growing.

Disk Subsystem allowing for shared storage.

System 2: backup DB server, but running as an SAP application server while the primary system is active. System 2 has the same number of CPUs and the same amount of memory as System 1.

We know that in the event of a failover, System 2 must shut down the application it is running before the cluster switches the packages over to it. There will also be an NFS package to fail over.

In your experience, how long does the entire process take: detecting the failure, stopping the application on System 2, moving the volumes and packages, recovering the Oracle DB, and starting SAP?
Positive Results requires Positive Thinking
A. Clay Stephenson
Acclaimed Contributor

Re: MC/Service Guard Questions

I don't know about SAP; we run Baan, and for Baan/Oracle the failover generally takes about 2 minutes. It really depends upon how corrupt Oracle is at the time of the crash and thus how long the recovery operation takes. In real life, in over four years of production, I have yet to have a package fail over. The whole idea is to make your systems so robust that normally LVM and APA handle the problems long before MC/SG "wakes up".

By the time Oracle has recovered, your NFS package will have long been up.
If it ain't broke, I can fix that.
malay boy
Trusted Contributor

Re: MC/Service Guard Questions

We have two servers running MC/ServiceGuard with Oracle and a customized network management system (to monitor the GSM network). The total switchover takes around 2 minutes, including shutting down Oracle and the application.

There are three person in my team-Me ,myself and I.
Massimo Bianchi
Honored Contributor

Re: MC/Service Guard Questions

Maybe I'm a little unfortunate, but my switchover takes about 10 to 15 minutes.

A clean, standalone SAP takes about 5 minutes, including automatic recovery by Oracle, cleanup of the SAP application server, and a fresh start. We never use the reconnect feature, which can help minimize this time; we prefer a total shutdown and a clean startup in such cases.

Usually we have a lot of other stuff together with SAP in HA: MQ Series, Autosys, an NFS package exporting the SAP-related NFS file systems, and file systems for some other apps. If you count 3 to 4 minutes for each, the time is in line with the others.
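
To make that arithmetic concrete, here is a minimal sketch of the estimate. It assumes the packages start one after another (serial startup is my assumption, not something Serviceguard mandates), and the function name and the count of two extra packages are purely illustrative.

```python
def estimate_failover_range(base_minutes, extra_packages, per_package=(3, 4)):
    """Return a (low, high) switchover estimate in minutes.

    base_minutes   -- time for a clean, standalone SAP/Oracle restart
    extra_packages -- number of additional HA packages started afterwards
    per_package    -- (low, high) minutes assumed for each extra package
    """
    low, high = per_package
    return (base_minutes + extra_packages * low,
            base_minutes + extra_packages * high)


if __name__ == "__main__":
    # Illustrative figures only: 5 minutes for SAP alone, two extra packages.
    low, high = estimate_failover_range(base_minutes=5, extra_packages=2)
    print(f"Estimated switchover: {low} to {high} minutes")
```

Every extra package adds its own 3 to 4 minutes, so the total grows quickly with the number of packages in the cluster.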

Bart Paulusse
Respected Contributor

Re: MC/Service Guard Questions

Over here it also takes about 15 minutes for MC/ServiceGuard to shut down the SAP and Oracle package (including NFS), shut down the application server on the failover database server, and have the 1200 GB Oracle/SAP instance up and running on the failover server.
Krishna Prasad
Trusted Contributor

Re: MC/Service Guard Questions

Given that we have to stop an application on System 2 before the package can be moved, I think the SLA should be 20 minutes at a minimum.

For those who replied two minutes, how large is your Oracle database?

Do you have an application that must be stopped on System 2?

Do you add extra resources ( memory ) to make sure the package has enough resources to start on system 2 without shutting down the application that normally runs on system 2?

Also, is 2 minutes the time for a manual move? Have you had a crash or tested the failover under heavy load?

I know that if you plan the outage, shut down the applications, and then move the package, the 'failover' can be quicker.

I think the variable in a live system crash is how many users were working when the crash occurred and what they were doing. How much does Oracle have to roll back?
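
One way to get comparable numbers from a crash test is to poll the service from a client machine and time the gap between the first failed probe and the first successful one. The sketch below is only an illustration: the helper names are made up, and a TCP connect to the database listener tells you when the port is back, not when SAP is usable again.

```python
import socket
import time


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def measure_outage(probe, interval=1.0, clock=time.monotonic, sleep=time.sleep):
    """Block until probe() fails, then return the seconds until it succeeds again."""
    # Phase 1: the service is up; wait for the first failed probe.
    while probe():
        sleep(interval)
    down_at = clock()
    # Phase 2: the service is down; wait for the first successful probe.
    while not probe():
        sleep(interval)
    return clock() - down_at


# Example call (hypothetical host name; 1521 is the default Oracle listener port):
#   seconds = measure_outage(lambda: port_is_open("sapdb.example.com", 1521))
#   print(f"Observed outage: {seconds / 60:.1f} minutes")
```

Start it before pulling the plug on System 1, and run the same test once on an idle system and once under load to see how much the Oracle rollback adds.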

Positive Results requires Positive Thinking
Marco Santerre
Honored Contributor

Re: MC/Service Guard Questions

Over at our site, we have an environment similar to the one you are planning to create.

Our System #1 does run our DB for SAP, and System #2 runs the CI for SAP and App_server #1. Whenever there is a failover, App_server #1 shuts down and the DB transfers over.

When Service Guard was implemented at our site, everything was made part of the Service Guard scripts. Actually, if I'm not mistaken, not much customization was done, as we have both the latest MC/SG and the SAP extension for MC/SG.

All in all, it takes about 20 minutes for the whole thing to fail over. Granted, we haven't had a real catastrophe during peak hours for a very long time (especially since we implemented the new version about 1 1/2 years ago), but all tests, including crash tests (not manual failovers), came in at about 20 minutes, and that included App_Server #1 shutting down.
Cooperation is doing with a smile what you have to do anyhow.