
Re: serviceguard

 
sheevm
Regular Advisor

serviceguard

Hi,

I have a project in front of me to implement Serviceguard in a production environment. I am brand new to this. I have been reading a lot of material and asking questions.

It is still unclear to me how and where to implement the application's start, stop, and monitoring procedures.

I also see volume group activation, fsck, filesystem mounting, and so on in several package control scripts.

Basically, the application I am trying to implement currently runs on a single server.

A total of 2 processes are required to run this app.
One volume group on the SAN.

We have start/stop scripts. No monitoring scripts or tools.

Can someone give me the basic steps to monitor these two processes in Serviceguard so that, if either process fails, the application is shut down and started on a second node?

I welcome any off-line help on this.

I am unable to attend the training.

Basically, I need help creating a package control script to implement the above steps.

I seek a serviceguard GURU's grace.

Thanks.
Raji
be good and do good
12 REPLIES
melvyn burnard
Honored Contributor

Re: serviceguard

Well, other than reading the manuals and attending the training, the only thing I can advise here is to look at paying someone to do this for you and learning while watching them do it.

Also refer to your previous thread, where an example was provided.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Chan 007
Honored Contributor

Re: serviceguard

Raji,

Try using the Serviceguard Manager (GUI) to start with. Take a look.

Chan
Albert_31
Trusted Contributor

Re: serviceguard

Hello Raji,

It is pretty easy to achieve what you want. Monitoring scripts are not rocket science: a monitoring script is just a normal script that you code to watch some parameters of the application that say something about the health of the package.

After this, you need to make the cluster/package aware of this script and also tell the cluster what needs to be done on error.

Look for info on the items below to see how it works.
package control file
# SERVICE NAMES AND COMMANDS.
# For example:
# SERVICE_NAME[0]=pkg1a
# SERVICE_CMD[0]="/usr/bin/X11/xclock -display 192.10.25.54:0"
# SERVICE_RESTART[0]="" # Will not restart the service.

package configuration file
SERVICE_NAME service1
SERVICE_FAIL_FAST_ENABLED NO
SERVICE_HALT_TIMEOUT 300

Hope it gives you something to start with..
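
As a rough illustration of such a monitoring script, here is a minimal sketch. It assumes the application's start script writes one PID file per process; the PID-file convention, the function names, and the POLL_INTERVAL variable are all hypothetical and not something Serviceguard provides. The idea it relies on is that Serviceguard runs the command given in SERVICE_CMD and treats that command exiting as a service failure, so the script simply loops while both processes are alive and exits when one is gone.

```shell
#!/bin/sh
# Hypothetical monitor for a two-process application (sketch only).
# Assumption: the start script writes one PID file per process.

# Return 0 if the PID stored in the given file belongs to a live process.
pid_alive() {
    pid=$(cat "$1" 2>/dev/null) || return 1
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Return 0 only if every PID file passed in names a live process.
all_alive() {
    for f in "$@"; do
        pid_alive "$f" || return 1
    done
    return 0
}

# Poll until a process disappears, then return non-zero.
monitor() {
    while all_alive "$@"; do
        sleep "${POLL_INTERVAL:-10}"
    done
    echo "monitored process missing; monitor exiting" >&2
    return 1
}

# Run the loop only when PID files are given on the command line.
if [ "$#" -gt 0 ]; then
    monitor "$@"
    exit 1
fi
```

Wired into the package control script, this might look like SERVICE_CMD[0]="/etc/cmcluster/pkg1/monitor.sh /var/run/app/proc1.pid /var/run/app/proc2.pid" (hypothetical paths). When either process disappears the script exits, Serviceguard sees the service as failed, and with an empty SERVICE_RESTART the package is halted and, if failover is enabled, can start on the other node.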
sheevm
Regular Advisor

Re: serviceguard

Thanks Albert.

I have another question.

I am designing the volume groups and filesystems for an Oracle DB server that will be in a 2-node SG cluster environment.

I read in the manual that Oracle binaries and executables should be kept on the local disk rather than on shared disk. This eliminates the need for an "fsck" operation on those filesystems during a node switch after a failure, thereby making the switch faster.
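
To picture why this matters, here is a minimal sketch of the storage steps a package goes through when it starts on a node: only filesystems in the package's shared volume group are activated, checked, and mounted, while anything on local disk is already mounted on each node. The names vg01, lvol1, and /oradata are hypothetical, the VxFS filesystem type is an assumption, and the real control script does considerably more than this. The RUN variable is also just a device for this sketch: it defaults to echo so the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch of the storage steps performed when a package starts or halts.
# vg01, lvol1 and /oradata are hypothetical names, not from this thread.
# RUN defaults to "echo", so commands are printed instead of executed.
RUN=${RUN:-echo}

start_storage() {
    vg=$1; lv=$2; mnt=$3
    $RUN vgchange -a e "$vg"                   # activate the shared VG exclusively
    $RUN fsck -F vxfs -y "/dev/$vg/r$lv"       # check the filesystem on the raw device
    $RUN mount -F vxfs "/dev/$vg/$lv" "$mnt"   # mount it on the adoptive node
}

halt_storage() {
    vg=$1; mnt=$2
    $RUN umount "$mnt"                         # unmount before releasing the VG
    $RUN vgchange -a n "$vg"                   # deactivate the shared VG
}

start_storage vg01 lvol1 /oradata
```

Every shared filesystem added to the package adds one fsck and one mount to this sequence, which is the time the manual's advice is trying to save.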

I am sure many of you out there have done this. Can someone confirm this fact?

This must be the case with other applications too, correct?

Thanks.
Raji
be good and do good
Michael Steele_2
Honored Contributor

Re: serviceguard

Don't worry about it. SG is a part of the O/S and is stored on the local disks with the O/S to prevent single points of failure. (* Think of the O/S as HW. *) The O/S is kept on two nodes, server A and server B. Each of these nodes has local disks, while the data is kept on a shared disk array. The local disks are vg00, and you keep the O/S separate from the data for failover purposes. This is the reason for having MC/ServiceGuard.
Support Fatherhood - Stop Family Law
Michael Steele_2
Honored Contributor

Re: serviceguard

Link below contains a wide assortment of manuals on MC/ServiceGuard:

http://www.docs.hp.com/en/ha.html#ServiceGuard

The reason for going to Serviceguard is to maintain a 7x24 environment with a goal of 100% uptime and 0% downtime.
Support Fatherhood - Stop Family Law
sheevm
Regular Advisor

Re: serviceguard

Michael,

I think my question is not clear.

I am just asking about the Oracle files. This is a database server. Which is the better place to keep the Oracle binaries in a Serviceguard environment: local or shared?

Thanks.

be good and do good
Michael Steele_2
Honored Contributor

Re: serviceguard

Refer to page 23, figure 1.1 for a good diagram of a SG configuration.

See attached.
Support Fatherhood - Stop Family Law
sheevm
Regular Advisor

Re: serviceguard

Michael,

I do not see the attachment.

Thanks
be good and do good
Michael Steele_2
Honored Contributor

Re: serviceguard

The attachment didn't come through, so I've provided the link for figure 1.1.

http://www.docs.hp.com/en/B3936-90073/B3936-90073.pdf
Support Fatherhood - Stop Family Law
Stephen Doud
Honored Contributor
Solution

Re: serviceguard

If you put Oracle program files on a shared disk for either server to access and the files become corrupt, neither server will be able to run Oracle!
We recommend that Oracle be installed on the private volume group (vg00). Redundant copies of software, like redundant hardware, can help avoid a bad day :)
PeterWolfe
Respected Contributor

Re: serviceguard

>If you put Oracle program files on a shared disk
>for either server to access and the files become
>corrupt, neither server will be able to run Oracle

But the exact same thing could be said of the Oracle data disks themselves. They are equally likely to be corrupted, and at that point it doesn't matter where the Oracle binaries are installed.

Also, multiple copies have a pretty stiff administration cost: installing them N times (one-time pain) and patching N times (N members * # of patches). So I think it comes down to your HA sensitivity. For absolute best HA, sure, multiple copies of the binaries. However, there's a middle ground where the single copy is acceptable with the requisite savings in admin costs. For example, OpenView supports both models for some of their products.

It is true that in the single-copy model, if the cluster members that are not the adoptive node need (remote) database access, they still need the Oracle client bits. So we're back to N copies for a subset of files....