
How to write HA scripts for a package in a Serviceguard cluster

 
B V B Kumar
Frequent Advisor


Dear All,

I have to install my application in an HP Serviceguard cluster, and I am stuck on the HA scripts needed to configure the package. Can anybody help me with how to write the scripts for automatic failover of the package to the failover node?

Regards
Kumar
Matti_Kurkela
Honored Contributor

Re: How to write HA scripts for a package in a Serviceguard cluster

Follow the procedure described in the "Managing ServiceGuard" manual.

The "cmmakepkg" command will give you a script template, which you can then edit according to the needs of your application.

ServiceGuard needs to know only two things about your application:

- how to start it properly
- how to stop it (preferably in a controlled fashion)

After you provide these, the standard functions in the script template will handle the actual failover.
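As a rough sketch of those two pieces: the control script template has a customer-defined run section to match the "customer_defined_halt_cmds" section mentioned below, and you fill in the bodies. The function names here follow the template as I recall it; APP_START_CMD, APP_STOP_CMD and the /opt/myapp paths are hypothetical placeholders. The generated template has its own error handling around these functions, whereas this sketch simply returns the exit status.

```shell
#!/bin/sh
# Sketch only: APP_START_CMD and APP_STOP_CMD are hypothetical placeholders
# for whatever starts and stops your application.
APP_START_CMD=${APP_START_CMD:-/opt/myapp/bin/start_myapp}
APP_STOP_CMD=${APP_STOP_CMD:-/opt/myapp/bin/stop_myapp}

# Run section: how to start the application properly.
customer_defined_run_cmds() {
    echo "Starting application with: $APP_START_CMD"
    # Left unquoted on purpose so the variable may carry arguments.
    $APP_START_CMD
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "ERROR: application start failed with exit code $rc"
    fi
    return "$rc"
}

# Halt section: how to stop the application in a controlled fashion.
customer_defined_halt_cmds() {
    echo "Stopping application with: $APP_STOP_CMD"
    $APP_STOP_CMD
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "ERROR: application stop failed with exit code $rc"
    fi
    return "$rc"
}
```

Both functions can be exercised on a single server by pointing the two variables at test commands before calling them.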

Any process that is still running on the package LVs after the "customer_defined_halt_cmds" section of the package control script is done will be mercilessly killed with "fuser -ck" by ServiceGuard. If ServiceGuard needs to do this, it is considered a failure in the package shutdown procedure - so your application shutdown procedure should not leave any application processes running in any situation.

In the same way, your application startup procedure should *not* assume that the previous application shutdown was done in a controlled fashion: maybe someone in the machine room tripped over the power cables of one node and caused the node to suddenly lose all power. If you know what must be done to recover the application in a situation like this, make the startup script detect the situation and recover automatically if at all possible.
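One simple way to detect an uncontrolled shutdown is to look for leftovers that a clean stop would have removed. The sketch below assumes the application writes a PID file (PIDFILE and its path are hypothetical) and treats a PID file with no live process behind it as evidence of a crash. The actual recovery step depends entirely on your application.

```shell
#!/bin/sh
# Hypothetical: PIDFILE is wherever your application records its PID.
PIDFILE=${PIDFILE:-/var/run/myapp.pid}

# Returns 0 if the previous shutdown looks clean (no PID file),
#         1 if a stale PID file was found and removed,
#         2 if the recorded PID still belongs to a live process.
check_previous_shutdown() {
    [ -f "$PIDFILE" ] || return 0
    old_pid=$(cat "$PIDFILE")
    if [ -n "$old_pid" ] && kill -0 "$old_pid" 2>/dev/null; then
        echo "A process with PID $old_pid is still running"
        return 2
    fi
    echo "Stale PID file (PID ${old_pid:-unknown}); previous shutdown was not clean"
    rm -f "$PIDFILE"
    # Application-specific recovery would go here, for example removing
    # lock files or running the application's own recovery tool.
    return 1
}
```

If the PID file lives on shared storage, a PID recorded on the failed node may coincide with an unrelated process on the adoptive node, so a real script should also check the process name before trusting return code 2.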

MK

Re: How to write HA scripts for a package in a Serviceguard cluster



--------------------------------------------
Creating Startup/Shutdown Scripts

This oft-overlooked part of Serviceguard is really the most important component to get right! Well-crafted startup and shutdown scripts make for a well-behaved and easy-to-manage package. There's no need to wait for the cluster to be built before starting work on these, as most of them can be tested in a single-server configuration. These are the important points to remember:

- Start with the shutdown script. The point of the shutdown script is that it absolutely must stop the application within a specified time period, and ensure everything is put into a suitably clean state to ensure a restart - this should include stopping processes, removing lock files, and cleaning up IPC resources. The shutdown script should never be given the opportunity to hang, so if there are any components that are out of your control (such as application binaries which are used for shutting down the app) think very hard about introducing some kind of manual timeout. A shutdown script may look something like this:

1. Warn users/clients of impending shutdown.
2. Disconnect users/shut down clients to ensure an orderly shutdown (e.g. by sending HUP signals to vt100-based applications).
3. Issue standard graceful application shutdown command as background process
4. Monitor status of background shutdown process.
5. Terminate background shutdown task if not exiting within defined timeout.
6. If the application provides a hard shutdown command, proceed straight to that; otherwise start using the kill command to stop processes. Always start with a TERM signal before moving on to a KILL signal.
7. Remove any IPC resources for the application. This can be difficult if you do not have all the necessary IPC keys or IDs - this is one good reason for running applications as different users, then you can just remove any IPC resources owned by a specific user.
8. Remove any lock files the application uses. Again, this can be difficult if you can't identify all the lock files. A very helpful tool in these scenarios is lsof, a piece of freeware that lists the files processes have open, available here: http://hpux.connect.org.uk/hppd/hpux/Sysadmin/lsof-4.80/
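Steps 3 to 6 are the part that keeps the shutdown script from hanging, so here is a minimal sketch of a manual timeout wrapper. The function name run_with_timeout, the return code 124 and the 2-second grace period after TERM are all my own choices for illustration, not anything Serviceguard defines.

```shell
#!/bin/sh
# Run a command in the background and allow it at most LIMIT seconds.
# Usage: run_with_timeout LIMIT command [args...]
# Returns the command's own exit status, or 124 if it had to be killed.
run_with_timeout() {
    limit=$1; shift
    "$@" &                       # step 3: graceful shutdown in background
    cmd_pid=$!
    waited=0
    while kill -0 "$cmd_pid" 2>/dev/null; do   # step 4: monitor it
        if [ "$waited" -ge "$limit" ]; then    # step 5: timeout reached
            echo "Timeout after ${limit}s; sending TERM to PID $cmd_pid"
            kill -TERM "$cmd_pid" 2>/dev/null
            sleep 2
            if kill -0 "$cmd_pid" 2>/dev/null; then
                echo "PID $cmd_pid ignored TERM; sending KILL"
                kill -KILL "$cmd_pid" 2>/dev/null
            fi
            wait "$cmd_pid" 2>/dev/null
            return 124
        fi
        sleep 1
        waited=$((waited + 1))
    done
    wait "$cmd_pid"              # collect the real exit status
}
```

In a halt script this might be called as "run_with_timeout 120 /opt/myapp/bin/stop_myapp" (a hypothetical path), with a hard shutdown or kill of the remaining application processes following whenever it returns 124.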

- The startup script does not absolutely have to succeed (as you will have application monitors watching for this) but again it must not hang. Lots of checks and logging will help debug issues later.

- The scripts should be completely self-contained (source their own environment, do not rely on an application user's .profile file, etc.). Don't ever use "su - user -c xxx" in a shell script; always set up the environment as required and then run "su user -c xxx".
  i. Note that on HP-UX 11i v1 and above, a security enhancement to the su command means that the environment variables LD_LIBRARY_PATH, SHLIB_PATH and LD_PRELOAD are not exported to the child process. There are two ways to get around this without resorting to "su - user -c xxxx":
    1. Just reset the environment variable, e.g.:
       su oracle -c "startora.sh"
       becomes:
       su oracle -c "export SHLIB_PATH=${SHLIB_PATH};startora.sh"
    2. Or install patch PHCO_27781. This introduces a parameter SU_KEEP_ENV_VARS in the file /etc/default/security. Adding the following line to /etc/default/security makes su behave as it did in 11.00 and 10.20:
       SU_KEEP_ENV_VARS=LD_LIBRARY_PATH,SHLIB_PATH,LD_PRELOAD

- Basic logging should be to stdout, as ServiceGuard will redirect this to the package control log file. If a more detailed level of logging is required give some consideration to logging to a separate log file.
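A small pair of helpers is usually enough for this. In the sketch below, log writes a timestamped line to stdout (which ends up in the package control log), and log_detail appends to a separate file. PKG_NAME, DETAIL_LOG and the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical names: PKG_NAME and DETAIL_LOG are for illustration only.
PKG_NAME=${PKG_NAME:-myapp_pkg}
DETAIL_LOG=${DETAIL_LOG:-/var/tmp/myapp_pkg_detail.log}

# Basic logging: one timestamped line on stdout.
log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') [$PKG_NAME] $*"
}

# Detailed logging: same format, but appended to a separate file so the
# package control log stays readable.
log_detail() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') [$PKG_NAME] $*" >> "$DETAIL_LOG"
}
```

Typical use would be something like: log "Starting database listener" followed by log_detail with the full command output.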

- All possible timeouts need to be added up to arrive at a suitable maximum startup and shutdown time for the package. Once you are happy with these values, you can set the RUN_SCRIPT_TIMEOUT and HALT_SCRIPT_TIMEOUT values in the package configuration files to a value higher than these.
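The arithmetic is simple but worth writing down so it gets redone whenever a step changes. A sketch, with made-up step durations and an arbitrary 20% safety margin (script_timeout is a hypothetical helper name):

```shell
#!/bin/sh
# Sum the worst-case duration of every step and add a safety margin.
# Usage: script_timeout MARGIN_PERCENT STEP_SECONDS...
script_timeout() {
    margin=$1; shift
    total=0
    for t in "$@"; do
        total=$((total + t))
    done
    echo $((total + total * margin / 100))
}

# Hypothetical halt sequence: 10s user warning, 120s graceful stop,
# 30s wait after TERM, 10s wait after KILL, 20s IPC/lock file cleanup.
HALT_SCRIPT_TIMEOUT=$(script_timeout 20 10 120 30 10 20)
echo "HALT_SCRIPT_TIMEOUT $HALT_SCRIPT_TIMEOUT"
# prints: HALT_SCRIPT_TIMEOUT 228
```

Here the steps add up to 190 seconds and the margin adds 38, so 228 would be the value to put in the package configuration file; the same calculation applies to RUN_SCRIPT_TIMEOUT with the startup steps.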

- Test the startup and shutdown scripts exhaustively in isolation. Once part of the package, bugs in the startup and shutdown routines are harder to spot, and take longer to fix.
--------------------------------------------



Duncan


I am an HPE Employee