Not able to halt a node before unmounting some filesystems

madhuchakkaravarthy · ‎10-30-2010

Hi,

HP-UX 11i v3, three-node cluster.

When I need to bring the server down, I halt the packages and then the node.

I can halt the packages, but when I try to halt the node, it asks me to unmount some of the filesystems.

The other filesystems, however, are unmounted when the packages are halted.

I have attached the control log file. Can someone explain where in the control file this check has been added?

A brief explanation of the control file would also help.

regards

MC

madhuchakkaravarthy · ‎10-30-2010

attachment of control file

Steven Schweda · ‎10-30-2010

> attachment of control file

".doc"? A Microsoft Word document may not be
the most convenient format for what (I'd
guess) began as plain text.

madhuchakkaravarthy · ‎10-30-2010

root @ TRSAP102 :/>bdf
Filesystem kbytes used avail %used Mounted on
/dev/vg00/lvol3 2097152 322560 1760800 15% /
/dev/vg00/lvol1 2097152 606408 1479136 29% /stand
/dev/vg00/lvol8 12582912 7594544 4949528 61% /var
/dev/vg00/lvol7 8388608 3106600 5240808 37% /usr
/dev/vg00/lvol4 4194304 2095992 2082816 50% /tmp
/dev/vg210/lvolora 12566528 9541825 2835706 77% /oracle
/dev/vg00/lvol6 10485760 5638952 4808952 54% /opt
/dev/vg00/lvol5 6291456 803648 5444992 13% /home
DevFS 6 6 0 100% /dev/deviceFileSystem
/dev/vg217/lvoltrappDE
25149440 4642960 19224873 19% /fintrappDE
/dev/vg216/lvoltrappEUR
25149440 4406199 19446795 18% /fintrappEUR
/dev/vg215/lvoltrappAFRO
25149440 5165870 18735890 22% /fintrappAFRO
/dev/vg218/lvoltrappGB
25149440 4807095 19071011 20% /fintrappGB
/dev/vg555/lvolora10
12574720 10537405 1909998 85% /oracle10
10.59.119.250:/cdrom
3707680 3707680 0 100% /cdrom

root @ TRSAP102 :/>cmviewcl

CLUSTER STATUS
TRSAPCLU101 up

NODE STATUS STATE
TRSAP101 up running

PACKAGE STATUS STATE AUTO_RUN NODE
TRASIA up running enabled TRSAP101
TRMU up running enabled TRSAP101
TRNP up running enabled TRSAP101
TRIN up running enabled TRSAP101

NODE STATUS STATE
TRSAP103 up running

PACKAGE STATUS STATE AUTO_RUN NODE
TRUS up running enabled TRSAP103
TRUSCA up running enabled TRSAP103
TRCA up running enabled TRSAP103

NODE STATUS STATE
TRSAP102 up running

PACKAGE STATUS STATE AUTO_RUN NODE
TRDE up running enabled TRSAP102
TREUR up running enabled TRSAP102
TRAFRO up running enabled TRSAP102
TRGB up running enabled TRSAP102
root @ TRSAP102 :/>cmhaltpkg -v TRDE TREUR TRAFRO TRGB
Disabling automatic failover for failover packages to be halted.
Halting package TRDE
Successfully halted package TRDE
Halting package TREUR
Successfully halted package TREUR
Halting package TRAFRO
Successfully halted package TRAFRO
Halting package TRGB
The package script for TRGB failed with no restart. TRGB should not be restarted
Unable to halt package TRGB
Check the syslog and pkg log files for more detailed information
root @ TRSAP102 :/>cmviewcl

CLUSTER STATUS
TRSAPCLU101 up

NODE STATUS STATE
TRSAP101 up running

PACKAGE STATUS STATE AUTO_RUN NODE
TRASIA up running enabled TRSAP101
TRMU up running enabled TRSAP101
TRNP up running enabled TRSAP101
TRIN up running enabled TRSAP101

NODE STATUS STATE
TRSAP103 up running

PACKAGE STATUS STATE AUTO_RUN NODE
TRUS up running enabled TRSAP103
TRUSCA up running enabled TRSAP103
TRCA up running enabled TRSAP103

NODE STATUS STATE
TRSAP102 up running

UNOWNED_PACKAGES

PACKAGE STATUS STATE AUTO_RUN NODE
TRDE down halted disabled unowned
TREUR down halted disabled unowned
TRAFRO down halted disabled unowned
TRGB down halted disabled unowned
root @ TRSAP102 :/>cmviewcl

CLUSTER STATUS
TRSAPCLU101 up

NODE STATUS STATE
TRSAP101 up running

PACKAGE STATUS STATE AUTO_RUN NODE
TRASIA up running enabled TRSAP101
TRMU up running enabled TRSAP101
TRNP up running enabled TRSAP101
TRIN up running enabled TRSAP101

NODE STATUS STATE
TRSAP103 up running

PACKAGE STATUS STATE AUTO_RUN NODE
TRUS up running enabled TRSAP103
TRUSCA up running enabled TRSAP103
TRCA up running enabled TRSAP103

NODE STATUS STATE
TRSAP102 up running

UNOWNED_PACKAGES

PACKAGE STATUS STATE AUTO_RUN NODE
TRDE down halted disabled unowned
TREUR down halted disabled unowned
TRAFRO down halted disabled unowned
TRGB down halted disabled unowned
root @ TRSAP102 :/>cmhaltnode -v TRSAP102
Disabling all packages from starting on nodes to be halted.
Disabling all packages from running on TRSAP102.
Warning: Do not modify or enable packages until the halt operation is completed.
Unable to halt the LVM daemon on TRSAP102.
Clustered Volume Groups must be deactivated before the cluster can be halted.
Check the syslog file for more detailed information.

# bdf

/dev/vg218/lvoltrappGB
25149440 4805431 19072571 20% /fintrappGB
/dev/vg555/lvolora10
12574720 10537405 1909998 85% /oracle10
10.59.119.250:/cdrom
3707680 3707680 0 100% /cdrom
root @ TRSAP102 :/>umount /fintrappGB
root @ TRSAP102 :/>vgchange -a n /dev/vg218
Deactivated volume group in Exclusive Mode.
Volume group "/dev/vg218" has been successfully changed.
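
The manual cleanup in the transcript above can be sketched as a small dry-run script. This is only an illustration under stated assumptions: `print_cleanup_steps` is a made-up helper name, not a Serviceguard command, and the `fuser -cu` step is an assumption about how one might see what was keeping /fintrappGB busy (the transcript itself goes straight to `umount`). The script prints the commands instead of running them, so it is safe to try anywhere.

```shell
#!/bin/sh
# Dry-run sketch: prints the manual cleanup steps, does not execute them.
# print_cleanup_steps is a hypothetical helper, not part of Serviceguard.
print_cleanup_steps() {
    mount_point=$1   # filesystem left mounted after the failed package halt
    vg=$2            # volume group that is still active on the node
    node=$3          # node to halt once the volume group is deactivated

    # Assumption: list the processes and users holding the mount point.
    echo "fuser -cu $mount_point"
    # Unmount the filesystem the package halt left behind.
    echo "umount $mount_point"
    # Deactivate the volume group, as cmhaltnode asked for.
    echo "vgchange -a n /dev/$vg"
    # Retry the node halt.
    echo "cmhaltnode -v $node"
}

print_cleanup_steps /fintrappGB vg218 TRSAP102
```

With the values from this thread, the last three printed lines match the commands run by hand above; only the `fuser` line is an addition.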

Control file

root @ TRSAP102 :/etc/cmcluster/TRGB>cat TRGB.cntl
# @(#) A.11.18.00 Date: 02/04/08 PHSS_37244 $
# **********************************************************************
# * *
# * HIGH AVAILABILITY PACKAGE CONTROL SCRIPT (template) *
# * *
# * Note: This file MUST be edited before it can be used. *
# * *
# **********************************************************************

# The environment variables PACKAGE, NODE, SG_PACKAGE,
# SG_NODE and SG_SCRIPT_LOG_FILE are set by Serviceguard
# at the time the control script is executed.
# Do not set these environment variables yourself!
# The package may fail to start or halt if the values for
# these environment variables are altered.

# NOTE: Starting from 11.17, all environment variables set by
# Serviceguard implicitly at the time the control script is
# executed will contain the prefix "SG_". Do not set any variable
# with the defined prefix, or the control script may not
# function as it should.

. ${SGCONFFILE:=/etc/cmcluster.conf}

# UNCOMMENT the variables as you set them.

# Set PATH to reference the appropriate directories.
PATH=$SGSBIN:/usr/bin:/usr/sbin:/etc:/bin

# VOLUME GROUP ACTIVATION:
# Specify the method of activation for volume groups.
# Leave the default (VGCHANGE="vgchange -a e") if you want volume
# groups activated in exclusive mode. This assumes the volume groups have
# been initialized with 'vgchange -c y' at the time of creation.
#
# Uncomment the first line (VGCHANGE="vgchange -a e -q n"), and comment
# out the default, if you want to activate volume groups in exclusive mode
# and ignore the disk quorum requirement. Since the disk quorum ensures
# the integrity of the LVM configuration, it is normally not advisable
# to override the quorum.
#
# Uncomment the second line (VGCHANGE="vgchange -a e -q n -s"), and comment
# out the default, if you want to activate volume groups in exclusive mode,
# ignore the disk quorum requirement, and disable the mirror
# resynchronization. Note it is normally not advisable to override the
# quorum.
#
# Uncomment the third line (VGCHANGE="vgchange -a s"), and comment
# out the default, if you want volume groups activated in shared mode.
# This assumes the volume groups have already been marked as sharable
# and a part of a Serviceguard cluster with 'vgchange -c y -S y'.
#
# Uncomment the fourth line (VGCHANGE="vgchange -a s -q n"), and comment
# out the default, if you want to activate volume groups in shared mode
# and ignore the disk quorum requirement. Note it is normally not
# advisable to override the quorum.
#
# Uncomment the fifth line (VGCHANGE="vgchange -a y") if you wish to
# use non-exclusive activation mode. Single node cluster configurations
# must use non-exclusive activation.
#
# VGCHANGE="vgchange -a e -q n"
# VGCHANGE="vgchange -a e -q n -s"
# VGCHANGE="vgchange -a s"
# VGCHANGE="vgchange -a s -q n"
# VGCHANGE="vgchange -a y"
VGCHANGE="vgchange -a e" # Default

# CVM DISK GROUP ACTIVATION:
# Specify the method of activation for CVM disk groups.
# Leave the default
# (CVM_ACTIVATION_CMD="vxdg -g \$DiskGroup set activation=exclusivewrite")
# if you want disk groups activated in the exclusive write mode.
#
# Uncomment the first line
# (CVM_ACTIVATION_CMD="vxdg -g \$DiskGroup set activation=readonly"),
# and comment out the default, if you want disk groups activated in
# the readonly mode.
#
# Uncomment the second line
# (CVM_ACTIVATION_CMD="vxdg -g \$DiskGroup set activation=sharedread"),
# and comment out the default, if you want disk groups activated in the
# shared read mode.
#
# Uncomment the third line
# (CVM_ACTIVATION_CMD="vxdg -g \$DiskGroup set activation=sharedwrite"),
# and comment out the default, if you want disk groups activated in the
# shared write mode.
#
# CVM_ACTIVATION_CMD="vxdg -g \$DiskGroup set activation=readonly"
# CVM_ACTIVATION_CMD="vxdg -g \$DiskGroup set activation=sharedread"
# CVM_ACTIVATION_CMD="vxdg -g \$DiskGroup set activation=sharedwrite"
CVM_ACTIVATION_CMD="vxdg -g \$DiskGroup set activation=exclusivewrite"

# VOLUME GROUPS
# Specify which volume groups are used by this package. Uncomment VG[0]=""
# and fill in the name of your first volume group. You must begin with
# VG[0], and increment the list in sequence.
#
# For example, if this package uses your volume groups vg01 and vg02, enter:
# VG[0]=vg01
# VG[1]=vg02
#
# The volume group activation method is defined above. The filesystems
# associated with these volume groups are specified below.
#
VG[0]="vg218"

# CVM DISK GROUPS
# Specify which cvm disk groups are used by this package. Uncomment
# CVM_DG[0]="" and fill in the name of your first disk group. You must
# begin with CVM_DG[0], and increment the list in sequence.
#
# For example, if this package uses your disk groups dg01 and dg02, enter:
# CVM_DG[0]=dg01
# CVM_DG[1]=dg02
#
# The cvm disk group activation method is defined above. The filesystems
# associated with these volume groups are specified below in the CVM_*
# variables.
#
#CVM_DG[0]=""

# NOTE: Do not use CVM and VxVM disk group parameters to reference
# devices used by CFS (cluster file system). CFS resources are
# controlled by the Disk Group and Mount Multi-node packages.
#
# VxVM DISK GROUPS
# Specify which VxVM disk groups are used by this package. Uncomment
# VXVM_DG[0]="" and fill in the name of your first disk group. You must
# begin with VXVM_DG[0], and increment the list in sequence.
#
# For example, if this package uses your disk groups dg01 and dg02, enter:
# VXVM_DG[0]=dg01
# VXVM_DG[1]=dg02
#
# The cvm disk group activation method is defined above.
#
#VXVM_DG[0]=""

#
# NOTE: A package could have LVM volume groups, CVM disk groups and VxVM
# disk groups.
#
# NOTE: When VxVM is initialized it will store the hostname of the
# local node in its volboot file in a variable called 'hostid'.
# The Serviceguard package control scripts use both the values of
# the hostname(1m) command and the VxVM hostid. As a result
# the VxVM hostid should always match the value of the
# hostname(1m) command.
#
# If you modify the local host name after VxVM has been
# initialized and such that hostname(1m) does not equal uname -n,
# you need to use the vxdctl(1m) command to set the VxVM hostid
# field to the value of hostname(1m). Failure to do so will
# result in the package failing to start.

# VXVM DISK GROUP IMPORT RETRY
# For packages using VXVM disk groups, if the import of a VXVM
# disk group fails then this parameter allows you to specify if you want
# to retry the import of disk group. Setting this parameter to "YES" will
# run the following command "vxdisk scandisks" on the disks which belong to
# the failed disk group. This command can take a long time on a system which
# has a large IO subsystem. The use of this parameter is recommended in a
# Metrocluster with EMC SRDF environment.
# The legal values are "YES" and "NO". The default value is "NO"
VXVM_DG_RETRY="NO"

# VOLUME GROUP AND DISK GROUP DEACTIVATION RETRY COUNT
# Specify the number of deactivation retries for each disk group and volume
# group at package shutdown. The default is 2.
DEACTIVATION_RETRY_COUNT=2

# RAW DEVICES
# If you are using raw devices for your application, this parameter allows
# you to specify if you want to kill the processes that are accessing the
# raw devices at package halt time. If raw devices are still being accessed
# at package halt time, volume group or disk group deactivation can fail,
# causing the package halt to also fail. This problem usually happens when
# the application does not shut down properly.
# Note that if you are using Oracle's Cluster Ready Service, killing this
# service could cause the node to reboot.
# The legal values are "YES" and "NO". The default value is "NO".
# The value that is set for this parameter affects all raw devices associated
# with the LVM volume groups and CVM disk groups defined in the package.
KILL_PROCESSES_ACCESSING_RAW_DEVICES="NO"

# FILESYSTEMS
# Filesystems are defined as entries specifying the logical volume, the
# mount point, the mount, umount and fsck options and type of the file system.
# Each filesystem will be fsck'd prior to being mounted. The filesystems
# will be mounted in the order specified during package startup and will
# be unmounted in reverse order during package shutdown. Ensure that
# volume groups referenced by the logical volume definitions below are
# included in volume group definitions above.
#
# Specify the filesystems which are used by this package. Uncomment
# LV[0]=""; FS[0]=""; FS_MOUNT_OPT[0]=""; FS_UMOUNT_OPT[0]=""; FS_FSCK_OPT[0]=""
# FS_TYPE[0]="" and fill in the name of your first logical volume,
# filesystem, mount, umount and fsck options and filesystem type
# for the file system. You must begin with LV[0], FS[0],
# FS_MOUNT_OPT[0], FS_UMOUNT_OPT[0], FS_FSCK_OPT[0], FS_TYPE[0]
# and increment the list in sequence.
#
# Note: The FS_TYPE parameter lets you specify the type of filesystem to be
# mounted. Specifying a particular FS_TYPE will improve package failover time.
# The FSCK_OPT and FS_UMOUNT_OPT parameters can be used to include the
# -s option with the fsck and umount commands to improve performance for
# environments that use a large number of filesystems. (An example of a
# large environment is given below following the description of the
# CONCURRENT_MOUNT_AND_UMOUNT_OPERATIONS parameter.)
#
# Example: If a package uses two JFS filesystems, pkg01a and pkg01b,
# which are mounted on LVM logical volumes lvol1 and lvol2 for read and
# write operation, you would enter the following:
# LV[0]=/dev/vg01/lvol1; FS[0]=/pkg01a; FS_MOUNT_OPT[0]="-o rw";
# FS_UMOUNT_OPT[0]=""; FS_FSCK_OPT[0]=""; FS_TYPE[0]="vxfs"
#
# LV[1]=/dev/vg01/lvol2; FS[1]=/pkg01b; FS_MOUNT_OPT[1]="-o rw"
# FS_UMOUNT_OPT[1]=""; FS_FSCK_OPT[1]=""; FS_TYPE[1]="vxfs"
#
LV[0]="/dev/vg218/lvoltrappGB"; FS[0]="/fintrappGB"; FS_MOUNT_OPT[0]="-o largefiles"; FS_UMOUNT_OPT[0]=""; FS_FSCK_OPT[0]=""; FS_TYPE[0]="vxfs"

#
# VOLUME RECOVERY
#
# When mirrored VxVM volumes are started during the package control
# bring up, if recovery is required the default behavior is for
# the package control script to wait until recovery has been
# completed.
#
# To allow mirror resynchronization to occur in parallel with
# the package startup, uncomment the line
# VXVOL="vxvol -g \$DiskGroup -o bg startall" and comment out the default.
#
# VXVOL="vxvol -g \$DiskGroup -o bg startall"
VXVOL="vxvol -g \$DiskGroup startall" # Default

# FILESYSTEM UNMOUNT COUNT
# Specify the number of unmount attempts for each filesystem during package
# shutdown. The default is set to 1.
FS_UMOUNT_COUNT=1

# FILESYSTEM MOUNT RETRY COUNT.
# Specify the number of mount retries for each filesystem.
# The default is 0. During startup, if a mount point is busy
# and FS_MOUNT_RETRY_COUNT is 0, package startup will fail and
# the script will exit with 1. If a mount point is busy and
# FS_MOUNT_RETRY_COUNT is greater than 0, the script will attempt
# to kill the user responsible for the busy mount point
# and then mount the file system. It will attempt to kill user and
# retry mount, for the number of times specified in FS_MOUNT_RETRY_COUNT.
# If the mount still fails after this number of attempts, the script
# will exit with 1.
# NOTE: If the FS_MOUNT_RETRY_COUNT > 0, the script will execute
# "fuser -ku" to free up the busy mount point.
FS_MOUNT_RETRY_COUNT=0

#
# Configuring the concurrent operations below can be used to improve the
# performance for starting up or halting a package. The maximum value for
# each concurrent operation parameter is 1024. Set these values carefully.
# The performance could actually decrease if the values are set too high
# for the system resources available on your cluster nodes. Some examples
# of system resources that can affect the optimum number of concurrent
# operations are: number of CPUs, amount of available memory, the kernel
# configuration for nfile and nproc. In some cases, if you set the number
# of concurrent operations too high, the package may not be able to start
# or to halt. For example, if you set CONCURRENT_VGCHANGE_OPERATIONS=5
# and the node where the package is started has only one processor, then
# running concurrent volume group activations will not be beneficial.
# It is suggested that the number of concurrent operations be tuned
# carefully, increasing the values a little at a time and observing the
# effect on the performance, and the values should never be set to a value
# where the performance levels off or declines. Additionally, the values
# used should take into account the node with the least resources in the
# cluster, and how many other packages may be running on the node.
# For instance, if you tune the concurrent operations for a package so
# that it provides optimum performance for the package on a node while
# no other packages are running on that node, the package performance
# may be significantly reduced, or may even fail when other packages are
# already running on that node.
#
# CONCURRENT VGCHANGE OPERATIONS
# Specify the number of concurrent volume group activations or
# deactivations to allow during package startup or shutdown.
# Setting this value to an appropriate number may improve the performance
# while activating or deactivating a large number of volume groups in the
# package. If the specified value is less than 1, the script defaults it
# to 1 and proceeds with a warning message in the package control script
# logfile.
CONCURRENT_VGCHANGE_OPERATIONS=1

#
# USE MULTI-THREADED VGCHANGE
# Specify whether multi-threaded vgchange is to be used if available.
# 0 means that the multi-threaded option is not to be used and 1 means
# that the multi-threaded option is to be used. The default is set to 0.
# Multi-threaded vgchange has potential performance benefits.
# If the activation order of the paths defined in lvmtab is important then
# multi-threaded vgchange should not be used. If mirrored volume groups
# are synced during activation then using multi-threaded vgchange may
# worsen performance.
# Using the multi-threaded vgchange option can improve the activation
# performance of volume groups with multiple disks.
# CONCURRENT_VGCHANGE_OPERATIONS option is beneficial when multiple
# volume groups need to be activated. To get the best performance for
# volume group activation, use the multi-threaded vgchange option in
# combination with the CONCURRENT_VGCHANGE_OPERATIONS option.
ENABLE_THREADED_VGCHANGE=0

# CONCURRENT FSCK OPERATIONS
# Specify the number of concurrent fsck to allow during package startup.
# Setting this value to an appropriate number may improve the performance
# while checking a large number of file systems in the package. If the
# specified value is less than 1, the script defaults it to 1 and proceeds
# with a warning message in the package control script logfile.
CONCURRENT_FSCK_OPERATIONS=1

# CONCURRENT MOUNT AND UMOUNT OPERATIONS
# Specify the number of concurrent mounts and umounts to allow during
# package startup or shutdown.
# Setting this value to an appropriate number may improve the performance
# while mounting or un-mounting a large number of file systems in the package.
# If the specified value is less than 1, the script defaults it to 1 and
# proceeds with a warning message in the package control script logfile.
CONCURRENT_MOUNT_AND_UMOUNT_OPERATIONS=1

# Example: If a package uses 50 JFS filesystems, pkg01aa through pkg01bx,
# which are mounted on the 50 logical volumes lvol1..lvol50 for read and write
# operation, you may enter the following:
#
# CONCURRENT_FSCK_OPERATIONS=50
# CONCURRENT_MOUNT_AND_UMOUNT_OPERATIONS=50
#
# LV[0]=/dev/vg01/lvol1; FS[0]=/pkg01aa; FS_MOUNT_OPT[0]="-o rw";
# FS_UMOUNT_OPT[0]="-s"; FS_FSCK_OPT[0]="-s"; FS_TYPE[0]="vxfs"
#
# LV[1]=/dev/vg01/lvol2; FS[1]=/pkg01ab; FS_MOUNT_OPT[1]="-o rw"
# FS_UMOUNT_OPT[1]="-s"; FS_FSCK_OPT[1]="-s"; FS_TYPE[1]="vxfs"
# : : :
# : : :
# : : :
# LV[49]=/dev/vg01/lvol50; FS[49]=/pkg01bx; FS_MOUNT_OPT[49]="-o rw"
# FS_UMOUNT_OPT[49]="-s"; FS_FSCK_OPT[49]="-s"; FS_TYPE[49]="vxfs"
#
# IP ADDRESSES
# Specify the IP and Subnet address pairs which are used by this package.
# You could specify IPv4 or IPv6 IP and subnet address pairs.
# Uncomment IP[0]="" and SUBNET[0]="" and fill in the name of your first
# IP and subnet address. You must begin with IP[0] and SUBNET[0] and
# increment the list in sequence.
#
# For example, if this package uses an IP of 192.10.25.12 and a subnet of
# 192.10.25.0 enter:
# IP[0]=192.10.25.12
# SUBNET[0]=192.10.25.0
# (netmask=255.255.255.0)
#
# Hint: Run "netstat -i" to see the available subnets in the Network field.
#
# For example, if this package uses an IPv6 IP of 2001::1/64
# The address prefix identifies the subnet as 2001::/64 which is an available
# subnet.
# enter:
# IP[0]=2001::1
# SUBNET[0]=2001::/64
# (netmask=ffff:ffff:ffff:ffff::)
# Alternatively the IPv6 IP/Subnet pair can be specified without the prefix
# for the IPv6 subnet.
# IP[0]=2001::1
# SUBNET[0]=2001::
# (netmask=ffff:ffff:ffff:ffff::)
#
# Hint: Run "netstat -i" to see the available IPv6 subnets by looking
# at the address prefixes
# IP/Subnet address pairs for each IP address you want to add to a subnet
# interface card. Must be set in pairs, even for IP addresses on the same
# subnet.
#
IP[0]="10.59.116.196"
SUBNET[0]="10.59.116.0"

# SERVICE NAMES AND COMMANDS.
# Specify the service name, command, and restart parameters which are
# used by this package. Uncomment SERVICE_NAME[0]="", SERVICE_CMD[0]="",
# SERVICE_RESTART[0]="" and fill in the name of the first service, command,
# and restart parameters. You must begin with SERVICE_NAME[0], SERVICE_CMD[0],
# and SERVICE_RESTART[0] and increment the list in sequence.
#
# For example:
# SERVICE_NAME[0]=pkg1a
# SERVICE_CMD[0]="/usr/bin/X11/xclock -display 192.10.25.54:0"
# SERVICE_RESTART[0]="" # Will not restart the service.
#
# SERVICE_NAME[1]=pkg1b
# SERVICE_CMD[1]="/usr/bin/X11/xload -display 192.10.25.54:0"
# SERVICE_RESTART[1]="-r 2" # Will restart the service twice.
#
# SERVICE_NAME[2]=pkg1c
# SERVICE_CMD[2]="/usr/sbin/ping"
# SERVICE_RESTART[2]="-R" # Will restart the service an infinite
# number of times.
#
# Note: No environmental variables will be passed to the command, this
# includes the PATH variable. Absolute path names are required for the
# service command definition. Default shell is /usr/bin/sh.
#
#SERVICE_NAME[0]=""
#SERVICE_CMD[0]=""
#SERVICE_RESTART[0]=""

# DEFERRED_RESOURCE NAME
# Specify the full path name of the 'DEFERRED' resources configured for
# this package. Uncomment DEFERRED_RESOURCE_NAME[0]="" and fill in the
# full path name of the resource.
#
#DEFERRED_RESOURCE_NAME[0]=""

# DTC manager information for each DTC.
# Example: DTC[0]=dtc_20
#DTC_NAME[0]=

# HA_NFS_SCRIPT_EXTENSION
# If the package uses HA NFS, this variable can be used to alter the
# name of the HA NFS script. If not set, the name of this script is
# assumed to be "hanfs.sh". If set, the "sh" portion of the default
# script name is replaced by the value of this variable. So if
# HA_NFS_SCRIPT_EXTENSION is set to "package1.sh", for example, the name
# of the HA NFS script becomes "hanfs.package1.sh". In any case,
# the HA NFS script must be placed in the same directory as the package
# control script. This allows multiple packages to be run out of the
# same directory, as needed by SGeSAP.
#HA_NFS_SCRIPT_EXTENSION=""

# Setting the log file
log_file=${SG_SCRIPT_LOG_FILE:-$0.log}

# START OF CUSTOMER DEFINED FUNCTIONS

# This function is a placeholder for customer defined functions.
# You should define all actions you want to happen here, before the service is
# started. You can create as many functions as you need.

function customer_defined_run_cmds
{
# ADD customer defined run commands.
: # do nothing instruction, because a function must contain some command.
#remsh TRGB -l ftadm /home/ftadm/bounce/FTUK/startappUK.sh
#remsh TRGB -l ftadm /fintrappGB/processes/ftuk/ftservers_uk start
test_return 51
}

# This function is a placeholder for customer defined functions.
# You should define all actions you want to happen here, after the service is
# halted.

function customer_defined_halt_cmds
{
# ADD customer defined halt commands.
: # do nothing instruction, because a function must contain some command.
#remsh TRGB -l ftadm /home/ftadm/bounce/FTUK/stopappUK.sh
#remsh TRGB -l ftadm /fintrappGB/processes/ftuk/ftservers_uk stop
test_return 52
}

# END OF CUSTOMER DEFINED FUNCTIONS

# START OF RUN FUNCTIONS

###############################################################
# This function checks for the existence of MetroCluster or
# ContinentalClusters packages that use physical data
# replication via Continuous Access XP on HP SureStore XP
# series disk arrays or SRDF on EMC Symmetrix disk arrays.
#
# If the /usr/sbin/DRCheckDiskStatus file exists in the system,
# then the cluster has at least one package which will be
# configured for remote data mirroring in a metropolitan or
# continental cluster.
#
# The function is called before attempting to activate the
# volume group. If no /usr/sbin/DRCheckDiskStatus file exists,
# the function does nothing.
#
###############################################################
#
function verify_physical_data_replication
{
if [[ -x /usr/sbin/DRCheckDiskStatus ]]
then
/usr/sbin/DRCheckDiskStatus "${0}" "${VGCHANGE}" "${CVM_ACTIVATION_CMD}" "${VG[*]}" "${CVM_DG[*]}" "${VXVM_DG[*]}"

exit_val=$?
if [[ $exit_val -ne 0 ]]
then
exit $exit_val
fi
fi
}

##############################################################
# This function tests whether the package is using HA NFS or
# not. If the HA NFS script file exists in the
# package directory then the package will be configured for
# use with HA NFS and the script will be executed.
#
# This function has one parameter passed to it, which is then
# passed to the hanfs.sh script:
#
# - start - to indicate the package is starting up
# - stop - to indicate the package is shutting down
#
###############################################################
#
HA_NFS_SCRIPT="${0%/*}/hanfs.${HA_NFS_SCRIPT_EXTENSION:-sh}"

function verify_ha_nfs
{

if [[ -x $HA_NFS_SCRIPT ]]
then

#
# The hanfs.sh script has the following return values:
#
# 0 - success
# 1 - fails
#

if [[ $1 = "start" ]]
then
$HA_NFS_SCRIPT $1
test_return 49
else
$HA_NFS_SCRIPT $1
test_return 50
fi
fi
}

##############################################################
# This function is used to migrate nfs file locks if that
# functionality is available on the system. The HA_NFS_SCRIPT
# is called with its lock migration option.
###############################################################
#
function ha_nfs_file_locks
{
if [[ -x $HA_NFS_SCRIPT ]]
then

$HA_NFS_SCRIPT file_lock_migration

#
# The hanfs.sh script has the following return values:
#
# 0 - success
# 1 - fails
#
test_return 53
fi
}

function activate_volume_group
{
integer index=0
integer i
integer j
set -A VGS ${VG[@]}
integer num_vgs=${#VGS[*]}

typeset pids_list
typeset volume_list

# Perform parallel volume group activations for better performance.
# Limit the number of parallel activations to specified
# CONCURRENT_VGCHANGE_OPERATIONS
while (( index < num_vgs ))
do
j=0
while (( j < CONCURRENT_VGCHANGE_OPERATIONS && index < num_vgs ))
do
I=${VGS[$index]}
if [[ "${VGCHANGE##*vgchange -a e}" != "${VGCHANGE}" ]]
then
print "$(date '+%b %e %X') - Node \"$(hostname)\": Activating volume group $I with exclusive option."
elif [[ "${VGCHANGE##*vgchange -a s}" != "${VGCHANGE}" ]]
then
print "$(date '+%b %e %X') - Node \"$(hostname)\": Activating volume group $I with shared option."
else
print "$(date '+%b %e %X') - \"$(hostname)\": Activating volume group $I with non-exclusive option."
fi
(
$VGCHANGE $I
) &
# save the process id and name of VG, used while checking exit status
pids_list[$j]="$!"
volume_list[$j]=$I
(( j = j + 1 ))
(( index = index + 1 ))
done

# wait for background vg activations to finish
# I is used by "test_return 1"
while (( j > 0 ))
do
pid=${pids_list[$j-1]}
I=${volume_list[$j-1]}
wait $pid
if (( $? != 0 ))
then
let 0
test_return 1
fi
(( j = j - 1 ))
done

# Check exit value (set if any preceding vgchange calls failed)
if (( $exit_value == 1 ))
then
deactivate_volume_group
print "\n\t########### Node \"$(hostname)\": Package start failed at $(date) ###########"
exit 1
fi
done

# If the -s option has been specified, then we perform
# the resynchronization as a background task
#
if [[ ${VGCHANGE#*-s} != ${VGCHANGE} ]]
then
index=0
while (( index < num_vgs ))
do
I=${VGS[$index]}
(
if /sbin/vgsync $I
then
print "$(date '+%b %e %X') - Node \"$(hostname)\": Resynchronized volume group $I"
else
print "$(date '+%b %e %X') - Node \"$(hostname)\": Resynchronization of volume group $I encountered an error"
fi
) &
(( index = index + 1 ))
done
fi
}

function activate_disk_group
{
typeset -i retval
typeset DiskGroup

for I in ${CVM_DG[@]}
do
if [[ ${CVM_ACTIVATION_CMD} = *exclusivewrite ]]
then
print "$(date '+%b %e %X') - \"$(hostname)\": Activating disk group $I with exclusive option."
else
print "$(date '+%b %e %X') - Node \"$(hostname)\": Activating disk group $I with non-exclusive option."
fi

# The vxdg command requires a disk group parameter, specified
# with the -g option. This option precedes the activation
# parameter. So the CVM_ACTIVATION_CMD command contains a
# variable (DiskGroup)
# that we will set now and evaluated at vxdg runtime. To do
# this we will now set the $DiskGroup variable and execute
# vxdg with the eval command.
DiskGroup=$I
eval $CVM_ACTIVATION_CMD
test_return 21

done

for I in ${VXVM_DG[@]}
do
print "$(date '+%b %e %X') - \"$(hostname)\": Importing disk group $I."

# If the disk group is already imported on this node,
# check_dg will return 10. Then we can skip the vxdg import
# since that would fail anyway.

check_dg $I
retval=$?
typeset -i cnt=0
typeset ret_str
typeset -i ret_val

if (( retval > 0 )) && (( retval != 10 )); then
let 0
test_return 22
fi
if (( retval != 10 )); then
vxdg -tfC import $I
fi

ret_val=$?

if [[ $VXVM_DG_RETRY = "YES" && $ret_val -ne 0 && $retval -ne 10 ]]; then
print "$(date '+%b %e %X') - \"$(hostname)\": Import of $I failed, trying again."

VXVM_DG_RETRY="NO"
ret_str=$(UNIX95=1 ps -C vxdisk -f | awk '
/vxdisk scandisks/ { if ("root" == $1) { print $1; exit } }')
while [[ $ret_str = "root" ]]
do
cnt=$cnt+1
if (( $cnt % 5 == 0 )); then
print "\n Waiting for the command vxdisk scandisks to be completed."
fi
sleep 2
ret_str=$(UNIX95=1 ps -C vxdisk -f | awk '
/vxdisk scandisks/ { if ("root" == $1) { print $1; exit } }')
done

if (( $cnt == 0 )); then
vxdisk scandisks
fi

check_dg $I
retval=$?

if (( retval > 0 )) && (( retval != 10 )); then
let 0
test_return 22
fi
if (( retval != 10 )); then
vxdg -tfC import $I
fi
fi

test_return 23

DiskGroup=$I
eval $VXVOL
test_return 24
done
}

########################################################################
# This function waits for the volumes in the specified CVM disk groups
# to become enabled. It was added for defect JAGaf74126, where packages
# were started before the CVM disk groups were enabled.
#
########################################################################
function wait_for_cvm_dg_vols_enabled
{
typeset -i ret_val
typeset -i flag
typeset curr_kstate
typeset tmp_kstate

for I in ${CVM_DG[@]}
do
for k in `vxprint -v -q -g $I | awk '{print $2}'`
do
flag=1
tmp_kstate=DISABLED
while [[ $tmp_kstate = "DISABLED" ]]
do
curr_kstate=$(/usr/sbin/vxprint -g $I -F %kstate $k)
ret_val=$?
if (( $ret_val == 0 ))
then
if [[ $curr_kstate = "DISABLED" ]]
then
if (( $flag == 1 ))
then
print "$(date '+%b %e %X') - $k in $I is $curr_kstate, waiting for it to be ENABLED."
flag=0
fi
sleep 5
else
tmp_kstate=$curr_kstate
fi
else
let 0
test_return 7
fi
done
done
done
}

#################################
# This function returns an error
# if the disk group may still be
# imported on another node, and
# returns 10 if it is already
# imported on this node.
##################################
function check_dg
{
typeset -i retval=0
vg=$1
hostid=""
disk=""

vxdisk -s list > /dev/null 2>&1
if [ $? -ne 0 ] ; then
print "check_dg: Error in vxdisk"
return 1
fi

#get the disk

disk=$(vxdisk -s list | awk -v vg=$1 '
/^Disk:/ { disk = $2 }
/^dgname:/ { if (vg == $2) { print disk; exit } }')

if [ "$disk" = "" ] ; then
print "check_dg: Error $vg does not have any disk"
return 1
fi

#get the hostids in the vg

hostid=$(vxdisk list $disk | awk '/^hostid:/ { print $2; exit }')

#get the hostname of the node
host=$(hostname)
if [ $? -ne 0 ] ; then
print "check_dg: Error in getting the hostname"
return 1
fi

#check hostid
if [ "$hostid" != "" ] ; then

status=""

if [ $host != $hostid ] ; then

#get status of node
status=$(cmviewcl -n $hostid | sed -n "s/^ $hostid[ ]*[a-z]*[ ]*//p")
if [ -z "$status" ] ; then
#try again
status=$(cmviewcl -n $hostid | sed -n "s/^ $hostid[ ]*[a-z]*[ ]*//p")
if [ -z "$status" ] ; then
#give up
print "check_dg: Error in cmviewcl"
return 1
fi
fi

case $status in
*ailed*)
print "check_dg: $hostid status failed..okay to activate $vg"
;;
*) cmviewcl
print "check_dg: Error $vg may still be imported on $hostid"
print ""
print "To correct this situation, logon to \"$hostid\" and "
print "execute the following command:"
print " vxdg deport $vg"
print ""
print "Once \"$vg\" has been deported from \"$hostid\", "
print "this package may be restarted via either cmmodpkg(1M)"
print "or cmrunpkg(1M)."
print ""
print "In the event that \"$hostid\" is either powered off"
print "or unable to boot, then \"$vg\" must be force "
print "imported. "
print ""
print "******************* WARNING **************************"
print ""
print "The use of force import can lead to data corruption if"
print "\"$hostid\" is still running and has \"$vg\""
print "imported. It is imperative to positively determine that"
print "\"$hostid\" is not running prior to performing the force"
print "import. See \"-C\" option on vxdg(1M)."
print ""
print "*******************************************************"
print ""
print "To force import \"$vg\", execute the following"
print "commands on the local system:"
print " vxdg -tfC import $vg"
print " vxdg deport $vg"
print ""
return 2;;
esac

else
retval=10
fi
#checking one hostid is enough
break;

fi
return $retval
}

function ps_tree {
typeset pid=$1
typeset ppid
typeset cmd
typeset err=0
typeset prefix=""
typeset spaces=""
while (( pid != 1 && pid != 0 && err == 0))
do
cmd=
UNIX95=1 ps -p $pid -oppid= -oargs= 2>/dev/null | read ppid cmd
err=$?
if (( err == 0 ))
then
echo "${prefix}$pid: $cmd"
(( pid = ppid ))
else
echo "${prefix}$pid: "
fi
spaces="${spaces} "
prefix="${spaces}child of "
done
}

function show_users {
typeset dev=$1
[[ -z "$dev" ]] && return
typeset pids=$(fuser $dev 2> /dev/null)
[[ -z "$pids" ]] && return
echo "$dev in use by:"
typeset pid
for pid in $pids
do
ps_tree $pid
done
}

# This function kills the processes using a busy mount point to free it
# up, and then retries the mount operation.
# freeup_busy_mountpoint_and_mount_fs(x, y, z)
# x = Logical volume to be mounted.
# y = Mount point where the logical volume is to be mounted.
# z = Mount options to be used for the mount operation.
#
function freeup_busy_mountpoint_and_mount_fs
{

typeset vol_to_mount
typeset mount_pt
typeset fs_mount_opt

vol_to_mount=$1
mount_pt=$2
shift 2
fs_mount_opt=$*

print "\tWARNING: Running fuser on ${mount_pt} to remove anyone using the busy mount point directly."
UM_COUNT=0
RET=1

# The control script exits, if the mount failed after
# retrying FS_MOUNT_RETRY_COUNT times.

while (( $UM_COUNT < $FS_MOUNT_RETRY_COUNT && $RET != 0 ))
do
(( UM_COUNT = $UM_COUNT + 1 ))
fuser -ku ${mount_pt}
if (($UM_COUNT == $FS_MOUNT_RETRY_COUNT))
then
mount ${fs_mount_opt} ${vol_to_mount} ${mount_pt}
(( RET = $? ))
if(( $RET != 0 ))
then
print "\tERROR: Function freeup_busy_mountpoint_and_mount_fs"
print "\tERROR: Failed to mount ${vol_to_mount} to ${mount_pt}"
break
fi
else
mount ${fs_mount_opt} ${vol_to_mount} ${mount_pt}
(( RET = $? ))
sleep 1
fi
done
return $RET
}

function check_vxvm_vol_available
{
typeset volpath
volpath=$1

VOL=${volpath##*/}
TMP=${volpath%/*}
DG=${TMP##*/}
vol_kstate=$(vxprint -g $DG -F %kstate $VOL)
TMP=$?
if(( $TMP != 0 ))
then
print "\tERROR: Function check_vxvm_vol_available"
print "\tERROR: vxprint -g $DG -F %kstate $VOL"
print "\tERROR: Failed to get KSTATE for \"${volpath}\" See vxintro(1M) for EXIT CODE of $TMP."
exit_value=1
return 0
fi

if [[ $vol_kstate = "ENABLED" ]]
then
return 0
fi
print "$(date '+%b %e %X') - Node \"$(hostname)\": KSTATE for \"${volpath}\" is ${vol_kstate}."
return 1
}

# For each {file system/logical volume} pair, fsck the file system
# and mount it.
# If the mount point is busy and if FS_MOUNT_RETRY_COUNT = 0,
# mounting of the file system will fail and the control script
# will exit with an error.
#
function check_and_mount
{

typeset pids_list
integer NOT_READY=1

while (( $NOT_READY != 0 ))
do
NOT_READY=0
integer R=0
for I in ${LV[@]}
do
if [[ $(mount -p | awk '$1 == "'$I'"') = "" ]]
then
case $I in
*/dev/vx/dsk*)
check_vxvm_vol_available $I
if(( $? != 0 ))
then
NOT_READY=1
else
RLV[$R]=$(print $I | sed -e 's/dsk/rdsk/')
fi
;;
*)
RLV[$R]="${I%/*}/r${I##*/}"
;;
esac

(( R = $R + 1 ))
fi
done
if(( $NOT_READY != 0 ))
then
sleep 5
fi
done

# Verify that there is at least one file system to check.
if [[ "$exit_value" != 1 && ${RLV[@]} != "" ]]
then
print -n "$(date '+%b %e %X') - Node \"$(hostname)\": "
print "Checking filesystems:"
print ${LV[@]} | tr ' ' '\012' | sed -e 's/^/ /'

# Perform parallel fsck's for better performance.
# Limit the number of concurrent fsck's to CONCURRENT_FSCK_OPERATIONS
R=0
while (( R < ${#RLV[*]} ))
do
j=0
while (( j < CONCURRENT_FSCK_OPERATIONS && R < ${#RLV[*]} ))
do

( case ${FS_TYPE[$R]} in

hfs) fsck -F ${FS_TYPE[$R]} ${FS_FSCK_OPT[$R]} -P ${RLV[$R]}
;;

vxfs) fsck -F ${FS_TYPE[$R]} ${FS_FSCK_OPT[$R]} -p -y ${RLV[$R]}
;;

unk*) fsck ${FS_FSCK_OPT[$R]} ${RLV[$R]}
;;

*) if [[ ${FS_TYPE[$R]} = "" ]]
then
fsck ${FS_FSCK_OPT[$R]} ${RLV[$R]}
else
fsck -F ${FS_TYPE[$R]} ${FS_FSCK_OPT[$R]} ${RLV[$R]}
fi
;;

esac ) &

# save the process id for monitoring the status
pids_list[$j]="$!"
(( j = j + 1 ))
(( R = R + 1 ))
done

# wait for background fsck's to finish
while (( j > 0 ))
do
pid=${pids_list[$j-1]}
wait $pid
if (( $? != 0 ))
then
let 0
test_return 2
fi
(( j = j - 1 ))
done
done

# Check exit value (set if any preceding fsck calls failed)
if (( $exit_value == 1 ))
then
deactivate_volume_group
deactivate_disk_group
print "\n\t########### Node \"$(hostname)\": Package start failed at $(date) ###########"
exit 1
fi
fi

integer F=0
integer j
set -A LogicalVolumes ${LV[@]}
integer L=${#LogicalVolumes[*]}

typeset volume_list

while (( F < L ))
do
j=0
while (( j < CONCURRENT_MOUNT_AND_UMOUNT_OPERATIONS && F < L ))
do
I=${LogicalVolumes[$F]}
if [[ $(mount | grep -e $I" ") = "" ]]
then
print "$(date '+%b %e %X') - Node \"$(hostname)\": Mounting $I at ${FS[$F]}"

case ${FS_TYPE[$F]} in

unk*) # Don't alter fsck/mount options.
;;

*) if [[ ${FS_TYPE[$F]} != "" ]]
then
FS_MOUNT_OPT[$F]="-F ${FS_TYPE[$F]} ${FS_MOUNT_OPT[$F]}"
fi
;;
esac

# Perform parallel file system mounts for better performance.
# Limit the number of parallel mounts to
# CONCURRENT_MOUNT_AND_UMOUNT_OPERATIONS

# If FS_MOUNT_RETRY_COUNT allows it, run fuser to kill the processes
# using the mount point. This frees up the mount point if it is busy.

if (( $FS_MOUNT_RETRY_COUNT > 0 ))
then
(
mount ${FS_MOUNT_OPT[$F]} $I ${FS[$F]}
if (( $? != 0 ))
then
freeup_busy_mountpoint_and_mount_fs \
$I ${FS[$F]} ${FS_MOUNT_OPT[$F]}
(( RET = $? ))
return $RET
fi
) &
else
(
mount ${FS_MOUNT_OPT[$F]} $I ${FS[$F]}
) &
fi

# save the process id and volume name for monitoring status
pids_list[$j]="$!"
volume_list[$j]=$I
else
print "$(date '+%b %e %X') - Node \"$(hostname)\": WARNING: File system \"${FS[$F]}\" was already mounted."
fi
(( j = j + 1 ))
(( F = F + 1 ))
done

# wait for background mounts to finish
# I is used by "test_return 3"
while (( j > 0 ))
do
pid=${pids_list[$j-1]}
I=${volume_list[$j-1]}
wait $pid
if (( $? != 0 ))
then
let 0
test_return 3
fi
(( j = j - 1 ))
done

# Check exit value (set if any preceding mount calls failed)
if (( $exit_value == 1 ))
then
umount_fs
deactivate_volume_group
deactivate_disk_group
print "\n\t########### Node \"$(hostname)\": Package start failed at $(date) ###########"
exit 1
fi
done
}

function retry_print
{

if [[ $(echo $1 | grep "Retrying") != "" ]]
then
print "$1" >> $log_file
fi

}

# For each {IP address/subnet} pair, add the IP address to the subnet
# using cmmodnet(1m).

function add_ip_address
{
integer S=0
integer error=0

for I in ${IP[@]}
do
print "$(date '+%b %e %X') - Node \"$(hostname)\": Adding IP address $I to subnet ${SUBNET[$S]}"
XX=$( cmmodnet -a -i $I ${SUBNET[$S]} 2>&1 )
if (( $? != 0 ))
then
if [[ $(echo $XX | grep "heartbeat IP") != "" ]]
then
# IP has been configured as a heartbeat IP address.
print "$XX" >> $log_file
(( error = 1 ))
else

# Look for the IP address in the 4th column of netstat -in output.
# If "IPv6:" keyword is present in the output, start looking for it
# in the 3rd column since IPv6 addresses are shown in the 3rd column.

YY=$(netstat -in | sed 's,/[0-9][0-9]*,,' |
awk -v f=4 '$f == "'${I%/*}'" { print } $1 == "IPv6:"{ f-- }')
if [[ -z $YY ]]
then
print "$XX" >> $log_file
print "\tERROR: Failed to add IP $I to subnet ${SUBNET[$S]}"
(( error = 1 ))
else
retry_print "$XX"
print "\tWARNING: IP $I is already configured on the subnet ${SUBNET[$S]}"
fi
fi
else
retry_print "$XX"
fi
(( S = $S + 1 ))
done

if (( error != 0 ))
then

# `let 0` is used to set the value of $? to 1. The function test_return
# requires $? to be set to 1 if it has to print error message.

let 0
test_return 4
fi

}

# Own and reset the DTC connections

function get_ownership_dtc
{
for I in ${DTC_NAME[@]}
do
print "$(date '+%b %e %X') - Node \"$(hostname)\": Assigning Ownership of the DTC $I"
dtcmodifyconfs -o $I
test_return 5

for J in ${IP[@]}
do
print "$(date '+%b %e %X') - Node \"$(hostname)\": Resetting the DTC connections to IP address $J"
dtcdiag -Q $J -q -f $I
test_return 6
done
done
}

# For each {service name/service command string} pair, start the
# service command string at the service name using cmrunserv(1m).

function start_services
{
integer C=0
for I in ${SERVICE_NAME[@]}
do
print "$(date '+%b %e %X') - Node \"$(hostname)\": Starting service $I using"
print " \"${SERVICE_CMD[$C]}\""
#
# Check if cmrunserv should be called the old
# way without a restart count.
#
if [[ "${SERVICE_RESTART[$C]}" = "" ]]
then
cmrunserv $I ">> $log_file 2>&1 ${SERVICE_CMD[$C]}"
else
#
# Do not attempt to restart when SERVICE_CMD does not exist or
# does not have execute permission (JAGag12644).
# Extract the service script name from SERVICE_CMD first
# and then check for the execute permission.
#
if [[ -d ${SERVICE_CMD[$C]%% *} ]] || [[ ! -x ${SERVICE_CMD[$C]%% *} ]]
then
print "\tWARNING: Service command ${SERVICE_CMD[$C]%% *} does not exist or is not executable."
print "\tWARNING: Ignoring configured SERVICE_RESTART value, and starting the service without any restart."

cmrunserv $I ">> $log_file 2>&1 ${SERVICE_CMD[$C]}"
else
cmrunserv ${SERVICE_RESTART[$C]} $I ">> $log_file 2>&1 ${SERVICE_CMD[$C]}"
fi
fi
test_return 8
(( C = $C + 1 ))
done
}

# For each {deferred resource name}, start resource monitoring for this
# resource using cmstartres(1m).

function start_resources
{
for I in ${DEFERRED_RESOURCE_NAME[@]}
do
print "$(date '+%b %e %X') - Node \"$(hostname)\": Starting resource monitoring for $I"
cmstartres -u -p $PACKAGE $I >> $log_file 2>&1
test_return 15
done
}

# END OF RUN FUNCTIONS.

# START OF HALT FUNCTIONS

# For each {deferred resource name}, stop resource monitoring for this
# resource using cmstopres(1m).

function stop_resources
{
for I in ${DEFERRED_RESOURCE_NAME[@]}
do
print "$(date '+%b %e %X') - Node \"$(hostname)\": Stopping resource monitoring for $I"
cmstopres -p $PACKAGE $I >> $log_file 2>&1
test_return 16
done
}

# Halt each service using cmhaltserv(1m).

function halt_services
{
for I in ${SERVICE_NAME[@]}
do
print "$(date '+%b %e %X') - Node \"$(hostname)\": Halting service $I"
cmhaltserv $I
test_return 9
done
}

# Disown the DTC.

function disown_dtc
{
for I in ${DTC_NAME[@]}
do
print "$(date '+%b %e %X') - Node \"$(hostname)\": Disowning the DTC $I"
dtcmodifyconfs -d $I
test_return 11
done
}

# For each IP address/subnet pair, remove the IP address from the subnet
# using cmmodnet(1m).

function remove_ip_address
{
integer S=0

for I in ${IP[@]}
do
print "$(date '+%b %e %X') - Node \"$(hostname)\": Remove IP address $I from subnet ${SUBNET[$S]}"
XX=$( cmmodnet -r -i $I ${SUBNET[$S]} 2>&1 )
if (( $? != 0 ))
then
if [[ $(echo $XX | grep "is not configured on the subnet") != "" ]]
then
print "$XX" >> $log_file
# `let 0` is used to set the value of $? to 1.
# The function test_return requires $? to be set
# to 1 if it has to print error message.
let 0
test_return 12
fi
else
retry_print "$XX"
fi
(( S = $S + 1 ))
done
}

# Unmount each logical volume.

function umount_fs
{
integer UM_CNT=${FS_UMOUNT_COUNT:-1}
integer ret
integer j

set -A LogicalVolumes ${LV[@]}

if (( UM_CNT < 1 ))
then
UM_CNT=1
fi

integer L=${#LogicalVolumes[*]}

# Perform parallel file system umounts for better performance.
# Limit the number of parallel umounts to CONCURRENT_MOUNT_AND_UMOUNT_OPERATIONS

typeset mounttab=$(mount)

typeset pids_list
typeset volume_list

while (( L > 0 ))
do
j=0
while (( j < CONCURRENT_MOUNT_AND_UMOUNT_OPERATIONS && L > 0 ))
do
(( L = L - 1 ))
I=${LogicalVolumes[$L]}
echo $mounttab | grep -e $I" " > /dev/null 2>&1
if (( $? == 0 ))
then
print "$(date '+%b %e %X') - Node \"$(hostname)\": Unmounting filesystem on $I"
(
typeset users=$(show_users $I)
umount ${FS_UMOUNT_OPT[$L]} $I; ret=$?
if (( ret != 0 ))
then
print "$users"
print "\tWARNING: Running fuser to remove anyone using the file system directly."
fi

UM_COUNT=$UM_CNT
while (( ret != 0 && UM_COUNT > 0 ))
do
fuser -ku $I
umount ${FS_UMOUNT_OPT[$L]} $I; ret=$?
if (( ret != 0 ))
then
(( UM_COUNT = $UM_COUNT - 1 ))
if (( $UM_COUNT > 0 ))
then
print "\t$(date '+%b %e %X') - Unmount $I failed, trying again."
sleep 1
fi
fi
done
return $ret
) &

# save the process id and name of logical volume to be used later
# while checking the exit status
pids_list[$j]="$!"
volume_list[$j]=$I
(( j = j + 1 ))
fi
done

# wait for background umount processes to finish
# I is used by "test_return 13"
while (( j > 0 ))
do
pid=${pids_list[$j-1]}
I=${volume_list[$j-1]}
wait $pid
if (( $? != 0 ))
then
let 0
test_return 13
fi
(( j = j - 1 ))
done
done
}

function deactivate_volume_group
{
# Perform multiple volume group deactivations at same time.
# Limit the number of concurrent deactivations to CONCURRENT_VGCHANGE_OPERATIONS

integer index=0
integer j
integer ret
integer num_retries=0
set -A VGS ${VG[@]}
integer num_vgs=${#VGS[*]}

typeset pids_list
typeset volume_list

while (( index < num_vgs ))
do
j=0
while (( j < CONCURRENT_VGCHANGE_OPERATIONS && index < num_vgs ))
do
I=${VGS[$index]}
print "$(date '+%b %e %X') - Node \"$(hostname)\": Deactivating volume group $I"
(
vgchange -a n $I; ret=$?
while (( ret != 0 && num_retries < DEACTIVATION_RETRY_COUNT ))
do
print "\t$(date '+%b %e %X') - vgchange -a n $I failed, trying again."
if [[ $KILL_PROCESSES_ACCESSING_RAW_DEVICES = "YES" ]]
then
print "\tWARNING: Running fuser to remove anyone using the raw device directly."
find /dev/${I##*/} -type c \! -name group | xargs fuser -ku
fi
sleep 1
vgchange -a n $I; ret=$?
(( num_retries = num_retries + 1 ))
done
return $ret
) &
# save the process id and name of VG, used while checking the exit status
pids_list[$j]="$!"
volume_list[$j]=$I
(( j = j + 1 ))
(( index = index + 1 ))
done

# wait for background vg deactivations to finish
# I is used by "test_return 14"
while (( j > 0 ))
do
pid=${pids_list[$j-1]}
I=${volume_list[$j-1]}
wait $pid
if (( $? != 0 ))
then
let 0
test_return 14
fi
(( j = j - 1 ))
done
done
}

function dg_fuser
{
typeset opts=""
while (( $# > 1 ))
do
opts="$opts $1" ; shift
done
typeset dg=$1
for vol in /dev/vx/rdsk/$dg/*
do
[[ -z "$opts" ]] && show_users $vol
[[ -n "$opts" ]] && fuser $opts $vol 2>&1 | awk -F: '$2 !~ "^ *$"'
done
}

function deactivate_dg
{
typeset action=$1
typeset dg=$2
typeset -i retval=0
typeset -i cmd_status=1
typeset out
case "$action" in
("deactivate")
out=$(vxdg -g $dg set activation=off 2>&1)
cmd_status=$?
;;
("deport")
out=$(vxdg deport $dg 2>&1)
cmd_status=$?
;;
esac
# show output if failure. if retryable, run fuser and return 2.
if (( cmd_status != 0 ))
then
print "vxdg failed with exit code $cmd_status"
print "$out"
retval=1
echo "$out" | grep -qi -e 'in use' -e 'is open' && retval=2
(( retval == 2 )) && dg_fuser $dg
fi

return $retval
}

function deactivate_dg_with_retries
{
typeset action=$1
typeset dg=$2
integer num_retries=0
integer ret=0

deactivate_dg $action $dg
ret=$?

while (( ret == 2 && num_retries < DEACTIVATION_RETRY_COUNT ))
do
print "\t$(date '+%b %e %X') - failed to $action disk group $dg, trying again."
if [[ $KILL_PROCESSES_ACCESSING_RAW_DEVICES = "YES" ]]
then
print "\tWARNING: Running fuser to remove anyone using the raw device directly."
dg_fuser -ku $dg
fi
# vxvm has background processes that can touch dgs/volumes, so
# wait a bit to let those finish before retrying
cmsleep 1
deactivate_dg $action $dg
ret=$?
(( ret == 0 )) && print "Retry of $action on disk group $dg succeeded"

(( num_retries = num_retries + 1 ))
done

return $ret
}

function deactivate_disk_group
{
integer ret

for I in ${CVM_DG[@]}
do
print "$(date '+%b %e %X') - Node \"$(hostname)\": Deactivating disk group $I"
deactivate_dg_with_retries deactivate $I
ret=$?
if (( ret != 0 ))
then
let 0
test_return 25
fi
done

for I in ${VXVM_DG[@]}
do
print "$(date '+%b %e %X') - Node \"$(hostname)\": Deporting disk group $I"
deactivate_dg_with_retries deport $I
ret=$?
if (( ret != 0 ))
then
let 0
test_return 26
fi
done
}

# END OF HALT FUNCTIONS.

# FUNCTIONS COMMON TO BOTH RUN AND HALT.

# Test return value of functions and exit with NO RESTART if bad.
# Return values 0 - 50 are reserved for use by Hewlett-Packard.
# System administrators can use numbers above 50 for return values.
function test_return
{
if (( $? != 0 ))
then
integer to_exit=0
case $1 in
1)
print "\tERROR: Function activate_volume_group"
print "\tERROR: Failed to activate $I"
exit_value=1
;;

2)
print "\tERROR: Function check_and_mount"
print "\tERROR: Failed to fsck one of the logical volumes."
exit_value=1
;;

3)
print "\tERROR: Function check_and_mount"
print "\tERROR: Failed to mount $I"
exit_value=1
;;

4)
print "\tERROR: Function add_ip_address"
print "\tERROR: Failed to add IP address to subnet"
remove_ip_address
verify_ha_nfs stop
umount_fs
deactivate_volume_group
deactivate_disk_group
to_exit=1
;;

5)
print "\tERROR: Function get_ownership_dtc"
print "\tERROR: Failed to own $I"
disown_dtc
remove_ip_address
verify_ha_nfs stop
umount_fs
deactivate_volume_group
deactivate_disk_group
to_exit=1
;;

6)
print "\tERROR: Function get_ownership_dtc"
print "\tERROR: Failed to switch $I"
disown_dtc
remove_ip_address
verify_ha_nfs stop
umount_fs
deactivate_volume_group
deactivate_disk_group
to_exit=1
;;

7)
print "\tERROR: Function wait_for_cvm_dg_vols_enabled"
print "\tERROR: Failed to enable $I"
disown_dtc
remove_ip_address
verify_ha_nfs stop
umount_fs
deactivate_volume_group
deactivate_disk_group
to_exit=1
;;

8)
print "\tERROR: Function start_services"
print "\tERROR: Failed to start service ${SERVICE_NAME[$C]}"
halt_services
customer_defined_halt_cmds
disown_dtc
remove_ip_address
verify_ha_nfs stop
umount_fs
deactivate_volume_group
deactivate_disk_group
to_exit=1
;;

9)
print "\tFunction halt_services"
print "\tWARNING: Failed to halt service $I"
;;

11)
print "\tERROR: Function disown_dtc"
print "\tERROR: Failed to disown $I from ${SUBNET[$S]}"
exit_value=1
;;

12)
print "\tERROR: Function remove_ip_address"
print "\tERROR: Failed to remove $I"
exit_value=1
;;

13)
print "\tERROR: Function umount_fs"
print "\tERROR: Failed to unmount $I"
exit_value=1
;;

14)
print "\tERROR: Function deactivate_volume_group"
print "\tERROR: Failed to deactivate $I"
exit_value=1
;;

15)
print "\tERROR: Function start_resources"
print "\tERROR: Failed to start resource $I"
stop_resources
halt_services
customer_defined_halt_cmds
disown_dtc
remove_ip_address
verify_ha_nfs stop
umount_fs
deactivate_volume_group
deactivate_disk_group
to_exit=1
;;

16)
print "\tERROR: Function stop_resources"
print "\tERROR: Failed to stop resource $I"
exit_value=1
;;

17)
print "\tERROR: Function freeup_busy_mountpoint_and_mount_fs"
print "\tERROR: Failed to mount $I to ${FS[$F]}"
exit_value=1
;;

21)
print "\tERROR: Function activate_disk_group"
print "\tERROR: Failed to activate $I"
deactivate_volume_group
deactivate_disk_group
to_exit=1
;;

22)
print "\tERROR: Function check_dg failed"
deactivate_volume_group
deactivate_disk_group
to_exit=1
;;

23)
print "\tERROR: Function activate_disk_group"
print "\tERROR: Failed to import $I"
deactivate_volume_group
deactivate_disk_group
to_exit=1
;;

24)
print "\tERROR: Function activate_disk_group"
print "\tERROR: Failed to vxvol -g $I startall"
deactivate_volume_group
deactivate_disk_group
to_exit=1
;;

25)
print "\tERROR: Function deactivate_disk_group"
print "\tERROR: Failed to deactivate $I"
exit_value=1
;;

26)
print "\tERROR: Function deactivate_disk_group"
print "\tERROR: Failed to deport $I"
exit_value=1
;;

49)
print "\tERROR: Function verify_ha_nfs"
print "\tERROR: Failed to start NFS"
umount_fs
deactivate_volume_group
deactivate_disk_group
to_exit=1
;;

50)
print "\tERROR: Function verify_ha_nfs"
print "\tERROR: Failed to stop NFS"
exit_value=1
;;

51)
print "\tERROR: Function customer_defined_run_cmds"
print "\tERROR: Failed to RUN customer commands"
halt_services
customer_defined_halt_cmds
disown_dtc
remove_ip_address
verify_ha_nfs stop
umount_fs
deactivate_volume_group
deactivate_disk_group
to_exit=1
;;

52)
print "\tERROR: Function customer_defined_halt_cmds"
print "\tERROR: Failed to HALT customer commands"
exit_value=1
;;

53)
print "\tERROR: Function ha_nfs_file_locks failed"
remove_ip_address
verify_ha_nfs stop
umount_fs
deactivate_volume_group
deactivate_disk_group
to_exit=1
;;

*)
print "\tERROR: Failed, unknown error."
;;

esac

if (( $to_exit == 1 ))
then
print "###### Node \"$(hostname)\": Package start failed at $(date) ######"
exit 1
fi
fi
}

# END OF FUNCTIONS COMMON TO BOTH RUN AND HALT

#-------------------MAINLINE Control Script Code Starts Here-----------------
#
# FUNCTION STARTUP SECTION.

typeset MIN_VERSION="11.14" # Minimum version this control script works on

integer exit_value=0
typeset CUR_VERSION

#
# Check that this control script is being run on an A.10.03 or later release
# of Serviceguard or Serviceguard Extension for RAC. The control scripts are forward
# compatible but are not backward compatible because newer control
# scripts use commands and options not available on older releases.

CUR_VERSION=`$SGSBIN/cmversion | cut -f2-3 -d"."`

if [[ "${CUR_VERSION}" = "" ]] || \
[[ "${CUR_VERSION#*.}" < "${MIN_VERSION#*.}" ]]
then
print "ERROR: Mismatched control script version ($MIN_VERSION). You cannot run"
print "\ta version ${MIN_VERSION} control_script on a node running pre"
print "\t${MIN_VERSION} Serviceguard or Serviceguard Extension for RAC software"
exit 1
fi

# Check that CONCURRENT_VGCHANGE_OPERATIONS is set to >=1.
if (( CONCURRENT_VGCHANGE_OPERATIONS < 1 ))
then
print "\tWARNING: Invalid CONCURRENT_VGCHANGE_OPERATIONS value. Defaulting it to 1."
CONCURRENT_VGCHANGE_OPERATIONS=1
fi

# Check that CONCURRENT_MOUNT_AND_UMOUNT_OPERATIONS is set to >=1.
if (( CONCURRENT_MOUNT_AND_UMOUNT_OPERATIONS < 1 ))
then
print "\tWARNING: Invalid CONCURRENT_MOUNT_AND_UMOUNT_OPERATIONS value. Defaulting it to 1."
CONCURRENT_MOUNT_AND_UMOUNT_OPERATIONS=1
fi

# Check that CONCURRENT_FSCK_OPERATIONS is set to >=1.
if (( CONCURRENT_FSCK_OPERATIONS < 1 ))
then
print "\tWARNING: Invalid CONCURRENT_FSCK_OPERATIONS value. Defaulting it to 1."
CONCURRENT_FSCK_OPERATIONS=1
fi

# If a multi-threaded version of vgchange is available and the user wants
# to use it, then use the -T option.
vgdisplay -F 2> /dev/null | grep -q vg_version
if (($? == 0 && $ENABLE_THREADED_VGCHANGE == 1))
then
VGCHANGE=$VGCHANGE" -T"
fi

# Test to see if we are being called to run the package, or halt the package.

if [[ $1 = "start" ]]
then
print "\n\t########### Node \"$(hostname)\": Starting package at $(date) ###########"

verify_physical_data_replication # add hook for MetroCluster

activate_volume_group

activate_disk_group

check_and_mount

verify_ha_nfs $1 # add hook for NFS

add_ip_address

ha_nfs_file_locks

get_ownership_dtc

wait_for_cvm_dg_vols_enabled

customer_defined_run_cmds

start_services

start_resources

# Check exit value

if (( $exit_value == 1 ))
then
print "\n\t########### Node \"$(hostname)\": Package start failed at $(date) ###########"
exit 1
else
print "\n\t########### Node \"$(hostname)\": Package start completed at $(date) ###########"
exit 0
fi

elif [[ $1 = "stop" ]]
then
print "\n\t########### Node \"$(hostname)\": Halting package at $(date) ###########"

stop_resources

halt_services

customer_defined_halt_cmds

disown_dtc

remove_ip_address

verify_ha_nfs $1 # add hook for NFS

umount_fs

deactivate_volume_group

deactivate_disk_group

# Check exit value
if (( $exit_value == 1 ))
then
print "\n\t########### Node \"$(hostname)\": Package halt failed at $(date) ###########"
exit 1
else
print "\n\t########### Node \"$(hostname)\": Package halt completed at $(date) ###########"
exit 0
fi

fi
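
A note on how the "stop" branch above behaves when a step fails: for the halt-side codes (for example 12, 13 and 14) test_return only prints the error and sets exit_value=1, so the script carries on with the remaining steps and reports "Package halt failed" at the very end. The sketch below models just that pattern. It is not the Serviceguard script: run_step, note_failure and halt_package are hypothetical stand-ins that touch no devices, and it is written in plain POSIX shell so it can be tried outside ksh.

```shell
#!/bin/sh
# Simplified model of the "stop" branch of the control script.
# All function names here are hypothetical stand-ins; nothing is unmounted.

exit_value=0

# Like test_return for the halt codes: report the failure, remember it
# in exit_value, and let the caller continue with the next step.
note_failure() {
    echo "ERROR: Function $1"
    exit_value=1
}

# Run one halt step. $1 = step name, $2 = simulated exit status.
run_step() {
    echo "Running $1"
    if [ "$2" -ne 0 ]; then
        note_failure "$1"
    fi
}

# $1 = simulated status of the unmount (0 = ok, non-zero = device busy).
halt_package() {
    exit_value=0
    run_step remove_ip_address 0
    run_step umount_fs "$1"
    # A file system that is still mounted keeps its volume group busy too.
    run_step deactivate_volume_group "$1"
    if [ "$exit_value" -eq 1 ]; then
        echo "Package halt failed"
        return 1
    fi
    echo "Package halt completed"
    return 0
}
```

Calling `halt_package 1` still runs the volume group step after the unmount error and only then reports the failed halt, which is the same order of messages a failed halt produces in the package log.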

regards

MC

SoorajCleris · ‎10-30-2010

Hi Madhu,

Would you mind attaching the syslog, the package control log file, and the cmviewconf output as a text file?

Regards,
Sooraj

"UNIX is basically a simple operating system, but you have to be a genius to understand the simplicity" - Dennis Ritchie

madhuchakkaravarthy · ‎10-30-2010

I am not able to send the syslog, since I have no access to the server right now.

package control log

########### Node "TRSAP102": Halting package at Fri, Oct 8, 2010 01:42:40 AM ###########
Oct 8 01:42:40 AM - Node "TRSAP102": Remove IP address 10.59.116.196 from subnet 10.59.116.0
Oct 8 01:42:40 AM - Node "TRSAP102": Unmounting filesystem on /dev/vg218/lvoltrappGB
Oct 8 01:42:40 AM - Node "TRSAP102": Deactivating volume group vg218
Deactivated volume group in Exclusive Mode.
Volume group "vg218" has been successfully changed.

########### Node "TRSAP102": Package halt completed at Fri, Oct 8, 2010 01:42:40 AM ###########

########### Node "TRSAP102": Starting package at Fri, Oct 8, 2010 02:00:35 AM ###########
Oct 8 02:00:35 AM - Node "TRSAP102": Activating volume group vg218 with exclusive option.
Activated volume group in Exclusive Mode.
Volume group "vg218" has been successfully changed.
Oct 8 02:00:36 AM - Node "TRSAP102": Checking filesystems:
/dev/vg218/lvoltrappGB
/dev/vg218/rlvoltrappGB:file system is clean - log replay is not required
Oct 8 02:00:36 AM - Node "TRSAP102": Mounting /dev/vg218/lvoltrappGB at /fintrappGB
Oct 8 02:00:36 AM - Node "TRSAP102": Adding IP address 10.59.116.196 to subnet 10.59.116.0

########### Node "TRSAP102": Package start completed at Fri, Oct 8, 2010 02:00:36 AM ###########

########### Node "TRSAP102": Halting package at Sat, Oct 16, 2010 04:23:10 PM ###########
Oct 16 04:23:10 PM - Node "TRSAP102": Remove IP address 10.59.116.196 from subnet 10.59.116.0
Oct 16 04:23:10 PM - Node "TRSAP102": Unmounting filesystem on /dev/vg218/lvoltrappGB
umount: cannot unmount /dev/vg218/lvoltrappGB : Device busy
umount: return error 1.
/dev/vg218/lvoltrappGB in use by:
23496: sleep 300
child of 19896: /bin/ksh .//Rd3k.sh
19209: /fintrappGB/releases/ftuklive7050/client/hpux_ia/iqmain_r appserverfac /@ftuk SBIUK_7050 Infosys/AppServerFac_SBIUK_7050 -ORBDo
child of 19027: /fintrappGB/releases/ftuklive7050/client/hpux_ia/iqmain_r ftcorbadaemon -ORBEndpoint iiop://TRGB:10115 -ORBDottedDecimalAddress
19896: /bin/ksh .//Rd3k.sh
19027: /fintrappGB/releases/ftuklive7050/client/hpux_ia/iqmain_r ftcorbadaemon -ORBEndpoint iiop://TRGB:10115 -ORBDottedDecimalAddress
WARNING: Running fuser to remove anyone using the file system directly.
/dev/vg218/lvoltrappGB: 23496co(ftadm) 19209mcto(ftadm) 19896co(ftadm) 19027mcto(ftadm)

umount: cannot unmount /dev/vg218/lvoltrappGB : Device busy
umount: return error 1.
ERROR: Function umount_fs
ERROR: Failed to unmount /dev/vg218/lvoltrappGB
Oct 16 04:23:11 PM - Node "TRSAP102": Deactivating volume group vg218
vgchange: Couldn't deactivate volume group "vg218":
Device busy
Oct 16 04:23:11 PM - vgchange -a n vg218 failed, trying again.
vgchange: Couldn't deactivate volume group "vg218":
Device busy
Oct 16 04:23:12 PM - vgchange -a n vg218 failed, trying again.
vgchange: Couldn't deactivate volume group "vg218":
Device busy
ERROR: Function deactivate_volume_group
ERROR: Failed to deactivate vg218

########### Node "TRSAP102": Package halt failed at Sat, Oct 16, 2010 04:23:13 PM ###########
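
The fuser line in the failed halt above names the processes that kept /dev/vg218/lvoltrappGB busy, all owned by ftadm. As a reading aid, here is a small sketch that splits such a merged "fuser -u" line into one "pid usage user" row per process. parse_fuser_line is a hypothetical helper, not part of the control script, and the meaning of the usage letters (c for current directory, o for open file, m for memory-mapped file, t for text file) is my reading of fuser(1M), so treat it as an assumption.

```shell
#!/bin/sh
# Split a merged "fuser -u" output line into "pid usage user" rows.
# parse_fuser_line is a hypothetical helper for reading the package log.
parse_fuser_line() {
    # Drop the "device:" prefix, then walk the blank-separated entries.
    for entry in ${1#*:}; do
        pid=${entry%%[!0-9]*}      # leading digits, e.g. 23496
        rest=${entry#"$pid"}       # remainder, e.g. co(ftadm)
        usage=${rest%%\(*}         # usage letters, e.g. co
        case "$rest" in
            *\(*\))
                user=${rest#*\(}   # ftadm)
                user=${user%\)}    # ftadm
                ;;
            *)
                user="-"           # no user name (fuser run without -u)
                ;;
        esac
        echo "$pid $usage $user"
    done
}

parse_fuser_line "/dev/vg218/lvoltrappGB: 23496co(ftadm) 19209mcto(ftadm) 19896co(ftadm) 19027mcto(ftadm)"
```

For the line from the log this prints four rows, one for each of the PIDs 23496, 19209, 19896 and 19027, which are the same processes listed in the "in use by" tree just before it.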

########### Node "TRSAP102": Starting package at Sat, Oct 16, 2010 04:54:21 PM ###########
Oct 16 04:54:21 PM - Node "TRSAP102": Activating volume group vg218 with exclusive option.
Activated volume group in Exclusive Mode.
Volume group "vg218" has been successfully changed.
Oct 16 04:54:21 PM - Node "TRSAP102": Checking filesystems:
/dev/vg218/lvoltrappGB
/dev/vg218/rlvoltrappGB:file system is clean - log replay is not required
Oct 16 04:54:21 PM - Node "TRSAP102": Mounting /dev/vg218/lvoltrappGB at /fintrappGB
Oct 16 04:54:21 PM - Node "TRSAP102": Adding IP address 10.59.116.196 to subnet 10.59.116.0

########### Node "TRSAP102": Package start completed at Sat, Oct 16, 2010 04:54:21 PM ###########

########### Node "TRSAP102": Starting package at Fri, Oct 22, 2010 04:40:44 PM ###########
Oct 22 04:40:44 PM - Node "TRSAP102": Activating volume group vg218 with exclusive option.
Activated volume group in Exclusive Mode.
Volume group "vg218" has been successfully changed.
Oct 22 04:40:44 PM - Node "TRSAP102": Checking filesystems:
/dev/vg218/lvoltrappGB
/dev/vg218/rlvoltrappGB:file system is clean - log replay is not required
Oct 22 04:40:44 PM - Node "TRSAP102": Mounting /dev/vg218/lvoltrappGB at /fintrappGB
Oct 22 04:40:44 PM - Node "TRSAP102": Adding IP address 10.59.116.196 to subnet 10.59.116.0

########### Node "TRSAP102": Package start completed at Fri, Oct 22, 2010 04:40:44 PM ###########
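
In the log above, the only failed halt (Oct 16) shows one fuser pass followed by one repeated umount, with no "trying again" message. That is consistent with umount_fs running with FS_UMOUNT_COUNT at its default of 1, although the package configuration itself is not shown here. The sketch below mirrors the retry loop of umount_fs under stated assumptions: try_cmd is a hypothetical stand-in for umount that fails until a chosen attempt number (FAIL_UNTIL), fuser is only simulated by a message, and the one-second sleep between retries is left out.

```shell
#!/bin/sh
# Model of the umount retry loop. try_cmd is a hypothetical stand-in for
# "umount": it fails until the attempt counter exceeds FAIL_UNTIL.
attempts=0
FAIL_UNTIL=${FAIL_UNTIL:-1}

try_cmd() {
    attempts=$((attempts + 1))
    [ "$attempts" -gt "$FAIL_UNTIL" ]
}

# $1 = retry count (plays the role of FS_UMOUNT_COUNT, minimum 1).
# One plain attempt first, then up to $1 further attempts, each one
# preceded by a simulated "fuser -ku".
umount_with_retries() {
    count=${1:-1}
    if [ "$count" -lt 1 ]; then
        count=1
    fi
    attempts=0
    if try_cmd; then ret=0; else ret=1; fi
    while [ "$ret" -ne 0 ] && [ "$count" -gt 0 ]; do
        echo "WARNING: killing users of the file system (simulated)"
        if try_cmd; then ret=0; else ret=1; fi
        if [ "$ret" -ne 0 ]; then
            count=$((count - 1))
            if [ "$count" -gt 0 ]; then
                echo "Unmount failed, trying again."
            fi
        fi
    done
    return "$ret"
}
```

With FAIL_UNTIL=2, `umount_with_retries 1` gives up after the single retry, while `umount_with_retries 3` succeeds on the second retry. In the real script the equivalent knob is FS_UMOUNT_COUNT, and whether extra retries help depends on how quickly the killed processes actually release the file system.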

###############################################################

cmviewconf

root @ TRSAP102 :/>cmviewconf

Cluster information:

cluster name: TRSAPCLU101
version: 0
flags: 12 (single cluster lock)
heartbeat interval: 1.00 (seconds)
node timeout: 20.00 (seconds)
heartbeat connection timeout: 0.00 (seconds)
io timeout extension: 0.00 (seconds)
auto start timeout: 600.00 (seconds)
network polling interval: 2.00 (seconds)
network failure detection: INOUT
first lock vg name: /dev/vg clust lock
second lock vg name: (not configured)
qs host: (not configured)

Cluster Node information:

Node ID 1:
Node name: TRSAP101
first lock pv name: /dev/disk/disk78
first lock disk interface type: fcd

Network ID 1:
ppa: 2
old ppa: 0
mac addr: 0x001e0bfc5f4e
hardware path: 0/2/2/0
network interface name: lan2

IPv4 Information:
subnet: 10.59.118.0
subnet mask: 255.255.255.0
ip address: 10.59.118.32

route id: 1

IPv6 Information:

flags: 4 (Non-Heartbeat Network)
bridged net ID: 1

Network ID 3:
ppa: 4
old ppa: 0
mac addr: 0x001b7878a8a8
hardware path: 0/4/0/0/0/0/2/0/0/0
network interface name: lan4

IPv4 Information:
subnet: 192.168.2.16
subnet mask: 255.255.255.248
ip address: 192.168.2.17

route id: 2

IPv6 Information:

flags: 5 (Heartbeat Network)
bridged net ID: 2

Network ID 5:
ppa: 0
old ppa: 0
mac addr: 0x001e0bfc5f50
hardware path: 0/1/1/0
network interface name: lan0

IPv4 Information:
subnet: 10.59.116.0
subnet mask: 255.255.255.0
ip address: 10.59.116.32

route id: 3

IPv6 Information:

flags: 4 (Non-Heartbeat Network)
bridged net ID: 3

Network ID 2:
ppa: 1
old ppa: 0
mac addr: 0x001e0bfc5f51
hardware path: 0/1/1/1
network interface name: lan1

IPv4 Information:
subnet: 0.0.0.0
subnet mask: 0.0.0.0
ip address: 0.0.0.0

route id: 0

IPv6 Information:

flags: 2 (Non-Heartbeat Network)
bridged net ID: 1

Network ID 4:
ppa: 5
old ppa: 0
mac addr: 0x001b7878a8a9
hardware path: 0/4/0/0/0/0/2/0/0/1
network interface name: lan5

IPv4 Information:
subnet: 0.0.0.0
subnet mask: 0.0.0.0
ip address: 0.0.0.0

route id: 0

IPv6 Information:

flags: 2 (Non-Heartbeat Network)
bridged net ID: 2

Network ID 6:
ppa: 3
old ppa: 0
mac addr: 0x001e0bfc5f4f
hardware path: 0/2/2/1
network interface name: lan3

IPv4 Information:
subnet: 0.0.0.0
subnet mask: 0.0.0.0
ip address: 0.0.0.0

route id: 0

IPv6 Information:

flags: 2 (Non-Heartbeat Network)
bridged net ID: 3

Node ID 2:
Node name: TRSAP103
first lock pv name: /dev/disk/disk79
first lock disk interface type: fcd

Network ID 1:
ppa: 2
old ppa: 0
mac addr: 0x001e0bfc5f0e
hardware path: 0/2/2/0
network interface name: lan2

IPv4 Information:
subnet: 10.59.118.0
subnet mask: 255.255.255.0
ip address: 10.59.118.34

route id: 1

IPv6 Information:

flags: 4 (Non-Heartbeat Network)
bridged net ID: 1

Network ID 3:
ppa: 4
old ppa: 0
mac addr: 0x001b7878acfc
hardware path: 0/4/0/0/0/0/2/0/0/0
network interface name: lan4

IPv4 Information:
subnet: 192.168.2.16
subnet mask: 255.255.255.248
ip address: 192.168.2.19

route id: 2

IPv6 Information:

flags: 5 (Heartbeat Network)
bridged net ID: 2

Network ID 5:
ppa: 0
old ppa: 0
mac addr: 0x001e0bfc5f10
hardware path: 0/1/1/0
network interface name: lan0

IPv4 Information:
subnet: 10.59.116.0
subnet mask: 255.255.255.0
ip address: 10.59.116.34

route id: 3

IPv6 Information:

flags: 4 (Non-Heartbeat Network)
bridged net ID: 3

Network ID 2:
ppa: 1
old ppa: 0
mac addr: 0x001e0bfc5f11
hardware path: 0/1/1/1
network interface name: lan1

IPv4 Information:
subnet: 0.0.0.0
subnet mask: 0.0.0.0
ip address: 0.0.0.0

route id: 0

IPv6 Information:

flags: 2 (Non-Heartbeat Network)
bridged net ID: 1

Network ID 4:
ppa: 5
old ppa: 0
mac addr: 0x001b7878acfd
hardware path: 0/4/0/0/0/0/2/0/0/1
network interface name: lan5

IPv4 Information:
subnet: 0.0.0.0
subnet mask: 0.0.0.0
ip address: 0.0.0.0

route id: 0

IPv6 Information:

flags: 2 (Non-Heartbeat Network)
bridged net ID: 2

Network ID 6:
ppa: 3
old ppa: 0
mac addr: 0x001e0bfc5f0f
hardware path: 0/2/2/1
network interface name: lan3

IPv4 Information:
subnet: 0.0.0.0
subnet mask: 0.0.0.0
ip address: 0.0.0.0

route id: 0

IPv6 Information:

flags: 2 (Non-Heartbeat Network)
bridged net ID: 3

Node ID 3:
Node name: TRSAP102
first lock pv name: /dev/disk/disk71
first lock disk interface type: fcd

Network ID 1:
ppa: 2
old ppa: 0
mac addr: 0x001e0bfc5f46
hardware path: 0/2/2/0
network interface name: lan2

IPv4 Information:
subnet: 10.59.118.0
subnet mask: 255.255.255.0
ip address: 10.59.118.33

route id: 1

IPv6 Information:

flags: 4 (Non-Heartbeat Network)
bridged net ID: 1

Network ID 3:
ppa: 4
old ppa: 0
mac addr: 0x001b7878ae58
hardware path: 0/4/0/0/0/0/2/0/0/0
network interface name: lan4

IPv4 Information:
subnet: 192.168.2.16
subnet mask: 255.255.255.248
ip address: 192.168.2.18

route id: 2

IPv6 Information:

flags: 5 (Heartbeat Network)
bridged net ID: 2

Network ID 5:
ppa: 0
old ppa: 0
mac addr: 0x001e0bfc5f48
hardware path: 0/1/1/0
network interface name: lan0

IPv4 Information:
subnet: 10.59.116.0
subnet mask: 255.255.255.0
ip address: 10.59.116.33

route id: 3

IPv6 Information:

flags: 4 (Non-Heartbeat Network)
bridged net ID: 3

Network ID 2:
ppa: 1
old ppa: 0
mac addr: 0x001e0bfc5f49
hardware path: 0/1/1/1
network interface name: lan1

IPv4 Information:
subnet: 0.0.0.0
subnet mask: 0.0.0.0
ip address: 0.0.0.0

route id: 0

IPv6 Information:

flags: 2 (Non-Heartbeat Network)
bridged net ID: 1

Network ID 4:
ppa: 5
old ppa: 0
mac addr: 0x001b7878ae59
hardware path: 0/4/0/0/0/0/2/0/0/1
network interface name: lan5

IPv4 Information:
subnet: 0.0.0.0
subnet mask: 0.0.0.0
ip address: 0.0.0.0

route id: 0

IPv6 Information:

flags: 2 (Non-Heartbeat Network)
bridged net ID: 2

Network ID 6:
ppa: 3
old ppa: 0
mac addr: 0x001e0bfc5f47
hardware path: 0/2/2/1
network interface name: lan3

IPv4 Information:
subnet: 0.0.0.0
subnet mask: 0.0.0.0
ip address: 0.0.0.0

route id: 0

IPv6 Information:

flags: 2 (Non-Heartbeat Network)
bridged net ID: 3

Cluster Access Policy Information: (Not Defined)

Package information:

maximum configured packages: 30

package ID 64769:
package name: TRASIA
package global flags: 5
(Package Switch Enabled)
(Package Local Switch Enabled)
(Configured Node Failover)
(Manual Failback)
package priority: (No Priority)
package run script: /etc/cmcluster/TRASIA/TRASIA.cntl
package run timeout: (No Timeout)
package halt script: /etc/cmcluster/TRASIA/TRASIA.cntl
package halt timeout: (No Timeout)
package successor halt timeout: (No Timeout)
package primary node: TRSAP101
package alternate node: TRSAP103

package services: (Not Defined)

package dependencies: (Not Defined)

package access policies: (Not Defined)

package ID 50946:
package name: TRMU
package global flags: 5
(Package Switch Enabled)
(Package Local Switch Enabled)
(Configured Node Failover)
(Manual Failback)
package priority: (No Priority)
package run script: /etc/cmcluster/TRMU/TRMU.cntl
package run timeout: (No Timeout)
package halt script: /etc/cmcluster/TRMU/TRMU.cntl
package halt timeout: (No Timeout)
package successor halt timeout: (No Timeout)
package primary node: TRSAP101
package alternate node: TRSAP103

package services: (Not Defined)

package dependencies: (Not Defined)

package access policies: (Not Defined)

package ID 18947:
package name: TRUS
package global flags: 5
(Package Switch Enabled)
(Package Local Switch Enabled)
(Configured Node Failover)
(Manual Failback)
package priority: (No Priority)
package run script: /etc/cmcluster/TRUS/TRUS.cntl
package run timeout: (No Timeout)
package halt script: /etc/cmcluster/TRUS/TRUS.cntl
package halt timeout: (No Timeout)
package successor halt timeout: (No Timeout)
package primary node: TRSAP103
package alternate node: TRSAP101

package services: (Not Defined)

package dependencies: (Not Defined)

package access policies: (Not Defined)

package ID 40964:
package name: TRDE
package global flags: 5
(Package Switch Enabled)
(Package Local Switch Enabled)
(Configured Node Failover)
(Manual Failback)
package priority: (No Priority)
package run script: /etc/cmcluster/TRDE/TRDE.cntl
package run timeout: (No Timeout)
package halt script: /etc/cmcluster/TRDE/TRDE.cntl
package halt timeout: (No Timeout)
package successor halt timeout: (No Timeout)
package primary node: TRSAP102
package alternate node: TRSAP101

package services: (Not Defined)

package dependencies: (Not Defined)

package access policies: (Not Defined)

package ID 3589:
package name: TREUR
package global flags: 5
(Package Switch Enabled)
(Package Local Switch Enabled)
(Configured Node Failover)
(Manual Failback)
package priority: (No Priority)
package run script: /etc/cmcluster/TREUR/TREUR.cntl
package run timeout: (No Timeout)
package halt script: /etc/cmcluster/TREUR/TREUR.cntl
package halt timeout: (No Timeout)
package successor halt timeout: (No Timeout)
package primary node: TRSAP102
package alternate node: TRSAP101

package services: (Not Defined)

package dependencies: (Not Defined)

package access policies: (Not Defined)

package ID 40711:
package name: TRNP
package global flags: 5
(Package Switch Enabled)
(Package Local Switch Enabled)
(Configured Node Failover)
(Manual Failback)
package priority: (No Priority)
package run script: /etc/cmcluster/TRNP/TRNP.cntl
package run timeout: (No Timeout)
package halt script: /etc/cmcluster/TRNP/TRNP.cntl
package halt timeout: (No Timeout)
package successor halt timeout: (No Timeout)
package primary node: TRSAP101
package alternate node: TRSAP102

package services: (Not Defined)

package dependencies: (Not Defined)

package access policies: (Not Defined)

package ID 38152:
package name: TRAFRO
package global flags: 5
(Package Switch Enabled)
(Package Local Switch Enabled)
(Configured Node Failover)
(Manual Failback)
package priority: (No Priority)
package run script: /etc/cmcluster/TRAFRO/TRAFRO.cntl
package run timeout: (No Timeout)
package halt script: /etc/cmcluster/TRAFRO/TRAFRO.cntl
package halt timeout: (No Timeout)
package successor halt timeout: (No Timeout)
package primary node: TRSAP102
package alternate node: TRSAP103

package services: (Not Defined)

package dependencies: (Not Defined)

package access policies: (Not Defined)

package ID 4873:
package name: TRGB
package global flags: 5
(Package Switch Enabled)
(Package Local Switch Enabled)
(Configured Node Failover)
(Manual Failback)
package priority: (No Priority)
package run script: /etc/cmcluster/TRGB/TRGB.cntl
package run timeout: (No Timeout)
package halt script: /etc/cmcluster/TRGB/TRGB.cntl
package halt timeout: (No Timeout)
package successor halt timeout: (No Timeout)
package primary node: TRSAP102
package alternate node: TRSAP103

package services: (Not Defined)

package dependencies: (Not Defined)

package access policies: (Not Defined)

package ID 26122:
package name: TRUSCA
package global flags: 5
(Package Switch Enabled)
(Package Local Switch Enabled)
(Configured Node Failover)
(Manual Failback)
package priority: (No Priority)
package run script: /etc/cmcluster/TRUSCA/TRUSCA.cntl
package run timeout: (No Timeout)
package halt script: /etc/cmcluster/TRUSCA/TRUSCA.cntl
package halt timeout: (No Timeout)
package successor halt timeout: (No Timeout)
package primary node: TRSAP103
package alternate node: TRSAP102

package services: (Not Defined)

package dependencies: (Not Defined)

package access policies: (Not Defined)

package ID 4107:
package name: TRIN
package global flags: 5
(Package Switch Enabled)
(Package Local Switch Enabled)
(Configured Node Failover)
(Manual Failback)
package priority: (No Priority)
package run script: /etc/cmcluster/TRIN/TRIN.cntl
package run timeout: (No Timeout)
package halt script: /etc/cmcluster/TRIN/TRIN.cntl
package halt timeout: (No Timeout)
package successor halt timeout: (No Timeout)
package primary node: TRSAP101
package alternate node: TRSAP103

package services: (Not Defined)

package dependencies: (Not Defined)

package access policies: (Not Defined)

package ID 63500:
package name: TRCA
package global flags: 5
(Package Switch Enabled)
(Package Local Switch Enabled)
(Configured Node Failover)
(Manual Failback)
package priority: (No Priority)
package run script: /etc/cmcluster/TRCA/TRCA.cntl
package run timeout: (No Timeout)
package halt script: /etc/cmcluster/TRCA/TRCA.cntl
package halt timeout: (No Timeout)
package successor halt timeout: (No Timeout)
package primary node: TRSAP103
package alternate node: TRSAP101

package services: (Not Defined)

package dependencies: (Not Defined)

package access policies: (Not Defined)

regards

MC

Viktor Balogh · ‎10-31-2010

Hi madhuchakkaravarthy,

Please use the "attach" button: these long posts mess up the page layout, and the output is hard to read because of the limited line length.

Regards,
Viktor

****
Unix operates with beer.

Matti_Kurkela · ‎10-31-2010

There is no special "checklist" for unmounting the filesystems. When halting the package, Serviceguard first runs the configured package halt commands, then tries to unmount all the filesystems configured in the package control file. If one of those filesystems is still being accessed by some process, then the unmount attempt will fail for that filesystem. A process doesn't have to belong to the package to cause this problem: it might be a shell session that has been cd'd to the package filesystem and left to idle (or not logged out properly).

If the control file has FS_UMOUNT_COUNT set to 1 (the default value), the failure to unmount will be fatal: as the filesystem cannot be unmounted while there are processes using it, the package halt operation cannot be completed. If this halt operation is part of a package failover, then the failover will not be successful.

If FS_UMOUNT_COUNT is set to a value greater than 1 and the first unmount attempt fails, Serviceguard will use "fuser -k" to kill the remaining processes using the package filesystem, wait a few seconds for the processes to die, and then try to unmount the filesystem again. If FS_UMOUNT_COUNT is set to a value greater than 2, this cycle can be repeated up to the specified number of times.
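To make the retry cycle described above concrete, here is a minimal shell sketch. It is not the actual Serviceguard control script code. The names umount_with_retry, try_umount, kill_users and RETRY_DELAY are made up for illustration, and the exact fuser options used by the real script are an assumption.

```shell
# Sketch only: mimics the retry cycle described in this post.
# try_umount and kill_users are hypothetical wrappers, so the loop
# can be exercised without touching real filesystems.
try_umount() { umount "$1"; }
# -c treats the argument as a mount point, -k kills the processes using it
# (assumed options; check fuser(1M) on your system).
kill_users() { fuser -ck "$1"; }

# umount_with_retry MOUNT_POINT COUNT
# COUNT plays the role of FS_UMOUNT_COUNT.
# Returns 0 as soon as one unmount attempt succeeds, 1 if all attempts fail.
umount_with_retry() {
    mp=$1
    count=$2
    attempt=1
    while [ "$attempt" -le "$count" ]; do
        if try_umount "$mp"; then
            return 0
        fi
        # Kill the users of the filesystem only if another attempt remains.
        # With COUNT=1 nothing is killed and the failure is final.
        if [ "$attempt" -lt "$count" ]; then
            kill_users "$mp"
            # Give the killed processes a few seconds to die.
            sleep "${RETRY_DELAY:-3}"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}
```

With a count of 1 the function makes a single attempt and gives up, which matches the fatal failure described above. A call such as "umount_with_retry /fintrappGB 3" would allow up to three attempts, killing the processes that use the filesystem between attempts.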

In my experience, it has been very useful to set FS_UMOUNT_COUNT to 3. But if you do that, you should make *very* sure all the application processes that might be harmed if they're stopped with kill -9 are stopped in a controlled fashion by your customer_defined_halt_cmds configuration.

In your package control file, you have no services defined, and all the application start/stop commands in customer_defined_run_cmds and customer_defined_halt_cmds are commented out??? Why? Combined with the FS_UMOUNT_COUNT=1 setting, this means your package can *NEVER* make a successful automatic failover if the application is running when the failover is triggered.

MK


madhuchakkaravarthy · ‎10-31-2010

hi

> In your package control file, you have no services defined, and all the application start/stop commands in customer_defined_run_cmds and customer_defined_halt_cmds are commented out??? Why? Combined with FS_UMOUNT_COUNT=1 setting, this means your package can *NEVER* make a successful automatic failover if the application is running when the failover is triggered

Since this is the DR site, the application is down here.

Thanks for your kind support.

regards

MC
