Operating System - HP-UX

Serviceguard and file system monitoring

 
Anbalaga
New Member


Hi

I have a Serviceguard cluster on HP-UX 11.31. The file system resource does not fail over to the second node when I disable all paths to the volume group on the primary node. File system commands just hang, and Serviceguard does nothing. This is the first time I have set up Serviceguard.
Suraj K Sankari
Honored Contributor

Re: Serviceguard and file system monitoring

Hi,

What are the error messages in syslog.log, the cluster log, and the package log?
Please post them.

Suraj
Anbalaga
New Member

Re: Serviceguard and file system monitoring

Here are the logs. I have pasted the syslog and the pkg1 log. Where do I find the generic cluster log?


pkg1 Log
=========
/dev/vgapp/rlvol1:file system is clean - log replay is not required
May 8 13:08:13 root@vrishabh filesystem.sh[10230]: Mounting /dev/vgapp/lvol1 at /app
May 8 13:08:13 root@vrishabh package_ip.sh[10260]: Adding IP address 172.21.164.245 to subnet 172.21.160.0
May 8 13:08:13 root@vrishabh master_control_script.sh[10180]: ###### Package start completed for pkg1 ######
#

Syslog
=======

May 8 16:37:52 vrishabh vmunix: condition. The transient time threshold is 120 seconds.
May 8 16:37:52 vrishabh vmunix: 2 lunpaths are currently in a failed state.
May 8 16:37:52 vrishabh vmunix:
May 8 16:37:52 vrishabh vmunix: class : disk, instance 250
May 8 16:37:52 vrishabh vmunix: All available lunpaths of a LUN (dev=0x1f1d1600)
May 8 16:37:52 vrishabh vmunix: have gone offline. The LUN has entered a transient
May 8 16:37:52 vrishabh vmunix: condition. The transient time threshold is 120 seconds.
May 8 16:37:52 vrishabh vmunix: 2 lunpaths are currently in a failed state.
May 8 16:37:52 vrishabh vmunix:
May 8 16:37:52 vrishabh vmunix: class : disk, instance 251
May 8 16:37:52 vrishabh vmunix: All available lunpaths of a LUN (dev=0x1f1d1700)
May 8 16:37:52 vrishabh vmunix: have gone offline. The LUN has entered a transient
May 8 16:37:52 vrishabh vmunix: condition. The transient time threshold is 120 seconds.
May 8 16:37:52 vrishabh vmunix: 2 lunpaths are currently in a failed state.
May 8 16:37:52 vrishabh vmunix:

# cmviewcl -v

CLUSTER STATUS
ANBU up

NODE STATUS STATE
mesha up running

Cluster_Lock_LVM:
VOLUME_GROUP PHYSICAL_VOLUME STATUS
/dev/vglock /dev/dsk/c92t0d0 up

Network_Parameters:
INTERFACE STATUS PATH NAME
PRIMARY up 0/1/1/0 lan0
PRIMARY up 0/1/1/1 lan1

NODE STATUS STATE
vrishabh up running

Cluster_Lock_LVM:
VOLUME_GROUP PHYSICAL_VOLUME STATUS
/dev/vglock /dev/disk/disk192 down

Network_Parameters:
INTERFACE STATUS PATH NAME
PRIMARY up 0/1/1/0 lan0
PRIMARY up 0/1/1/1 lan1

PACKAGE STATUS STATE AUTO_RUN NODE
pkg1 up running enabled vrishabh

Policy_Parameters:
POLICY_NAME CONFIGURED_VALUE
Failover configured_node
Failback manual

Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary up enabled mesha
Alternate up enabled vrishabh (current)
#
Stephen Doud
Honored Contributor

Re: Serviceguard and file system monitoring

Serviceguard does not innately detect a file system disconnect, i.e., by default it does not monitor mount points or disks.
If you wish to create such a monitor, refer to the Managing Serviceguard manual, Edition 16 (http://docs.hp.com/en/B3936-90140/B3936-90140.pdf). The bottom of page 46 discusses this under "Monitoring LVM Disks Through Event Monitoring Service" and "Monitoring VxVM and CVM Disks".

LVM-based file systems require the installation of the EMS HA Monitors product and configuration of a monitor. (Do 'resls /dev//pv_summary')

However, be advised that such a monitor will require a forced core dump and system reboot upon detection of a disconnected file system, because HP-UX cannot unmount a file system that is not reachable. This reboot will, of course, cause the package to fail over, which brings the file systems up on the other node (assuming the redundant disk connections are intact on that node).
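
To illustrate the kind of check a custom monitor might perform, here is a rough Python sketch. It is not the EMS monitor and not part of Serviceguard. The function name `path_responds` and the timeout value are made up for illustration. The idea is to probe the mount point from a child process, so that a hung file system blocks only the child and not the monitor itself.

```python
import subprocess


def path_responds(path, timeout_s=10.0):
    """Return True if `ls -d path` succeeds within timeout_s seconds.

    Hypothetical helper for illustration only; a real monitor may need to
    read or write a file on the file system rather than just list the
    mount point, since cached metadata can mask a lost disk.
    """
    # Run the probe in a child process: if the file system hangs, only the
    # child blocks, and the caller stays responsive.
    proc = subprocess.Popen(
        ["ls", "-d", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        return proc.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        # Signal the child but do not wait for it: a process blocked on
        # unreachable storage may not exit until the I/O completes.
        proc.kill()
        return False
```

For the package in this thread, one would call something like `path_responds("/app")` periodically. What to do when it returns False (for example, halting the node so the package fails over) is a separate decision and is not shown here.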

Anbalaga
New Member

Re: Serviceguard and file system monitoring

Thanks for the info. I will look into event monitoring of disk details.