03-17-2008 01:27 AM
One server would not come up when other is down in a cluster
03-17-2008 01:35 AM
Re: One server would not come up when other is down in a cluster
Point being - not enough information...
Is the cluster actually up on server A?
What does 'cmviewcl -v' show you on serverA?
HTH
Duncan
I am an HPE Employee

03-17-2008 01:43 AM
Re: One server would not come up when other is down in a cluster
1. cmviewcl -v
2. cfscluster status
3. more /etc/exports
4. syslog
03-17-2008 02:07 AM
Re: One server would not come up when other is down in a cluster
The command hangs
2. cfscluster status
As cmviewcl isn't showing any output...
3.bash-2.05b# more /etc/exports
/software_mount -root=lcsnew3:nfsPkg:lcsnew1
/data_mount -root=lcsnew3:nfsPkg:lcsnew1
/opt/gmlc/logs -root=lcsnew3,root=lcsnew3
4. This is the tail of syslog:
Mar 17 11:52:24 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 13:51:59 lcsnew3 syslog: Oracle Cluster Ready Services waiting for HP-UX Service Guard to start.
Mar 17 11:52:25 lcsnew3 above message repeats 11 times
Mar 17 11:52:24 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 11:52:25 lcsnew3 above message repeats 4 times
Mar 17 11:52:25 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 11:52:25 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 13:52:59 lcsnew3 syslog: Oracle Cluster Ready Services waiting for HP-UX Service Guard to start.
Mar 17 13:55:59 lcsnew3 syslog: Oracle Cluster Ready Services waiting for HP-UX Service Guard to start.
Mar 17 11:56:28 lcsnew3 above message repeats 11 times
Mar 17 11:56:28 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 11:56:28 lcsnew3 above message repeats 24 times
Mar 17 11:56:28 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 13:56:59 lcsnew3 syslog: Oracle Cluster Ready Services waiting for HP-UX Service Guard to start.
Mar 17 14:00:00 lcsnew3 syslog: Oracle Cluster Ready Services waiting for HP-UX Service Guard to start.
Mar 17 12:00:30 lcsnew3 above message repeats 11 times
Mar 17 12:00:30 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 12:00:30 lcsnew3 above message repeats 26 times
Mar 17 12:00:30 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 12:00:37 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 12:00:40 lcsnew3 above message repeats 13 times
Mar 17 12:00:41 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 14:00:59 lcsnew3 syslog: Oracle Cluster Ready Services waiting for HP-UX Service Guard to start.
Mar 17 14:04:00 lcsnew3 syslog: Oracle Cluster Ready Services waiting for HP-UX Service Guard to start.
Mar 17 12:04:31 lcsnew3 above message repeats 11 times
Mar 17 12:04:31 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 12:04:31 lcsnew3 above message repeats 9 times
Mar 17 12:04:35 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 14:04:59 lcsnew3 syslog: Oracle Cluster Ready Services waiting for HP-UX Service Guard to start.
Mar 17 14:05:00 lcsnew3 syslog: Oracle Cluster Ready Services waiting for HP-UX Service Guard to start.
Mar 17 12:05:00 lcsnew3 above message repeats 2 times
Mar 17 12:05:00 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 12:05:00 lcsnew3 above message repeats 4 times
Mar 17 14:05:59 lcsnew3 syslog: Oracle Cluster Ready Services waiting for HP-UX Service Guard to start.
03-17-2008 02:21 AM
Re: One server would not come up when other is down in a cluster
#ps -ef | grep -i cmcld
if it is not running,
execute #cmrunnode -n
and then #cmviewcl -v and syslog again
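The first check above can be sketched like this (a sketch only; the bracketed character in the grep pattern stops grep from matching its own process):

```shell
#!/bin/sh
# Check whether the Serviceguard cluster daemon (cmcld) is running.
# grep '[c]mcld' matches "cmcld" but never the grep command itself.
if ps -ef | grep '[c]mcld' >/dev/null 2>&1; then
    echo "cmcld is running"
else
    echo "cmcld is not running"
fi
```

If cmcld is not running and a cluster is already active on another node, cmrunnode will join it; it cannot form a new cluster on its own.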
03-17-2008 02:50 AM
Re: One server would not come up when other is down in a cluster
That is normal and expected behaviour of Serviceguard. When an SG node comes up, it expects SG cluster services to already be active on at least one node; if they are not, the node waits for AUTO_START_TIMEOUT (defined in the cluster ASCII configuration file) and then fails to start cluster services.
You have to run
cmruncl -v -f -n
to start the SG cluster.
Once the cluster is running, the other nodes will join the already active cluster when they reboot.
One more note: this behaviour occurs because the SG rc script contains cmrunnode. From the man page of cmrunnode:
DESCRIPTION
cmrunnode causes a node to start its cluster daemon to join the
existing cluster. This command verifies the network configuration
before causing the node to start its cluster daemon.
So cmrunnode needs an existing cluster to work.
I hope that helps you, let me know if something is not clear.
Best regards,
Fabio
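The recovery described above can be sketched as follows. This is a sketch, not the exact procedure: the node name lcsnew3 is taken from this thread, and cmruncl -f bypasses the check that all cluster nodes are reachable, so it is only safe when the other node is verified down.

```shell
#!/bin/sh
# Sketch: start the cluster from a single surviving node.
NODE=lcsnew3   # node name from this thread; substitute your own

if ps -ef | grep '[c]mcld' >/dev/null 2>&1; then
    echo "cluster daemon already running; nothing to do"
elif command -v cmruncl >/dev/null 2>&1; then
    # -f forces startup without the other node being up; verify that
    # node really is down first, or you risk a split-brain cluster.
    cmruncl -v -f -n "$NODE"
else
    echo "Serviceguard commands not found on this host"
fi
```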
03-17-2008 03:02 AM
Re: One server would not come up when other is down in a cluster
The command brought the cluster up, but the nfsPkg package is still down, and a volume group is also down.
03-17-2008 03:23 AM
Re: One server would not come up when other is down in a cluster
So the response to the original request of this thread was good; I'm glad about that.
Now, if you need more help from us, you should elaborate a little more on the problem that "one of the nfsPkg packages is still down and a volume group is also down."
Some questions to answer:
- Could you please describe the configuration in more detail?
- How many NFS filesystems do you expect the NFS package to start?
- Do all of them currently have problems starting, or just one?
- Does the volume group that is down have any correlation with the NFS package, or does it belong to another package and so is a separate matter?
- I suppose you have errors in syslog.log and the package log files when the NFS package is coming up; which errors?
Thanks for providing more info about your problem.
Best regards,
Fabio
03-17-2008 03:31 AM
Re: One server would not come up when other is down in a cluster
I am posting the output of some commands; I hope it gives you the information you need.
bash-2.05b# vgdisplay
--- Volume groups ---
VG Name /dev/vg00
VG Write Access read/write
VG Status available
Max LV 255
Cur LV 8
Open LV 8
Max PV 16
Cur PV 1
Act PV 1
Max PE per PV 4356
VGDA 2
PE Size (Mbytes) 32
Total PE 4346
Alloc PE 2026
Free PE 2320
Total PVG 0
Total Spare PVs 0
Total Spare PVs in use 0
VG Name /dev/vg_rac
VG Write Access read/write
VG Status available, shared, server
Max LV 255
Cur LV 28
Open LV 28
Max PV 16
Cur PV 2
Act PV 2
Max PE per PV 18728
VGDA 4
PE Size (Mbytes) 4
Total PE 18749
Alloc PE 18156
Free PE 593
Total PVG 0
Total Spare PVs 0
Total Spare PVs in use 0
vgdisplay: Volume group not activated.
vgdisplay: Cannot display volume group "/dev/vg01".
bash-2.05b#
bash-2.05b#
bash-2.05b# cmviewcl
CLUSTER STATUS
gmlcCluster up
NODE STATUS STATE
lcsnew3 up running
PACKAGE STATUS STATE AUTO_RUN NODE
vg_activate_pkg up running enabled lcsnew3
NODE STATUS STATE
lcsnew1 down unknown
UNOWNED_PACKAGES
PACKAGE STATUS STATE AUTO_RUN NODE
nfsPkg down halted disabled unowned
vg_activate_pkg_remote down halted enabled unowned
bash-2.05b#
Tail of syslog:
Mar 17 13:01:13 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 13:01:13 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 13:01:13 lcsnew3 above message repeats 2 times
Mar 17 13:01:13 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 13:05:31 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 13:05:31 lcsnew3 above message repeats 23 times
Mar 17 13:05:35 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 13:06:00 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 13:06:00 lcsnew3 above message repeats 8 times
Mar 17 13:06:00 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 13:09:38 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 13:09:38 lcsnew3 above message repeats 3 times
Mar 17 13:09:41 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 13:10:00 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 13:10:00 lcsnew3 above message repeats 8 times
Mar 17 13:10:04 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 13:12:20 lcsnew3 CM-CMD[25886]: cmrunpkg nfsPkg
Mar 17 13:12:33 lcsnew3 CM-CMD[25886]: Request from root on node lcsnew3 to start package nfsPkg
Mar 17 13:10:08 lcsnew3 automountd[1135]: server nfsPkg not responding
Mar 17 13:12:33 lcsnew3 above message repeats 7 times
Mar 17 13:12:33 lcsnew3 cmcld[27150]: Request from root on node lcsnew3 to start package nfsPkg
Mar 17 13:12:33 lcsnew3 cmcld[27150]: Request from node lcsnew3 to start package nfsPkg on node lcsnew3.
Mar 17 13:12:33 lcsnew3 cmcld[27150]: Executing '/etc/cmcluster/nfsPkg/nfsPkg.cntl start' for package nfsPkg, as service PKG*6914.
Mar 17 13:12:33 lcsnew3 LVM[26239]: vgchange -a e vg01
Mar 17 13:12:33 lcsnew3 LVM[26298]: vgchange -a n vg01
Mar 17 13:12:33 lcsnew3 cmcld[27150]: Service PKG*6914 terminated due to an exit(1).
Mar 17 13:12:33 lcsnew3 cmcld[27150]: Package nfsPkg run script exited with NO_RESTART.
Mar 17 13:12:33 lcsnew3 cmcld[27150]: Examine the file /etc/cmcluster/nfsPkg/nfsPkg.cntl.log for more details.
Mar 17 13:12:33 lcsnew3 cmcld[27150]: Switching disabled on package nfsPkg.
Mar 17 13:12:33 lcsnew3 cmcld[27150]: Unable to start package nfsPkg. Node lcsnew3 is not able to run it.
Mar 17 13:12:33 lcsnew3 CM-CMD[25886]: Request from root on node lcsnew3 to start package nfsPkg failed
03-17-2008 03:56 AM
Re: One server would not come up when other is down in a cluster
What you posted answers one of my questions. Now we know vg01 is related to the NFS package, so you have the first point: vg01 is not active, and that is normal and expected when the NFS package cannot start. So the first and most important question is:
why can't the NFS package start?
We cannot know that yet, but syslog points in a good direction for continuing the investigation:
Mar 17 13:12:33 lcsnew3 cmcld[27150]: Examine the file /etc/cmcluster/nfsPkg/nfsPkg.cntl.log for more details.
What is in the package log file?
Best regards,
Fabio
03-17-2008 04:23 AM
Re: One server would not come up when other is down in a cluster
vg01 can be brought up with the command vgchange -a e /dev/vg01, but when trying to run the package with "cmrunpkg nfsPkg", it makes vg01 go down again. The log in the cntl.log file is:
########### Node "lcsnew3": Package start failed at Mon Mar 17 12:56:50 MST 2008 ###########
########### Node "lcsnew3": Starting package at Mon Mar 17 13:12:33 MST 2008 ###########
Mar 17 13:12:33 - Node "lcsnew3": Activating volume group vg01 with exclusive option.
Activated volume group in Exclusive Mode.
Volume group "vg01" has been successfully changed.
Mar 17 13:12:33 - Node "lcsnew3": Checking filesystems:
/dev/vg01/softwarelvol
/dev/vg01/datalvol
/dev/vg01/rsoftwarelvol:file system is clean - log replay is not required
/dev/vg01/rdatalvol:file system is clean - log replay is not required
Mar 17 13:12:33 - Node "lcsnew3": Mounting /dev/vg01/softwarelvol at /software_mount
vxfs mount: /dev/vg01/softwarelvol is already mounted, /software_mount is busy,
allowable number of mount points exceeded
ERROR: Function check_and_mount
ERROR: Failed to mount /dev/vg01/softwarelvol
Mar 17 13:12:33 - Node "lcsnew3": Deactivating volume group vg01
Deactivated volume group in Exclusive Mode.
Volume group "vg01" has been successfully changed.
########### Node "lcsnew3": Package start failed at Mon Mar 17 13:12:33 MST 2008 ###########
########### Node "lcsnew3": Starting package at Mon Mar 17 14:12:58 MST 2008 ###########
Mar 17 14:12:58 - Node "lcsnew3": Activating volume group vg01 with exclusive option.
Volume group "vg01" has been successfully changed.
Mar 17 14:12:58 - Node "lcsnew3": Checking filesystems:
/dev/vg01/softwarelvol
/dev/vg01/datalvol
/dev/vg01/rsoftwarelvol:file system is clean - log replay is not required
/dev/vg01/rdatalvol:file system is clean - log replay is not required
Mar 17 14:12:58 - Node "lcsnew3": Mounting /dev/vg01/softwarelvol at /software_mount
Mar 17 14:12:59 - Node "lcsnew3": Mounting /dev/vg01/datalvol at /data_mount
vxfs mount: /dev/vg01/datalvol is already mounted, /data_mount is busy,
allowable number of mount points exceeded
ERROR: Function check_and_mount
ERROR: Failed to mount /dev/vg01/datalvol
Mar 17 14:12:59 - Node "lcsnew3": Unmounting filesystem on /dev/vg01/softwarelvol
Mar 17 14:12:59 - Node "lcsnew3": Deactivating volume group vg01
Deactivated volume group in Exclusive Mode.
Volume group "vg01" has been successfully changed.
########### Node "lcsnew3": Package start failed at Mon Mar 17 14:12:59 MST 2008 ###########
bash-2.05b#
03-17-2008 04:23 AM
Re: One server would not come up when other is down in a cluster
Siju
03-17-2008 04:44 AM
Re: One server would not come up when other is down in a cluster
#umount -f /data_mount
and also do a forceful unmount of all the mount points belonging to vg01.
#rm /etc/mnttab
#mount -a #to recreate /etc/mnttab
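A dry-run sketch of the steps above. The mount points are the ones from this thread; removing /etc/mnttab is destructive, so this version only echoes each command - drop the echo to execute it on the affected HP-UX node.

```shell
#!/bin/sh
# Dry run of the forced-unmount recovery; prints the commands only.
for fs in /data_mount /software_mount; do
    echo umount -f "$fs"    # force-unmount each vg01 mount point
done
echo rm /etc/mnttab         # a stale mount table makes mounts look busy
echo mount -a               # on HP-UX this recreates /etc/mnttab
```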
03-17-2008 06:04 AM
Re: One server would not come up when other is down in a cluster
So you have one more piece of information:
/dev/vg01/softwarelvol
/dev/vg01/datalvol
/dev/vg01/rsoftwarelvol:file system is clean - log replay is not required
/dev/vg01/rdatalvol:file system is clean - log replay is not required
These are the filesystems to be mounted when the package starts, but for some reason a couple of them are already mounted.
What to do now:
- umount all filesystems of vg01 (to find the open files and processes on those filesystems, use fuser and lsof - download lsof from <>);
- check with bdf that those filesystems are unmounted;
- start the package.
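The check above can be sketched like this (the mount points are from this thread; fuser flags differ slightly between HP-UX and other systems, so treat the exact flags as an assumption):

```shell
#!/bin/sh
# For each package filesystem, report whether the mount point exists
# and, if so, which processes are keeping it busy.
for mp in /software_mount /data_mount; do
    if [ -d "$mp" ]; then
        echo "checking $mp"
        fuser -cu "$mp" 2>/dev/null || true   # PIDs/users with open files
    else
        echo "$mp: not present on this host"
    fi
done
```

Once fuser shows nothing holding the filesystems, umount them, confirm with bdf, and rerun cmrunpkg.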
Best regards,
Fabio
03-17-2008 11:05 PM