serviceguard cluster
08-25-2010 06:16 AM
I recently joined a company that has, among other systems, a two-node cluster running HP-UX 11.31. Its storage is on a NetApp, it uses Veritas rather than LVM, and its MC/ServiceGuard version is A.11.19.00. I don't have any training on MC/ServiceGuard, but I was asked to find out why the cluster crashes twice a month for no apparent reason. I set up a script to check the network, but what I found was the following:
1. Running cmviewcl -v reports the lock disks as unknown.
2. In SAM, the server's IP address is not present (/etc/hosts); there is only one IP, which I presume is the heartbeat.
3. When I go to /etc/cmcluster to look for logs or configuration files, I cannot see any of them.
Can you please tell me where I can find the logs or any other relevant information, and where this heartbeat IP is configured?
Please help
F.R.
08-25-2010 09:05 AM
Solution
http://docs.hp.com/en/ha.html
1. To see the actual state of the current cluster, run:
cmscancl -o
That creates a file with information on the existing cluster and its nodes. This is your starting point!
2. Run:
cmviewcl -v >
This creates the output file and gives you detailed information about the running cluster.
3. Look at /etc/hosts and confirm the following exist:
Each MC/SG node IP
Each MC/SG package IP
Each MC/SG heartbeat IP
Now make sure those entries exist on every node, in each node's /etc/hosts file.
4. Your first job is to find out exactly what the lock disk is. Make sure that your lock disk can be seen by every node in your cluster.
Many sites, though not all, make the lock disk a simple one-disk volume group. If that is the case in your small cluster, then confirm that the volume group is not just active but owned by the cluster. Now, I know LVM, but I am not a Veritas command person. So, if you find your lock disk is a separate volume group, then you need to make it owned by the cluster and exclusive to the node it is active on.
Example using LVM commands on a currently active volume group that is not yet part of the cluster:
vgchange -a n /dev/vglock
vgchange -c y /dev/vglock
In this state, the cluster, when it starts up, takes /dev/vglock and activates it in exclusive mode by running:
vgchange -a e /dev/vglock
Lastly, for logs:
Package logs are located (depending on your box) under the package subdirectory, for example:
/etc/cmcluster/packages/
I would suggest you go back and review your syslog.log file first to see what you can find. There is always a reason; "no apparent reason" simply means they don't know SG.
I don't know how much this will help you, but I hope it gives you a starting point.
Regards,
Rita
08-25-2010 09:49 PM
Re: serviceguard cluster
Thanks for your help, it was very good. I'm worried about this: when I run cmviewcl -v, it tells me that the lock disks are in an UNKNOWN state. Is this the problem? I will follow your advice, and I found a PDF with some worksheets that I have filled in.
Thanks again
FR
08-26-2010 06:05 AM
Re: serviceguard cluster
To reconstitute the cluster configuration ASCII file, run:
# cd /etc/cmcluster
# cmgetconf cluster.ascii
The cluster lock FIRST_CLUSTER_LOCK_PV references for each node may differ if the device files on each node are ordered differently by instance number. Use 'ioscan -kfnC disk' on each node to compare the device file naming used for the given hardware paths.
What crash/panic messages are in /etc/shutdownlog?
If you have a software support contract with HP, you can engage us to help analyze the crash dumps in /var/adm/crash.
For Serviceguard commands to operate correctly, every fixed IP on each server must be listed in /etc/hosts and aliased to the simple hostname of that server. This requirement is documented in the Managing Serviceguard manual that Rita pointed you to.
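One rough way to check the /etc/hosts requirement on each node is sketched below. The function name unaliased_ips and the addresses in the usage note are hypothetical, and the script only inspects a hosts file; it does not ask Serviceguard or the network stack which fixed IPs the server actually has, so you have to supply that list yourself.

```shell
#!/bin/sh
# unaliased_ips HOSTS_FILE HOSTNAME IP...
# Print every IP from the argument list that has no line in HOSTS_FILE
# carrying HOSTNAME as its hostname or as one of its aliases.
unaliased_ips() {
    hosts_file=$1
    host=$2
    shift 2
    for ip in "$@"; do
        if ! awk -v ip="$ip" -v host="$host" '
            # Drop comments so commented-out entries do not count.
            { sub(/#.*/, "") }
            # Field 1 is the address; the remaining fields are names.
            $1 == ip { for (i = 2; i <= NF; i++) if ($i == host) found = 1 }
            END { exit !found }
        ' "$hosts_file"; then
            echo "$ip"
        fi
    done
}
```

For example, `unaliased_ips /etc/hosts dbnode0 10.0.0.1 192.168.1.1` (made-up addresses) prints nothing when both addresses are aliased to dbnode0, and prints the offending address otherwise.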
Package log destination can be determined using either
# cmviewconf | grep log
or
# cmviewcl -v -f line | grep log
/var/adm/syslog/syslog.log captures some data that is helpful with conditions surrounding Serviceguard.
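To pull the Serviceguard-related lines out of a busy syslog, something like the following sketch may help. It assumes the daemons log with a tag of the form cm<name>[<pid>]: (cmcld is one such daemon); the function names sg_messages and sg_failures are hypothetical, and the keyword list is only a starting guess.

```shell
#!/bin/sh
# sg_messages FILE
# Print syslog lines written by daemons whose name starts with "cm".
sg_messages() {
    grep -E ' cm[a-z]+\[[0-9]+\]: ' "$1"
}

# sg_failures FILE
# Narrow the Serviceguard lines down to those that mention a problem.
sg_failures() {
    sg_messages "$1" | grep -E -i 'fail|unable|error'
}
```

Typical use would be `sg_failures /var/adm/syslog/syslog.log`, and the same against OLDsyslog.log if the crash happened before the last reboot.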
08-26-2010 06:18 AM
Re: serviceguard cluster
There are no panics and nothing in /var/adm/crash. The only strange thing I saw was the following in the syslog file:
Aug 13 00:10:50 dbnode0 cmdisklockd[4936]: Unable to convert device to I/O tree node: I/O tree node does not exist.
Aug 13 00:10:50 dbnode0 cmdisklockd[4936]: Failed to configure lock disk /dev/disk/disk97, will retry
Aug 13 00:10:52 dbnode0 cmserviced[4941]: Request to perform run service cmlockd
Aug 13 00:10:52 dbnode0 cmlockd[4948]: Changed to working directory /var/adm/cmcluster/cmlockd.
Aug 13 00:10:52 dbnode0 cmlockd[4948]: Executing command: rm -f /var/adm/cmcluster/.cmlock.*.socket
Aug 13 00:12:05 dbnode0 cmdisklockd[4936]: Unable to convert device to I/O tree node: I/O tree node does not exist.
# cat syslog.log | grep -i warning
Aug 13 00:10:28 dbnode0 vmunix: GAB WARNING V-15-1-20115 Port d registration failed, GAB not configured
Aug 13 00:10:28 dbnode0 vmunix: ODM WARNING V-41-6-5 odm_gms_api_start_msgs fails
#
If I run cmviewcl -v, it shows the status of the lock disks as UNKNOWN.
FR
08-26-2010 09:01 AM
Re: serviceguard cluster
If you find that the disks are 'unknown', then that is definitely your problem. The disks cannot be used in that state. Since 'unknown' can mean a lot of things, and I have no idea what those disks are or how they are set up, you need to get the disks back to 'claimed'.
Or you need to set up a new disk as the lock disk and change your cluster and nodes accordingly.
Hi Stephen !!
Regards,
Rita
08-26-2010 10:12 AM
Re: serviceguard cluster
It's very strange: if I run ioscan, those disks are CLAIMED, but if I run cmviewcl -v, it shows UNKNOWN. It's a pity I'm not in the office now, but tomorrow I'll send you the output of those commands as attachments.
F.R.
08-26-2010 10:22 AM
Re: serviceguard cluster
You may need to halt the cluster, activate the Cluster Lock VG(s), and then re-apply the cluster lock bits using cmapplyconf.
08-26-2010 10:33 AM
Re: serviceguard cluster
I'll come back to you all tomorrow. On this side of the world it's already dark (20:30 local time), and I've already left the office.
I'll try tomorrow.
F.R.
08-27-2010 12:10 AM
Re: serviceguard cluster
This is the output of cmviewcl on one node:
NODE           STATUS       STATE
dbnode1        up           running
Cluster_Lock_LVM:
VOLUME_GROUP   PHYSICAL_VOLUME      STATUS
/dev/vglock    /dev/disk/disk100    unknown
Network_Parameters:
INTERFACE      STATUS       PATH             NAME
PRIMARY        up           LinkAgg0         lan900
PRIMARY        up           0/0/6/1/0        lan2
STANDBY        up           1/0/1/1/0/6/0    lan6
And this is the output on the other:
NODE           STATUS       STATE
dbnode0        up           running
Cluster_Lock_LVM:
VOLUME_GROUP   PHYSICAL_VOLUME      STATUS
/dev/vglock    /dev/disk/disk97     unknown
Network_Parameters:
INTERFACE      STATUS       PATH             NAME
PRIMARY        up           LinkAgg0         lan900
PRIMARY        up           0/0/6/1/0        lan2
STANDBY        up           1/0/1/1/0/6/0    lan6
I've got a feeling that this UNKNOWN status comes from the fact that lock disks must be configured in LVM, whereas in this case VERITAS was used. I might be wrong; correct me if so. As you can see the device filenames are in the new format (DSF v PERSISTENT). But the disks in vg00 are in the usual format /dev/dsk/CxTxDx
F.R.
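For a periodic check, a small filter over `cmviewcl -v` text like the one posted above could flag lock disks that are not healthy. This is only a sketch: the function name unknown_locks is hypothetical, it assumes the section layout shown in this thread (a header line starting with VOLUME_GROUP followed by three-column rows), and it assumes a healthy lock disk is reported with status "up".

```shell
#!/bin/sh
# unknown_locks
# Read `cmviewcl -v` output on stdin and print
# "<node> <volume group> <physical volume> <status>"
# for every cluster lock disk whose status is not "up".
unknown_locks() {
    awk '
        # The line after a "NODE STATUS STATE" header names the node.
        $1 == "NODE" && $2 == "STATUS" { want_node = 1; next }
        want_node && NF                { node = $1; want_node = 0; next }
        # Three-column rows after the VOLUME_GROUP header are lock disks.
        $1 == "VOLUME_GROUP"           { in_lock = 1; next }
        in_lock && NF == 3             { if ($3 != "up") print node, $1, $2, $3; next }
        # Any other line ends the lock-disk section.
                                       { in_lock = 0 }
    '
}
```

On a cluster node it would be used as `cmviewcl -v | unknown_locks`; empty output means no lock disk was reported in a state other than "up".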
08-27-2010 12:48 AM
Re: serviceguard cluster
No, you CANNOT use a VxVM DG disk as the cluster lock disk; it has to be in an LVM volume group.
The other methods are to use a Lock LUN (not in ANY VxVM DG or LVM VG) or a Quorum Server, which is a node OUTSIDE the cluster.
>As you can see the device filenames are in the new format (DSF v PERSISTENT). But the disks in vg00 are in the usual format /dev/dsk/CxTxDx
Legacy and agile addressing can be mixed and matched; that should be no problem.
The UNKNOWN state means that the disks that are configured to be used as the cluster lock disk do NOT have the bits set to indicate this. You need to correct this as per my previous response or, if you do NOT want to take the cluster down, contact your local HP Response Centre and log a call requesting the unsupported cminitlock utility.
08-27-2010 12:56 AM
Re: serviceguard cluster
I will log a call with HP, but just one more query: if I run "ioscan -m dsf /dev/disk/disk97", which is the lock disk, it shows me the corresponding "/dev/dsk/CxTxDx" devices, and if I run ioscan for those corresponding disks, they are CLAIMED. So those disks seem to be fine, but somehow there is a problem with them!
F.R.
08-27-2010 01:31 AM
Re: serviceguard cluster
08-27-2010 01:35 AM
Re: serviceguard cluster
Thank you for your help, and you too, Rita. 10 points for both of you.
F.R.
Next I'll assign the points.
08-27-2010 04:32 AM
Re: serviceguard cluster
F.R.