chanaka de mel
Occasional Contributor

HPservice guard for LINUX

We have bought HP Serviceguard for Linux along with a DL380 package cluster (a two-node cluster).
Configuration was done properly up to adding the packages to the cluster. This is a two-node cluster and we have configured a single package for each node.
1. If you start both nodes simultaneously, the cluster forms, and you can manually halt the packages running on one node and start them manually on the other node.
This works fine, including mounting the logical volumes.
But if you bring down one node (to simulate a node failure), the packages do not switch over to the other node automatically.

2. Also, if you boot one node while the other is down, the cluster does not form at all.

I would appreciate help finding the possible cause of this.

Chanaka
Matti_Kurkela
Honored Contributor

Re: HPservice guard for LINUX

1.)
To get the package to auto-switch when there is a failure, two things must be enabled in the package: AUTO_RUN for the package and SWITCHING for the new node. Use "cmviewcl -v" to check these parameters.

When you halt a package manually, SG automatically also disables AUTO_RUN. When re-starting the package with cmrunpkg, you must explicitly re-enable AUTO_RUN.
The command for this is "cmmodpkg -e <package_name>".

If a package operation (starting or stopping a package) does not complete cleanly on a certain node, SWITCHING is automatically disabled for that node. (This happens to me frequently when testing a new package.)
To re-enable after fixing the problem, use "cmmodpkg -e -n <node_name> <package_name>".
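
As a rough illustration of the two conditions above, here is a minimal Python sketch. It is a toy model, not Serviceguard code: the Package class and the can_fail_over function are hypothetical names, and the only rules it encodes are the ones described above (AUTO_RUN on the package, SWITCHING on the target node, and a manual halt clearing AUTO_RUN).

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """Toy model of the switching flags reported by "cmviewcl -v" (hypothetical)."""
    name: str
    auto_run: bool = True
    # Per-node SWITCHING flag: node name -> enabled?
    node_switching: dict = field(default_factory=dict)

    def halt_manually(self):
        # Per the description above, a manual halt also disables AUTO_RUN.
        self.auto_run = False

    def enable(self, node=None):
        # Mirrors the intent of re-enabling the package, or one node for it.
        if node is None:
            self.auto_run = True
        else:
            self.node_switching[node] = True


def can_fail_over(pkg, target_node):
    """Automatic failover needs both flags to be enabled."""
    return pkg.auto_run and pkg.node_switching.get(target_node, False)


pkg = Package("pkg1", node_switching={"node1": True, "node2": True})
print(can_fail_over(pkg, "node2"))   # True: both flags enabled
pkg.halt_manually()
print(can_fail_over(pkg, "node2"))   # False: AUTO_RUN cleared by the manual halt
pkg.enable()
print(can_fail_over(pkg, "node2"))   # True: AUTO_RUN re-enabled
```

On a real cluster the flags come from "cmviewcl -v", so a package that fails over by hand but not automatically is worth checking against both conditions.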

2.)
Is your single node detecting a possible split-brain situation? Check the syslog: the messages there should tell you why the cluster did not start.

A split-brain situation exists when both of your nodes are running but all the communication paths between them are broken. In this case both nodes could decide "I'm up and the other node is down" and try to start the packages. That would result in the two nodes using the same IP address and/or accessing the same shared disk without knowing about each other: a recipe for disaster.

This is why a cluster lock is used in a two-node cluster. A node will start only if it can access the cluster lock OR it has a connection to the other node (so it knows what the other node is doing).
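
The tie-breaking role of the lock can be sketched in a few lines of Python. This is only a toy model under my own assumptions: ClusterLock and may_run_packages are hypothetical names, and the real lock mechanism is considerably more involved.

```python
class ClusterLock:
    """Toy tie-breaker: only the first node to claim the lock gets it (hypothetical)."""

    def __init__(self):
        self.owner = None

    def try_acquire(self, node):
        # The first caller becomes the owner; later callers are refused.
        if self.owner is None:
            self.owner = node
        return self.owner == node


def may_run_packages(node, sees_peer, lock, lock_reachable=True):
    """A node proceeds if it can talk to its peer, or if it wins the cluster lock."""
    if sees_peer:
        return True
    return lock_reachable and lock.try_acquire(node)


lock = ClusterLock()
# Both nodes are up, but every communication path between them is broken.
results = {n: may_run_packages(n, sees_peer=False, lock=lock)
           for n in ("node1", "node2")}
print(results)   # {'node1': True, 'node2': False}
```

Only one of the two isolated nodes is allowed to continue, which is what prevents both from starting the same packages.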
MK
melvyn burnard
Honored Contributor

Re: HPservice guard for LINUX

1. How are the packages configured? Did you set AUTO_RUN to YES in the package config file?
What does cmviewcl -v show for the package switching attributes? If switching was disabled by a manual cmhaltpkg, the package may not be enabled to run on a particular node, or may not have global switching enabled.


2. This is normal behaviour. 100% of the nodes MUST be available when the cluster is started, otherwise the cluster will not start. In this situation you need to issue cmruncl -n <node_name> on the running node to start the cluster on a single node.

You may want to read:

http://docs.hp.com/linux/pdf/B9903-90033.pdf for more information.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
chanaka de mel
Occasional Contributor

Re: HPservice guard for LINUX

Yes, AUTO_RUN is set to YES.
Also, I always ran cmmodpkg -e packagename after halting a package.
cmviewcl shows switching enabled.

I suppose it is detecting a split-brain situation. Some of the messages I got in /var/log/messages are as follows:

Cmlocklund [1566] :write failure :1024 (resource temporarily unavailable )
Just cleared x-psn:1
Just cleared y-psn:2

When one node is brought down the below messages appear in /var/log/messages

Attempting to form a new cluster cluster lock was denied lock was owned by another node write failure-1

Burnard> Is it normal behavior for the cluster not to form when not all nodes are up? My cluster is a two-node one; does this apply to it as well? But when I run cmruncl on one node while the other is down, the packages that are configured on the down node also start automatically.
Carlo Corthouts
Frequent Advisor

Re: HPservice guard for LINUX

I am surprised that MCSG is even available on Linux.
Steven E. Protter
Exalted Contributor

Re: HPservice guard for LINUX

I did a lab with this product at HP World.

Good stuff.

Everything you need should be in the log files for the clusters. The error message on the failing node should lead you directly to the cause of the problem.

The check cluster and package commands worked, right?

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Celso Medina Kern
Trusted Contributor

Re: HPservice guard for LINUX

Chanaka,

Cluster not forming:
Looking at the cmruncl man page...
cmruncl [-f] [-v] [-n node_name...] [-w none]

If node_name is not specified, the cluster daemons will be started on all the nodes in the cluster. All nodes in the cluster must be available for the cluster to start unless a subset of nodes is specified.

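So, if you have one node down, the cluster will not form; that is by design. You can start it on the node that is up with cmruncl -f -n <node_name> (as far as I know, -f only skips the confirmation prompt that -n prints).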
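
The startup rule quoted from the man page can be expressed as a small Python sketch. It is a toy model only: nodes_to_start is a hypothetical helper, not part of Serviceguard, and it encodes nothing beyond the quoted sentence (every configured node must be available unless a subset is named).

```python
def nodes_to_start(configured, reachable, subset=None):
    """Return the nodes a start attempt would cover, or raise if it cannot proceed.

    Without a subset, every configured node must be available;
    with a subset, only the named nodes must be.
    """
    targets = list(configured) if subset is None else list(subset)
    missing = [n for n in targets if n not in reachable]
    if missing:
        raise RuntimeError("unavailable nodes: " + ", ".join(missing))
    return targets


configured = ["node1", "node2"]
reachable = {"node1"}                      # node2 is down

# Naming a subset lets the start proceed on the surviving node.
print(nodes_to_start(configured, reachable, subset=["node1"]))   # ['node1']

# Without a subset the start is refused because node2 is missing.
try:
    nodes_to_start(configured, reachable)
except RuntimeError as err:
    print("start refused:", err)           # start refused: unavailable nodes: node2
```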

Package not automatically failing over:
You need to check if you are monitoring the package service, otherwise SG will not take any action to move it to another node.

If you post your cluster ASCII file and the package ASCII and control files here, it will be much easier to see why it is not working as you expect.

Best regards,

Celso
God bless pessimists, they did the backup!