Re: Problem ServiceGuard in a MSA500 Cluster Pack

Serviceguard for Linux · ‎05-24-2007

This is the way it is designed.

A "single node" cluster will not start by itself. It will wait for another node. If you want a single node to start by itself, you can force a cluster to start manually. (I'm not 100% sure but it may be cmruncl).

SnIphe · ‎05-24-2007

OK, you are right, that is the correct behavior of SG.
I can run cmruncl -n and the cluster will work with a single node.

OK thanks for the reply.

At this point, I want to know if there is some script that I can put in the init level to run the command cmruncl -n automatically if the other node is down.
This cluster is going into a server farm with no on-site staff, so I need to be able to bring the cluster up if the power fails and only one node starts.

I'm sure there is some script that can solve this problem.

Thanks a lot.

Matti_Kurkela · ‎05-26-2007

If you automate the "cmruncl -n" command, your automation *MUST* include some way to ensure that the other node of the cluster is definitely down.

If the other node is active but isolated by a network problem, your "cmruncl -n" will effectively force a split-brain situation in the cluster: both nodes will access the package filesystems simultaneously, each node assuming that it's the only one. This will rapidly lead to filesystem corruption, which is not fixable by any means other than restoring from a backup.

This is exactly why no ServiceGuard cluster node starts in single-node mode automatically.

MK
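
A minimal sketch of the kind of guard described above, assuming the "is the peer still alive?" checks are supplied by the administrator as commands that exit 0 while they can still see the other node. The function names and the check commands are hypothetical and not part of Serviceguard. The sketch only prints the command it would run. Note that a check which merely fails to reach the peer over the network proves nothing, because a network partition looks the same; at least one check would have to be out-of-band (for example, a power-state query to the peer's management processor) for this to be meaningful.

```shell
#!/bin/sh
# Sketch only: decide whether forming a single-node cluster looks safe.
# Each check is a command that exits 0 when it can still see the peer.
# All function names here are hypothetical helpers, not Serviceguard tools.

# Returns 0 if ANY check still sees the peer, 1 if none does.
peer_seen_by_any() {
    for check in "$@"; do
        # $check is left unquoted on purpose so "cmd arg1 arg2" is split
        if $check >/dev/null 2>&1; then
            return 0
        fi
    done
    return 1
}

# Prints the action to take; it never runs a Serviceguard command itself.
#   $1      = name of this node
#   $2...   = check commands
decide_single_node_start() {
    this_node="$1"
    shift
    if [ "$#" -eq 0 ]; then
        echo "refuse: no checks configured"
        return 1
    fi
    if peer_seen_by_any "$@"; then
        echo "refuse: peer still visible"
        return 1
    fi
    echo "cmruncl -v -f -n $this_node"
}
```

Calling `decide_single_node_start ingrids1 check_a check_b` prints the cmruncl command only when every supplied check fails to see the peer, and refuses as soon as one of them succeeds or when no checks are given at all.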

SnIphe · ‎05-28-2007

OK, I understand you, this sounds very logical. Thanks.

But what if I put something like this in the cluster.init script?

#
# Check to see if the daemon is already running
#
findproc cmcld
if [ "$pid" = "" ]
then
    #
    # The daemon isn't running already
    #
    isnodeup ingrids2
    if [ "$node_status" = "down" ]
    then
        action "Node ingrids2 is down, starting the cluster with node ingrids1 only"
        ${SGSBIN}/cmruncl -v -f -n ingrids1
        exit 0
    fi

    if [ -f ${SGSBIN}/cmrunnode ]
    then
        #
        # Attempt to join the cluster
        #

As you can see, on this node I check whether the node "ingrids2" is up, and on the other node I check for "ingrids1".
This works OK, but if one node starts in this situation and I bring up the other node later, could I run into the problem you described? Is that right?

Thanks a lot.

skt_skt · ‎06-03-2007

Split-brain syndrome won't occur when you run cmrunnode to make the failed node part of the cluster again. When the failed node comes back online, both nodes will be able to communicate with each other.
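
The distinction between joining and forming a cluster can be sketched as below. This is an illustration under assumptions, not a Serviceguard feature: `startup_action` is a hypothetical helper, and the peer state ("reachable", "confirmed_down") is assumed to have been determined elsewhere by means the administrator trusts. It only prints the command that would fit each case.

```shell
#!/bin/sh
# Sketch only: choose what a booting node should do.
#   $1 = this node's name
#   $2 = peer state decided elsewhere: "reachable", "confirmed_down",
#        or anything else (treated as unknown)
startup_action() {
    case "$2" in
        # Peer answers: join the running cluster instead of forming a new one
        reachable)      echo "cmrunnode -v $1" ;;
        # Peer is known to be down: form a single-node cluster
        confirmed_down) echo "cmruncl -v -f -n $1" ;;
        # Unknown state: do nothing automatically
        *)              echo "wait" ;;
    esac
}
```

For example, `startup_action ingrids2 reachable` prints the cmrunnode command, while any state other than the two named ones prints `wait`, leaving the decision to an operator.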
