Operating System - HP-UX

Error in failover in HP cluster

 
Dipu Krishnankutty_1
Occasional Contributor

We have a cluster of two HP-UX machines. Both are running an Oracle database, and the data is stored on a SAN.
When both machines (A and B) are in the cluster, it works fine with load balancing. When machine A is removed from the cluster, failover to machine B happens. When machine A is brought back, things return to normal and both machines work properly with load balancing.

The issue is this: if machine B is removed from the cluster instead, the entire cluster setup crashes. Moreover, since the machine crashes, I do not have a platform from which to start checking.

Can anyone help with how to go about fixing this problem?

Thanks
6 REPLIES
Radhakrishnan Venkatara
Trusted Contributor

Re: Error in failover in HP cluster

Hi,

Could you please let us know more details about the cluster?
For example, which node is configured as primary and which as alternate.
Quorum disk details: where is the quorum disk kept?
cmviewcl output.
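
For anyone gathering the details requested above, here is a minimal sketch of a collection script. It assumes the Serviceguard binaries are on the PATH. The function name and the output file are hypothetical, and the `CMVIEWCL` variable is only an override hook so the function can be tried on a machine without Serviceguard.

```shell
#!/bin/sh
# Sketch: capture the cluster state so it can be posted or compared later.
# Assumption: cmviewcl is on PATH; CMVIEWCL may be overridden for testing.
CMVIEWCL=${CMVIEWCL:-cmviewcl}

collect_cluster_state() {
    out=$1    # hypothetical output file, e.g. /tmp/cluster_state.txt
    {
        echo "== host: $(hostname)"
        echo "== cmviewcl -v"
        # -v asks for the verbose view of the cluster, nodes, and packages.
        $CMVIEWCL -v 2>&1 || echo "cmviewcl failed or is not installed"
    } > "$out"
}
```

Running `collect_cluster_state /tmp/cluster_state.txt` on each node while both are still up gives a baseline to compare against, which matters here because the cluster is not reachable once machine B is removed.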

regards

Radhakrishnan
Negative thinking is a highest form of Intelligence
V.Tamilvanan
Honored Contributor

Re: Error in failover in HP cluster

Hi,
I believe you have configured your FAILBACK_POLICY as MANUAL. So whenever a failure occurs on the primary node, the package moves to your standby node. But when the primary node comes back, the package won't move back automatically. You need to manually enable the package to run on the primary node. Only then, if there is a failure on the secondary node, will the package move back to the primary.
The command is
#cmmodpkg -e -n

See the man pages of cmmodpkg.
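
As a rough sketch of how that command is typically used, the wrapper below prints the two cmmodpkg invocations involved: one that allows the package to run on a given node, and one that re-enables package switching. The function name, the `RUN` switch, and the node and package names are hypothetical placeholders, not values from this cluster.

```shell
#!/bin/sh
# Sketch: re-enable a package after its primary node rejoins the cluster.
# The node and package arguments are placeholders; take real names from cmviewcl.
# Commands are only printed unless RUN=1 is set, so this is safe to try.
reenable_package() {
    node=$1
    pkg=$2
    for cmd in "cmmodpkg -e -n $node $pkg" "cmmodpkg -e $pkg"; do
        if [ "${RUN:-0}" = "1" ]; then
            $cmd          # execute on a real Serviceguard node
        else
            echo "$cmd"   # dry run: show what would be executed
        fi
    done
}
```

For example, `reenable_package nodeA pkg1` prints both commands for review, and `RUN=1 reenable_package nodeA pkg1` would execute them.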

# Enter the failback policy for this package. This policy will be used
# to determine what action to take when a package is not running on
# its primary node and its primary node is capable of running the
# package. The default policy unless otherwise specified is MANUAL.
# The MANUAL policy means no attempt will be made to move the package
# back to its primary node when it is running on an adoptive node.
#
# The alternative policy is AUTOMATIC. This policy will attempt to
# move the package back to its primary node whenever the primary node
# is capable of running the package.

FAILBACK_POLICY MANUAL
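
To check which policy a package actually uses, something like the following sketch may help. It assumes an ASCII package configuration file with "KEYWORD value" lines like the excerpt above. The function name is hypothetical, and the fallback message relies on the excerpt's statement that MANUAL is the default.

```shell
#!/bin/sh
# Sketch: print the FAILBACK_POLICY set in a package configuration file.
# Assumption: "KEYWORD value" lines; lines starting with '#' are comments
# and never match because their first field is not the bare keyword.
failback_policy() {
    awk 'toupper($1) == "FAILBACK_POLICY" { print $2; found = 1 }
         END { if (!found) print "MANUAL (default, not set explicitly)" }' "$1"
}
```

Calling `failback_policy` with the path to each node's copy of the package configuration file shows whether the two nodes agree on the policy.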
Sunil Sharma_1
Honored Contributor

Re: Error in failover in HP cluster

Hi,

You wrote load balancing. Is this cluster MCSG or MC Lock Manager (RAC)?

Sunil
*** Dream as if you'll live forever. Live as if you'll die today ***
Radhakrishnan Venkatara
Trusted Contributor

Re: Error in failover in HP cluster

Hi Tamil,

I don't think that is causing the issue here.
I have configured and worked with many clusters that have the failback policy set to manual. All that matters there is your node switching parameters.

I think there may be an issue related to the quorum disk setup, or maybe some other configuration.

Radhakrishnan
Negative thinking is a highest form of Intelligence
Steven E. Protter
Exalted Contributor

Re: Error in failover in HP cluster

Most common causes:

Package flags not set properly.

There should be some logging information in the package .log file to help.

Also wondering whether cmmodpkg -e packagename was run after a cmhaltnode.

Most likely it's a configuration error because the cluster configuration is not consistent on both nodes.
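
One way to test the consistency theory is to compare each node's copy of the package files. The sketch below assumes the peer node's copy has already been fetched to a local path (for example with rcp or scp). The function names and any paths passed to them are hypothetical.

```shell
#!/bin/sh
# Sketch: check whether two copies of a package file are identical.
# Assumption: both copies are readable locally; paths are placeholders.
same_file() {
    # cksum on stdin prints only checksum and size, so file names
    # do not influence the comparison.
    [ "$(cksum < "$1")" = "$(cksum < "$2")" ]
}

compare_config() {
    if same_file "$1" "$2"; then
        echo "identical: $1 $2"
    else
        echo "DIFFERENT: $1 $2"
        # diff exits non-zero when files differ; that is expected here.
        diff "$1" "$2" || true
    fi
}
```

Running `compare_config` on the package configuration file and the control script from both nodes should show quickly whether machine A's copies have drifted from machine B's.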

SEP

Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
V.Tamilvanan
Honored Contributor

Re: Error in failover in HP cluster

Hi RadhaKrishnan,
If your package is configured with the MANUAL FAILBACK_POLICY, which is the default, and the package's primary node fails, the package fails over to the adoptive node. But once the primary node has recovered and become a member of the cluster again, the failed-over package cannot fail back from the adoptive node to the original node unless you issue a cmmodpkg -e command.

This is to ensure the integrity of the cluster. Suppose your package is configured for automatic failback and the package itself has some problem: it may go into a loop, trying to start on both servers one after the other, which may result in a problem.