Operating System - HP-UX

ServiceGuard node rebooted

 
Jason_Ng
Occasional Visitor


Hi Gurus,

 

Yesterday I ran a failover simulation by unplugging the primary and standby LAN cables on node 1, which was holding the package. The package failed over to node 2 successfully.

 

After that I reattached the primary and standby LAN cables to the NICs, waited for a while, and then brought node 1 back up by entering the cmrunnode command. Once it was up, I unplugged node 2's primary and standby LAN cables to simulate another failover. The package on node 2 did not move to node 1. Furthermore, node 1 rebooted on its own.

 

Does anyone know what went wrong?

 

HPUX 11.31

ServiceGuard A.11.20.00.01

 

 

Patrick Wallek
Honored Contributor

Re: ServiceGuard node rebooted

The likely scenario is that when the network went down, SG sensed the failure and both servers tried to acquire the lock disk. Server 2 probably succeeded, which caused server 1 to TOC (Transfer of Control) and reboot.

Jason_Ng
Occasional Visitor

Re: ServiceGuard node rebooted

Is there any resolution for this? I want to test ServiceGuard by simulating a total node failure.

Re: ServiceGuard node rebooted

Jason,

 

It would help if you posted the syslog and cluster package logs for both nodes - otherwise we are just guessing about what actually happened...

 

You say you removed the primary and standby network connections - is there a third (heartbeat) network connection between these nodes? It's not an absolute _requirement_ of Serviceguard, but it is recommended.


I am an HPE Employee
ALI-Wong
New Member

Re: ServiceGuard node rebooted

Hi Duncan,

There are a total of 4 Network Links per node for this config.

 

LAN-0 = Primary Data LAN

LAN-3 = Standby Data LAN

 

LAN-1 = Heartbeat-1

LAN-2 = Heartbeat-2

 

Questions:

1) When both the Primary and Standby Data LANs are disconnected, what is the expected behavior?

2) When both Heartbeat LANs are disconnected, what is the expected behavior?

 

- Jason's apprentice

Re: ServiceGuard node rebooted

Hi,

 

Well, Serviceguard will behave how you tell it to behave, and we don't know how you have it configured. As I said previously, the syslog and package logs would help explain this. So that we know how things are configured, it would also help if you could run the following:

 

mkdir /tmp/clustercfg

cmgetconf /tmp/clustercfg/cmclconfig.ascii

cmgetconf -P /tmp/clustercfg

 

Then post the contents of the /tmp/clustercfg directory (there should be one file there for the cluster itself and then one per package).
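
If you need to do this on both nodes, the three commands above can be wrapped in a small shell function. This is only a sketch: the `run` helper, the `collect_cluster_config` name and the `DRY_RUN` variable are my own inventions, not part of Serviceguard. The `cmgetconf` invocations are copied as-is from the commands above. By default it only prints what it would run.

```shell
#!/bin/sh
# Sketch only: gather the cluster and package configuration for posting.
# DRY_RUN is a hypothetical variable; it defaults to 1 so nothing is executed
# unless you explicitly set DRY_RUN=0.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Print the command instead of executing it.
        echo "$*"
    else
        "$@"
    fi
}

# Usage: collect_cluster_config [output_dir]
collect_cluster_config() {
    outdir=${1:-/tmp/clustercfg}
    # -p avoids an error if the directory already exists.
    run mkdir -p "$outdir"
    # Cluster configuration, same command as in the post above.
    run cmgetconf "$outdir/cmclconfig.ascii"
    # Package configurations, same command as in the post above.
    run cmgetconf -P "$outdir"
}
```

Source the file and call `collect_cluster_config` to preview the commands, then rerun with `DRY_RUN=0` on a cluster node to actually collect the files.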

 


I am an HPE Employee
Stephen Doud
Honored Contributor

Re: ServiceGuard node rebooted

When a package fails over to an adoptive node, it will not be allowed to move back to the primary node until it is authorized to do so.

 

Use the following command to inspect the values of AUTO_RUN and SWITCHING:

cmviewcl -v -p <pkg_name>

 

When AUTO_RUN is flagged as 'disabled', the package is not allowed to run on or move to _any_ node.

To re-enable it, use:
cmmodpkg -e <pkg_name>

 

When SWITCHING is flagged as 'disabled', the package is not allowed to run or move to the identified node.

To re-enable this value, use:

cmmodpkg -e -n <node_name> <pkg_name>
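
Between failover tests, the two re-enable steps above can be combined into one helper so that neither is forgotten. This is a minimal sketch, not an official tool: `reenable_switching`, `run` and `DRY_RUN` are hypothetical names, and only the two `cmmodpkg` forms shown above are used. It prints the commands by default instead of executing them.

```shell
#!/bin/sh
# Sketch only: re-enable a package after a failover test.
# DRY_RUN is a hypothetical variable; it defaults to 1 so commands are printed,
# not executed, unless you set DRY_RUN=0.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: reenable_switching <pkg_name> [node_name ...]
reenable_switching() {
    if [ $# -lt 1 ]; then
        echo "usage: reenable_switching <pkg_name> [node_name ...]" >&2
        return 1
    fi
    pkg=$1
    shift
    # Re-enable the package as a whole (the AUTO_RUN case above).
    run cmmodpkg -e "$pkg"
    # Re-enable the package on each named node (the SWITCHING case above).
    for node in "$@"; do
        run cmmodpkg -e -n "$node" "$pkg"
    done
}
```

For example, `reenable_switching pkg1 node1 node2` prints the three `cmmodpkg` commands it would run. Check the result with cmviewcl before pulling cables again.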

user001
Frequent Advisor

Re: ServiceGuard node rebooted



I had the same problem too, and quickly discovered that the package wasn't allowed to switch back after a failure.