Strange behavior in a cluster
08-02-2006 04:21 AM
I have a two-node cluster running HP-UX 11.11 and MC/ServiceGuard 11.16. The cluster shows a strange behavior and I don't know whether it is normal. Each node has 2 LAN cards assigned to the heartbeat.
Node A is running 3 Oracle packages and Node B is running 2 packages. When I disconnect the primary and secondary heartbeat LANs from Node B, the cluster goes into a re-forming state, then Node B goes down and its 2 packages fail over to the other side, that is, to Node A.
When I repeat the test from the other side (Node A running its 3 packages and Node B running its 2 packages) and disconnect the primary and secondary LAN cards from Node A, the cluster goes into a re-forming state, but then Node B goes down and starts creating a dump.
I don't know why Node A does not go down and fail over its 3 packages to Node B.
Both tests end the same way: Node A remains up and Node B goes down.
Is that OK?
Thank you for your quick response and your help.
Lissete C.
08-02-2006 04:32 AM
Check the OLDsyslog.log of node B after the second test, and see whether there is anything in there.
I suspect you are using a cluster lock disk, and when you removed the heartbeats, node A grabbed the cluster lock, forcing node B to TOC.
Do you have only the two LANs between the nodes? If there are more, put your heartbeats over all of them.
08-02-2006 04:32 AM
Secondly, when you unplug both network cards, the servers have to use cluster-lock methods to determine who should own the cluster. If they lose all connection to each other, then both nodes will assume that the other one might have failed, not that you unplugged the cables from their side of the switch.
- Unplug both from B: B assumes A has failed, and A also assumes B has failed.
- Unplug both from A: A assumes B has failed, and B also assumes A has failed.
The only difference between your tests is that the other server still has link/carrier connectivity to the switch (unless you are using crossovers, but you didn't say that).
I am not surprised that you get different results, but I would half expect in this scenario that both servers might TOC and re-race for the cluster lock when they reboot.
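As a rough illustration of why both tests start from the same position, here is a minimal Python sketch of the membership arithmetic. It assumes the commonly documented Serviceguard rule that a group holding more than half of the previous members can re-form on its own, exactly half needs a tie-breaker such as the cluster lock, and less than half must halt. The function name `reformation_outcome` and its return strings are made up for this example and are not part of any Serviceguard tool.

```python
def reformation_outcome(reachable_nodes: int, previous_nodes: int) -> str:
    """Classify what a group of nodes may do after heartbeat loss.

    Assumed rule (not stated in this thread): more than half of the
    previous membership may re-form directly, exactly half needs a
    tie-breaker such as the cluster lock, less than half must halt.
    """
    if not 0 < reachable_nodes <= previous_nodes:
        raise ValueError("reachable_nodes must be between 1 and previous_nodes")
    doubled = 2 * reachable_nodes  # compare against previous_nodes without fractions
    if doubled > previous_nodes:
        return "re-form"
    if doubled == previous_nodes:
        return "needs tie-breaker"
    return "halt"


# Two-node cluster with every heartbeat link cut: each node sees only itself,
# no matter which side the cables were pulled from.
for node in ("A", "B"):
    print(node, reformation_outcome(reachable_nodes=1, previous_nodes=2))
# A needs tie-breaker
# B needs tie-breaker
```

Both nodes land in the same "exactly half" case, so the membership count alone cannot decide which one survives.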
08-02-2006 04:34 AM
No, this does not sound normal. I think you should run tail -f /var/adm/syslog/syslog.log on both systems (without connecting through the floating IP addresses) and then re-run your tests.
I'm assuming the objective here is for all 5 packages to run on one node when for any reason one node is out of the cluster.
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
08-02-2006 04:36 AM
In any event, you should have yanked one network cable (which would simply trigger a LAN failover rather than a node switch), killed one network switch, yanked one SCSI cable, killed one disk array, yanked the power cord on one host, and so on. All of these are SPOFs (single points of failure), but yanking more than one network cable is an MPOF (multiple points of failure). The fundamental problem is that although the heartbeat was lost, the lock disk was still attached to both nodes, so a crapshoot results.
08-02-2006 04:42 AM
The cluster has a cluster lock disk, and the heartbeat LANs are not crossovers.
So there is no way to control possession of the cluster lock disk in order to run the test?
Regards,
Lissete C.
08-02-2006 04:47 AM
1) Edit the cluster configuration file and pick a different lock disk, one that is present.
2) cmcheckconf
3) cmapplyconf
SEP
08-02-2006 04:48 AM
The cluster should then stay up, and the packages halt on node A and move to node B (provided they are monitoring the subnet).
08-02-2006 04:57 AM
Solution
The cluster lock disk is a disk that both systems race to control when the heartbeat is lost.
The system that loses the race does a TOC (transfer of control) to prevent split-brain syndrome from corrupting your shared data.
You can try to manipulate the outcome, but really it is a race between electrons, and nothing you can safely do will change the outcome.
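To make the race concrete, here is a toy Python sketch. It is only a model under stated assumptions: `LockDisk` and `arbitrate` are hypothetical names, the "disk" is an in-memory object, and the real lock protocol is not reproduced. It shows just the idea that the first node to claim the lock survives and any later claimant halts.

```python
class LockDisk:
    """Toy stand-in for a shared lock disk: the first claimant becomes owner."""

    def __init__(self):
        self.owner = None

    def claim(self, node: str) -> bool:
        # Succeeds only if nobody holds the lock yet.
        if self.owner is None:
            self.owner = node
            return True
        return False


def arbitrate(arrival_order):
    """Return (survivor, halted_nodes) for nodes claiming in the given order."""
    disk = LockDisk()
    halted = []
    for node in arrival_order:
        if not disk.claim(node):
            halted.append(node)  # in the real cluster the loser does a TOC
    return disk.owner, halted


print(arbitrate(["A", "B"]))  # ('A', ['B'])  A reached the disk first
print(arbitrate(["B", "A"]))  # ('B', ['A'])  B reached the disk first
```

The only input is the order in which the claims arrive. Which node had its heartbeat cables pulled does not appear anywhere, which matches the observation that both tests ended with the same node going down.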
SEP
08-02-2006 05:50 AM
The thing is not to confuse the ServiceGuard principles.
Thanks a lot for your help.
08-03-2006 01:24 AM
If you configure your cluster this way, pulling BOTH private heartbeat cables will have absolutely NO effect on your running cluster.
-tjh