Node failure
07-11-2003 02:42 AM
I installed a two-node cluster with SG A.11.14, with one shared external SCSI drive (configured as the lock device) and only one LAN on each node (which is also the heartbeat LAN).
I am trying to test a failover, but things do not go as I expect.
I have 2 cases:
1) I pull the network cable out of node1
2) I power off node1
In both cases the second node reboots.
I expected that it would take over all the resources previously on node1.
After the reboot, node2 cannot even form the cluster, complaining that it cannot get the OS version of node1. Shouldn't it go on running the cluster?
Do I have to do a special configuration?
What could be misconfigured?
Thanks,
Cristi
07-11-2003 02:46 AM
Re: Node failure
How are the nodes connected via LANs, and how is the SCSI connected? What are the disc and controller SCSI addresses? What do the syslogs and OLDsyslogs show on each node?
Read the manuals at http://docs.hp.com/hpux/ha for an idea of how to configure the cluster.
07-11-2003 02:53 AM
Re: Node failure
That is why you should have a physically separate heartbeat LAN.
If there is no LAN communication between the nodes at all, they both race for the lock disk to decide which one has to TOC. The other one carries on as a one-node cluster. So the chances are 50% that the node you expected to TOC is the one that TOCs.
This is called arbitration. There is a lot of information about it in the manuals at docs.hp.com
Regards
Bernhard
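The arbitration described above can be sketched as a toy model. This is only an illustration in Python, not Serviceguard code: the function name `arbitrate` and the `lock_reachable` map are hypothetical, and the random choice stands in for whichever node happens to reach the lock disk first. The model also covers the power-off case: a lone survivor of a two-node cluster holds exactly half the votes, so it still needs the lock, and if it cannot reach the lock disk it halts as well.

```python
import random


def arbitrate(alive_nodes, lock_reachable, rng=random):
    """Toy model of cluster-lock arbitration in a two-node cluster.

    alive_nodes: nodes still running after the heartbeat is lost.
    lock_reachable: hypothetical map of node -> True if it can reach the lock disk.
    Returns (winner, halted): the node that keeps the cluster running (or None)
    and the sorted list of nodes that halt (TOC).
    """
    contenders = [n for n in alive_nodes if lock_reachable.get(n, False)]
    if not contenders:
        # Nobody can take the lock, so every remaining node halts.
        return None, sorted(alive_nodes)
    # Stand-in for "whoever reaches the lock disk first".
    winner = rng.choice(contenders)
    halted = sorted(n for n in alive_nodes if n != winner)
    return winner, halted


if __name__ == "__main__":
    both_ok = {"node1": True, "node2": True}
    # Case 1: cable pulled, both nodes alive and racing for the lock.
    print(arbitrate(["node1", "node2"], both_ok))
    # Case 2: node1 powered off, node2 reaches the lock and carries on.
    print(arbitrate(["node2"], both_ok))
    # Case 2 with an unreachable lock disk: node2 halts too.
    print(arbitrate(["node2"], {"node2": False}))
```

In this model the last case is the only way a single surviving node goes down, which is why the replies that follow ask whether the lock disc is actually working.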
07-11-2003 03:51 AM
Re: Node failure
I thought that in the first case (when taking the network cable out of node1) the problem could come from the fact that there is no dedicated heartbeat path.
But when switching the power off on node1, I think it is not a heartbeat problem anymore and the second node should not TOC.
Maybe I am still missing something. I will keep reading the manuals :)
The 2 nodes are connected to the company network (both are connected to a switch).
The external disk has 2 ends, one connected to node1 and the other to node2. It is powered separately.
After the reboot of node2 I can start the cluster with cmruncl -n node2 and the cluster runs well.
07-11-2003 04:11 AM
Re: Node failure
If you are running short of NICs, you had better configure the heartbeat on RS232. Refer to the SG documentation on how to set this up. A quick summary of the heartbeat requirements can also be found at:
http://www.netsysco.com/pdf/Manuals/Sg/HeartbeatReq.pdf
Regards,
karthik S S
07-11-2003 04:25 AM
Re: Node failure
Is your cluster lock disc actually working? What type of disc is it?
And I would not recommend a serial heartbeat unless you really cannot afford at least one more LAN card.
07-11-2003 04:48 AM
Solution
I believe there could be a problem with the binary cmclconfig file, since your assumptions for case 2 are correct. So delete the binary files and run cmcheckconf / cmapplyconf again.
One other thing to check is that your .rhosts or cmclnodelist includes BOTH nodes on BOTH nodes. That could be why node2 cannot form the cluster on its own while cmruncl -n node2 works.
Regards,
Bernhard
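As a rough aid for the second check, here is a small Python sketch that reports which cluster nodes are missing from an access list. It assumes the file holds one `hostname user` pair per line with `#` comments, which is how cmclnodelist and .rhosts entries are commonly written. The function name `missing_nodes` and the sample contents are made up for illustration.

```python
def missing_nodes(nodelist_text, required_nodes, user="root"):
    """Return the required nodes that have no 'hostname user' entry.

    Hostnames are compared case-insensitively on their short name
    (the part before the first dot); this is a simplification.
    """
    seen = set()
    for raw in nodelist_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        fields = line.split()
        if len(fields) >= 2 and fields[1] == user:
            seen.add(fields[0].split(".")[0].lower())
    return [n for n in required_nodes if n.split(".")[0].lower() not in seen]


if __name__ == "__main__":
    # Hypothetical contents of the access list as found on node2.
    sample = """
    # cluster access list
    node2 root
    """
    print(missing_nodes(sample, ["node1", "node2"]))  # prints ['node1']
```

Run the same check against the copy of the file from each node; an empty list on both sides means both nodes are listed on both nodes.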
07-15-2003 08:54 AM
Re: Node failure
Thank you all for your help.
It seems that the problem was in the binary cluster config file, which I had not recompiled and redistributed after I replaced the SCSI disk (with one that has a different SCSI ID).
Now if I shut down a node, the other takes over all the packages.
Thanks,
Cristi
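For readers hitting the same symptom, the recompile and redistribute step can be sketched as a dry run. The Python below only prints the Serviceguard commands named in this thread (cmcheckconf, cmapplyconf, cmruncl) and does not execute them. The path `/etc/cmcluster/cluster.ascii` and the helper name `reconfigure_plan` are assumptions, and the lock-disk entry in the ASCII file is assumed to have been updated to point at the new disk already. Check the manuals for your SG release before running any of these commands.

```python
import shlex


def reconfigure_plan(ascii_file="/etc/cmcluster/cluster.ascii"):
    """Return the commands, in order, as argument lists (nothing is run).

    ascii_file is a hypothetical location for the cluster ASCII file.
    """
    return [
        ["cmcheckconf", "-v", "-C", ascii_file],  # verify the edited ASCII file
        ["cmapplyconf", "-v", "-C", ascii_file],  # rebuild and distribute the binary file
        ["cmruncl", "-v"],                        # start the cluster again
    ]


if __name__ == "__main__":
    for argv in reconfigure_plan():
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Keeping the plan as data rather than executing it makes it easy to review the exact command lines before typing them on a production node.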