06-18-2007 07:06 PM
I had a problem for which I can't find the solution, so I hope one of you can help me.
I have a cluster with 4 members (each member has 1 vote, expected votes=3, no quorum disk). Last week, the CPU fan of one cluster member failed, so this machine shut down.
In my opinion, the rest of the cluster should have kept running normally. But it seemed as if the other cluster members were suspended. They couldn't be reached, and even on the console you couldn't do anything. They worked again (without a reboot) when the broken cluster member was back.
So, what could be the cause of this?
Regards,
Kirsten
06-18-2007 07:15 PM
Solution
How is storage organized in your cluster? Do the remaining nodes have access to vital disks (e.g. the system disk), or are some disks served by the failing node?
If the quorum was lost, there should be messages on the consoles or in the OPERATOR.LOG.
regards Kalle
06-18-2007 07:17 PM
Re: Cluster suspended while one member had a defect
Do you have a quorum disk?
Can you post the votes of all the members?
It seems the number of votes was below the quorum, which may explain why the cluster hung.
It is a pity that you do not have AMDS or Availability Manager, as it tells you when the quorum is not reached and lets you force a new quorum value so that the cluster works again.
06-18-2007 07:39 PM
The system disk is reachable from all machines in the cluster, so this couldn't be the problem.
It was a bit of a mystery to me. The moment one cluster member broke, all the other machines were suspended. There is no entry in the operator log giving the reason. They worked again when the broken cluster member was back. I couldn't understand this.
Short summary: 4 nodes in a cluster, no quorum disk, expected votes 3, each machine 1 vote.
Regards,
Kirsten
06-18-2007 08:12 PM
At least some lost-connection... messages should be in the OPERATOR.LOG.
Can you give some more background on your configuration, e.g. storage, interconnects...?
regards Kalle
06-18-2007 09:28 PM
quorum = (expected_votes/2 + 1), rounded down.
The calculated value of quorum in your scenario is then 3/2 + 1 = 2.5, rounded down, i.e. 2.
So as long as at least two nodes are alive, your cluster should stay up. I think you should check the SYSGEN parameters (VOTES, EXPECTED_VOTES, QDSKVOTES) and also the MODPARAMS.DAT file.
06-18-2007 09:37 PM
regards Kalle
06-18-2007 11:26 PM
Your votes configuration is correct. For 4 nodes, the majority (i.e. QUORUM) is 3, so the cluster should continue if only one node is lost.
It may be too late to find out why the cluster apparently hung. Do you capture your console data with some console manager application? If not, there should be at least some messages in OPERATOR.LOG, written once the 4th system came back again.
If this happens again, and if it really has something to do with lost quorum, you could try the IPC interrupt on the console to recalculate quorum.
Volker.
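As a rough illustration of the reasoning above, the following Python sketch checks whether the surviving members still hold quorum after some members fail. It assumes that quorum is (expected votes + 2) / 2 with integer division, and that the cluster works from the larger of the configured expected votes and the sum of the votes of all members that have joined. The function names are hypothetical, and this models only the arithmetic, not the connection manager itself.

```python
from typing import Sequence, Set


def cluster_quorum(configured_expected_votes: int, member_votes: Sequence[int]) -> int:
    """Quorum for a fully formed cluster (assumed formula, integer division)."""
    # Assumption: the effective expected votes is the larger of the configured
    # value and the total votes contributed by all members.
    effective_expected = max(configured_expected_votes, sum(member_votes))
    return (effective_expected + 2) // 2


def survivors_have_quorum(
    configured_expected_votes: int,
    member_votes: Sequence[int],
    failed: Set[int],
) -> bool:
    """True if the members not listed in `failed` still reach quorum."""
    quorum = cluster_quorum(configured_expected_votes, member_votes)
    remaining = sum(v for i, v in enumerate(member_votes) if i not in failed)
    return remaining >= quorum


# The configuration described in this thread: 4 members, 1 vote each,
# expected votes configured as 3, no quorum disk.
votes = [1, 1, 1, 1]
print(cluster_quorum(3, votes))                 # 3
print(survivors_have_quorum(3, votes, {0}))     # True: 3 votes remain
print(survivors_have_quorum(3, votes, {0, 1}))  # False: only 2 votes remain
```

Under these assumptions, losing a single member leaves exactly 3 votes, which still meets the quorum of 3, so a quorum loss alone would not be expected to explain the hang.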
06-19-2007 04:54 AM
Your formula is wrong.
The correct formula for calculating quorum is:
quorum = (expected_votes+2)/2
In this case, (4+2)/2 gives 3, which is the correct quorum value for 4 votes.
Volker.
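The two formulas quoted in this thread can be compared with a short Python sketch. It assumes truncating (rounded-down) arithmetic, as the "rounded down" wording suggests. The function names are hypothetical and only illustrate the calculation.

```python
def quorum_half_plus_one(expected_votes: int) -> int:
    """(expected_votes / 2 + 1), rounded down."""
    return int(expected_votes / 2 + 1)


def quorum_plus_two_halved(expected_votes: int) -> int:
    """(expected_votes + 2) / 2, using integer division."""
    return (expected_votes + 2) // 2


# Compare both forms for the two inputs used in the thread.
for ev in (3, 4):
    print(ev, quorum_half_plus_one(ev), quorum_plus_two_halved(ev))
# 3 2 2
# 4 3 3
```

Under that assumption both forms give the same result for a given input. The answers in the thread differ because one starts from 3 expected votes and the other from the 4 votes actually present.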
06-19-2007 05:10 AM
Check whether the votes, expected votes, etc. are what you think they are. While the cluster is running, do:
$ show cluster/continuous
add vote
add quorum
add cluster
That will show what the running cluster has.
See if that makes sense. Dean