05-18-2010 04:31 PM
Changing votes in a cluster without a cluster reboot
One old node has failed and won't be replaced. Now I need the quorum disk votes to maintain quorum. This is a 7x24 system, and I can't easily reboot the cluster to adjust votes, expected votes, etc. I can reboot one of the nodes without an application shutdown. My thought was to give that node 2 votes instead of 1. Then I'd have quorum with the 3 nodes (1 vote + 1 vote + 2 votes).
Because of as-yet-undiagnosed problems, the quorum disk watcher keeps losing its connection to the quorum disk (NAS array), and the cluster hangs for a few seconds.
Giving one node 2 votes doesn't fix the problem of losing the connection to the quorum disk, but I'm hoping to mitigate the cluster hangs until the connection problem can be diagnosed and resolved.
I'm just looking for a sanity check on the proposed change so that it doesn't result in a split cluster.
Expected votes still = 7
Quorum disk votes still = 3
2 nodes with votes still = 1
1 node with votes = 2 (modparams changed and node rebooted)
Quorum still 4 (3 nodes up; or at least 1 node with the quorum disk votes).
When I can reboot the cluster, I'll adjust everything to be more relevant to the configuration.
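The arithmetic in the proposal above can be checked with a minimal Python sketch. It assumes the documented OpenVMS rule that quorum is (EXPECTED_VOTES + 2) / 2, rounded down. The function name and node labels are hypothetical and used only for illustration.

```python
def quorum(expected_votes: int) -> int:
    """Quorum derived from EXPECTED_VOTES: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2

# Proposed vote assignment from the post (node labels are illustrative).
node_votes = {"node_a": 1, "node_b": 1, "node_c": 2}
qdisk_votes = 3
expected_votes = 7

q = quorum(expected_votes)
total = sum(node_votes.values()) + qdisk_votes

print("quorum:", q)                                                # 4
print("total votes:", total)                                       # 7
print("3 nodes, no quorum disk:", sum(node_votes.values()) >= q)   # True
print("1-vote node + quorum disk:", 1 + qdisk_votes >= q)          # True
```

Under these assumptions the total still matches the expected votes of 7, and both cases named in the post reach the quorum of 4.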
05-18-2010 04:50 PM
Re: Changing votes in a cluster without a cluster reboot
The delay for a cluster transition would be 3*QDSKINTERVAL when the votes from the quorum disk are required.
Given this environment could well involve a network problem, acquiring host-based votes without depending on the quorum disk may not be a sufficient salve for the instabilities. Clusters tend to be unstable when the network is unstable, and partitions (and hangs) can arise.
05-18-2010 06:39 PM
Re: Changing votes in a cluster without a cluster reboot
In the new setup you would have 3 nodes and a quorum disk.
A quorum disk is generally recommended for a 2-node cluster.
In the new setup, you can give 1 vote to each of the 3 nodes and do away with
the quorum disk.
In this case,
Each node would have Votes = 1
Quorum = 2
Hence, if any two of the nodes are up, the cluster would be up.
Even in the new setup, do you want a configuration in which the cluster
stays up even if only one node is up?
Regards,
Murali
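To make the trade-off in this suggestion concrete, here is a small Python sketch. It assumes the same OpenVMS rule, quorum = (EXPECTED_VOTES + 2) / 2 rounded down, and that EXPECTED_VOTES is the sum of all votes. The names are hypothetical.

```python
def quorum(expected_votes: int) -> int:
    """Quorum derived from EXPECTED_VOTES: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2

votes_per_node = 1
nodes = 3
expected_votes = nodes * votes_per_node   # 3 votes in total, no quorum disk
q = quorum(expected_votes)                # 2

# Walk down from all nodes up to a single survivor.
for survivors in range(nodes, 0, -1):
    held = survivors * votes_per_node >= q
    print(f"{survivors} node(s) up: {'quorum held' if held else 'quorum lost'}")
```

With three one-vote nodes and no quorum disk, a single surviving node holds 1 vote against a quorum of 2, so it would hang waiting for quorum.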
05-18-2010 07:10 PM
Re: Changing votes in a cluster without a cluster reboot
The following link talks about OpenVMS cluster concepts.
http://h71000.www7.hp.com/doc/731final/4477/4477pro_002.html
This talks about Votes/Quorum/Quorum disk/Quorum disk watcher
and also rules for Specifying Quorum.
Regards,
Murali
05-18-2010 07:16 PM
Re: Changing votes in a cluster without a cluster reboot
In particular, for cluster configurations that are intended to survive the loss or downing of comparatively large numbers of hosts. A three-node cluster that can survive with two of its nodes down, for instance, which parallels this case.
With hardware RAID underneath the quorum disk and with comparatively short polling timers, these configurations do work nicely for the specific requirements.
With a short timer value, the cluster will have three (polling at 1 second intervals) to six (at 2) to nine (at 3) seconds for a transition if/when the votes from the quorum disk are needed.
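The timings quoted above follow from the 3*QDSKINTERVAL rule of thumb given earlier in the thread. A minimal Python sketch, where the function name is hypothetical and the result is only the approximate delay as described in this thread:

```python
def qdisk_transition_delay(qdskinterval_seconds: int) -> int:
    """Approximate transition delay per the thread: 3 * QDSKINTERVAL."""
    return 3 * qdskinterval_seconds

for interval in (1, 2, 3):
    delay = qdisk_transition_delay(interval)
    print(f"QDSKINTERVAL={interval}s -> about {delay}s")
```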
05-18-2010 07:32 PM
Re: Changing votes in a cluster without a cluster reboot
>> A three node cluster that can survive with two of the nodes down, for
>> instance, and as parallels this case.
Yes, I get your point. You have given a good example of why a quorum disk
would be required in general (for any set of nodes in a cluster).
And this is exactly the reason why Carleen is using the quorum disk in the
cluster setup.
Regards,
Murali
05-23-2010 02:12 PM
Re: Changing votes in a cluster without a cluster reboot
With 4 Alphaservers, set expected votes to 2 and the failure of one machine would mean that at least 2 others need to be present to form a cluster. If you have a greater number of machines all with equal votes, then just divide the number of nodes by 2 to get your expected votes.
If you have an odd number of machines and the loss of 1 leaves you with the potential to have two "disjoint clusters" with an equal number of nodes, just push expected votes up by 1. (Or round up after dividing your total nodes by 2.)
The only time that I see quorum disks as useful is when you have just a two-node cluster and want them to always function as a cluster even when just one node is operating.
The above relies on having an equal number of votes from all machines, which might not be true if you have a mix of hardware/capabilities. Just be aware that the aim is to avoid a disjoint cluster and then do your calculations about the node combinations that you would be prepared to accept (e.g. loss of big machine means that two smaller machines must be present to take the load.)
05-23-2010 05:36 PM
Re: Changing votes in a cluster without a cluster reboot
Usually because you want the cluster to continue running with fewer hosts present than would otherwise be possible with the traditional configuration.
> Isn't the intention of the quorum disk to prevent a cluster being divided into 2 or more parts, each of which "thinks" it's a cluster in its own right?
That's the quorum scheme you're thinking of, and cluster partitioning. The quorum disk is one component of that scheme.
The quorum disk is a cheap and surprisingly effective surrogate for a voting node.
If you're curious about this stuff, run the math for the quorum values and the configurations for the degraded-cluster cases; calculate the quorum for the various configurations.
05-23-2010 08:09 PM
Re: Changing votes in a cluster without a cluster reboot
>> Isn't the intention of the quorum disk to prevent a cluster being divided into 2
>> or more parts, each of which "thinks" it's a cluster in its own right?
This is what I had in mind too: the quorum disk being useful to prevent cluster
partitioning, or in the case of a 2-node cluster configuration. But if you look at the
response from Hoff and also at Carleen's requirement, the overall
requirement is that even if only one node in the cluster is up, the whole cluster
should be up.
Carleen's proposed changes:
Quorum disk : 3
Node A : 1
Node B : 1
Node C : 2
Quorum = 4
In this case, even if two nodes go down, the votes of the 3rd node combined
with those of the quorum disk would meet the cluster quorum, and hence the cluster
would be up.
Hence, the use of the quorum disk is to ensure that even if multiple nodes go down
(worst case: only one node is up), the cluster would still be up.
This would not be possible if there were no quorum disk.
Regards,
Murali
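The cases described above can be enumerated with a short Python sketch. It assumes quorum = (EXPECTED_VOTES + 2) / 2 rounded down, with EXPECTED_VOTES equal to the sum of all votes (7 here). The helper names are hypothetical.

```python
from itertools import combinations

def quorum(expected_votes: int) -> int:
    """Quorum derived from EXPECTED_VOTES: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2

node_votes = {"A": 1, "B": 1, "C": 2}
qdisk_votes = 3
q = quorum(sum(node_votes.values()) + qdisk_votes)   # quorum(7) == 4

def has_quorum(up_nodes, qdisk_reachable: bool) -> bool:
    """True if the surviving nodes (plus the quorum disk, if reachable) meet quorum."""
    votes = sum(node_votes[n] for n in up_nodes)
    if qdisk_reachable:
        votes += qdisk_votes
    return votes >= q

# Every non-empty set of surviving nodes, with and without the quorum disk.
for size in range(1, len(node_votes) + 1):
    for up in combinations(sorted(node_votes), size):
        print(up, "with qdisk:", has_quorum(up, True),
              "| without qdisk:", has_quorum(up, False))
```

Under these assumptions, any single node plus the quorum disk reaches quorum, while without the quorum disk only all three nodes together do.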
05-24-2010 01:03 AM
Re: Changing votes in a cluster without a cluster reboot
I am in agreement with Hoff.
From my perspective, there are two simultaneous questions:
- How does one prevent the cluster from partitioning into two parallel clusters?
- What is the minimum set of nodes needed to ensure the continuation of cluster operations at a minimal level?
The conventional answer of "A quorum disk iff (and often ONLY IF) there are precisely two nodes in the cluster" is adequate, but not complete. It is correct in that the two-node cluster requires a quorum disk to allow cluster operations to continue unimpeded when only a single node is operational.
However, the two-node case is not the entire story. A quorum disk is also required if cluster operations must continue unimpeded when the population of cluster members reduces to either one or two nodes due to failures.
The latter case requires careful review of the votes to ensure that partitioning cannot occur.
- Bob Gezelter, http://www.rlgsc.com
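One way to carry out the kind of vote review described above is to check whether any split of the voting members could leave both sides at or above quorum. The Python sketch below is a simplification: it treats the quorum disk as an ordinary voting member that ends up on one side of a split, and it assumes quorum = (EXPECTED_VOTES + 2) / 2 rounded down. All names are hypothetical.

```python
from itertools import combinations

def quorum(expected_votes: int) -> int:
    """Quorum derived from EXPECTED_VOTES: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2

# Votes from this thread; QDISK is modeled as a member for simplicity.
members = {"A": 1, "B": 1, "C": 2, "QDISK": 3}

def can_partition(votes: dict, q: int) -> bool:
    """True if some split leaves both halves with at least q votes."""
    names = sorted(votes)
    for size in range(1, len(names)):
        for group in combinations(names, size):
            rest = [n for n in names if n not in group]
            if (sum(votes[n] for n in group) >= q
                    and sum(votes[n] for n in rest) >= q):
                return True
    return False

total = sum(members.values())                      # 7
print(can_partition(members, quorum(total)))       # False: quorum 4 of 7 votes
print(can_partition(members, quorum(4)))           # True: quorum 3 is too low
```

With expected votes set to the true total of 7, no two disjoint groups can both hold 4 votes. If expected votes were understated (4 in the second call, an illustrative value), two groups could each reach quorum.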
05-24-2010 06:28 AM
Re: Changing votes in a cluster without a cluster reboot
To clarify Murali's statement: the use of the quorum disk trades off slightly longer cluster transition times to avoid the (usually much longer) operator-initiated quorum transitions. (Three times your quorum disk polling interval, which means you can transition in, say, 3 to 9 seconds with a quorum disk with suitably stable networks and associated quorum-related system parameters; with polling at, say, 1 to 3 seconds.)
An experienced and savvy operations staff can degrade a cluster without the use of a quorum disk, but will need to adjust votes on the way down and must manually remove the blade-guards that avoid partitioning when dropping into or below a quorum=2 configuration.
And the manual transitions are for naught if you can't sequence orderly host shutdowns for whatever reason.
Both the quorum hangs and these manual quorum adjustments can adversely affect applications. During the transition and prior to the manual input, processes that need quorum (which is most of them) will be blocked by the OpenVMS scheduler. (In the more ancient of times, IPL was used to block activity during the quorum hang. That changed at V5.2.)
05-24-2010 08:24 AM
Re: Changing votes in a cluster without a cluster reboot
One utility that allows you to force quorum to be recalculated on the fly is Availability Manager. See http://h71000.www7.hp.com/openvms/products/availman/index.html. AM won't help force quorum when rebooting a cluster.
05-24-2010 03:12 PM
Re: Changing votes in a cluster without a cluster reboot
Personally I think the minimum acceptable number of nodes is an issue that's defined according to business requirements. If the same applications were available on all nodes of a 4-node cluster, then there'd be a lot of redundancy if provision was made for 3 nodes to fail, leaving 1 node to run all tasks for that application.
Maybe applications are only available on certain nodes, in which case the failure of some nodes would disable certain applications, which is fine if you can live with it, but maybe some applications are critical.
It all comes down to a business issue of what services should continue when there's a partial failure, bearing in mind that the failures could be any nodes. A totally homogeneous cluster (all applications available on all nodes) has advantages in the case of node failure. Setting expected votes to just over half of the total machines is a reasonable general policy for a homogeneous environment because, unless you have a lot of redundancy, the remaining nodes will likely be getting overloaded because they'll have double the workload.
The important issue is to configure the votes and expected votes in a way that corresponds to the requirements of the business.
05-25-2010 06:50 AM
Re: Changing votes in a cluster without a cluster reboot
This cluster should, indeed, remain up with only 1 node (and the QDSK votes). As John indicated, the business requirements dictate this setup.
The cluster resides in a lights-out data center, so immediate operator intervention to force a recalculation of quorum is not an option. We have no operations staff at all, savvy or otherwise. Any response to a cluster problem is via a page to an on-call person, and off-hours response takes 10-15 minutes at best, which is too long to rely on manual intervention.