Replacing quorum disk on TruCluster, all nodes are down

SOLVED
George Denyer_1
Frequent Advisor

Hi,

I hope someone still reads this forum...

I'm facing a problem with a two-node cluster that was completely shut down for maintenance. When it was started up again, the quorum disk was damaged.

It should be very simple to change the quorum disk, but I have some doubts.

I was able to start one node without quorum using the steps from the "TruCluster Server Handbook" (http://flylib.com/books/en/2.387.1.154/1/): basically, booting with "boot -fl ai" and then entering these options: "vmunix clubase:cluster_expected_votes=1 clubase:cluster_qdisk_votes=0 clubase:cluster_qdisk_major=0 clubase:cluster_qdisk_minor=0".

What I'm not sure about now is whether I should go through the quorum disk replacement procedure while the other node is down. I tried to boot the other node using the same steps; it got as far as forming the cluster, but then it hung and never finished booting.

The TruCluster Server Handbook has a procedure for replacing a failed quorum disk, but it assumes the failure happens while the cluster is up (http://flylib.com/books/en/2.387.1.195/1/).

My questions are then:
1) Should I use this procedure with only one node up?
2) If the answer to 1) is no, then how should I start the second node?

Any suggestions are appreciated. Thanks.

George

5 REPLIES
Mark Garrett
Advisor

Re: Replacing quorum disk on TruCluster, all nodes are down

I don't see how you are "facing" a problem; you either have a problem or you don't.

What happened when you just booted both nodes with the dead/unavailable quorum disk?

Which versions are we dealing with here (OS and cluster)?
George Denyer_1
Frequent Advisor

Re: Replacing quorum disk on TruCluster, all nodes are down

If I didn't have a problem, there would be nothing to face, would there?

This is Tru64/TruCluster v5.1B.

The message I got was:

CNX QDISK: Cluster quorum disk 19,82 has become unavailable due to a write error (status 60).
CNX QDISK: Cluster quorum disk 19,82 has become available again.

The nodes just kept repeating this message.

I found out that the problem was persistent reservations on the cluster's volumes, including the quorum volume. I booted a local installation and cleared them with cleanPR, and everything worked again. The person who shut down the cluster probably didn't do it right.

Still, if you know the answers to the two questions, I'm interested in knowing them.

Anyway, thanks for your help.

George

Pieter 't Hart
Honored Contributor
Solution

Re: Replacing quorum disk on TruCluster, all nodes are down

Read the docs about quorum carefully.

Quorum is used to keep the nodes from running the same cluster "separately".
Quorum is calculated from the votes of each node. For this calculation, the quorum disk is treated as a node.
If properly configured, a node in its normal state cannot boot alone; it needs another node to get enough votes for quorum.

A two-node cluster can boot with enough votes to form the cluster without using the quorum disk.

- So you can boot one node + quorum disk.
- You should also be able to boot two nodes without the quorum disk.
- And of course two nodes + quorum disk.

The first node boots and waits to form the cluster until a quorum disk or another node is detected, giving enough votes for quorum.
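The vote arithmetic behind those three cases can be sketched in a few lines of Python. This is only an illustration, not something to run on the cluster: it assumes the quorum formula given in the TruCluster v5.x documentation, round_down((cluster_expected_votes + 2) / 2), and it assumes one vote for each node and one for the quorum disk. The function names are made up for the example.

```python
def quorum_votes(expected_votes: int) -> int:
    """Votes needed for quorum, assuming the documented TruCluster formula
    round_down((cluster_expected_votes + 2) / 2)."""
    return (expected_votes + 2) // 2


def has_quorum(current_votes: int, expected_votes: int) -> bool:
    """True if the votes currently present are enough to form the cluster."""
    return current_votes >= quorum_votes(expected_votes)


# Assumed vote assignment: node A = 1, node B = 1, quorum disk = 1.
EXPECTED = 3

scenarios = {
    "one node alone": 1,
    "one node + quorum disk": 2,
    "two nodes, no quorum disk": 2,
    "two nodes + quorum disk": 3,
}

for name, votes in scenarios.items():
    state = "forms cluster" if has_quorum(votes, EXPECTED) else "waits for quorum"
    print(f"{name}: {votes} of {quorum_votes(EXPECTED)} votes needed, {state}")
```

Under the same assumed formula, booting with cluster_expected_votes=1 and cluster_qdisk_votes=0 drops the requirement to a single vote, which is why one node can come up by itself that way.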

With both members up you can reassign the quorum disk.

Does this answer your question?
George Denyer_1
Frequent Advisor

Re: Replacing quorum disk on TruCluster, all nodes are down

Thanks Pieter, you pretty much answered my questions.
George Denyer_1
Frequent Advisor

Re: Replacing quorum disk on TruCluster, all nodes are down

Solved.