Operating System - OpenVMS

Re: Using Availability Manager on Windows to adjust VMScluster quorum

 
Clyde 10
Occasional Advisor

Using Availability Manager on Windows to adjust VMScluster quorum

Please confirm that the following Availability Manager setup will work for adjusting quorum:

- OpenVMS IA64 cluster with collector on each node
- Windows 2003 server num1 on same level 2 LAN as cluster nodes, with server and analyzer installed.
- Windows 2003 server num2 on a different IP segment from the cluster nodes, with analyzer installed.

Users could use either Windows 2003 server num1 or num2 to adjust quorum.

Using num1, the analyzer would communicate with a cluster node over the AMDS protocol to adjust quorum.

Using num2, the analyzer would communicate with the server running on num1, which would then communicate with the cluster node to adjust quorum.

Thanks.

Andy Bustamante
Honored Contributor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

See http://h71000.www7.hp.com/openvms/products/availman/docs.html and http://h71000.www7.hp.com/openvms/products/availman/6552install.pdf specifically. This is a valid configuration; however, I see one potential issue. The documentation states Windows 2000 or Windows XP.

I don't see Windows Server 2003 listed as a supported operating system. I used to run AM on a Windows Server 2000 system and connect via VPN. I'd expect this to just work, but it may not be supported.

You're also allowing users to make fixes from more than one management station. I would expect VMS to handle this gracefully, but I would probably set up a single point of access for users (sounds like your num2 system).

Andy
If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
Steve Reece_3
Trusted Contributor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

I'd be cautious before saying that the system on another LAN segment (presumably another subnet?) will work. Last I read (which is some time ago I'll admit) the protocols involved with Availability Manager weren't routable so you needed the machines to be on the same subnet or it wouldn't work.

This makes a degree of sense when you consider what happens with an inquorate cluster - all user processing stops and waits for the cluster to become quorate again. This includes all the TCP/IP stack processes. AvailMan needs to go in at driver level to intervene so can't rely on the TCP/IP stack to get itself out of the situation it's in.

I've not seen any documentation suggesting that AvailMan nodes can forward on fixes to other nodes which is what's being suggested in using num2 analyser to communicate with num1 running server to communicate with the cluster.

Maybe a better way would be to use num2 to remote desktop to num1 server and do the fixes to quorum from the num1 server?
Steve
Hoff
Honored Contributor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

This question is sufficiently odd that there's almost certainly (far?) more here than is currently known.

The immediate response (and one sufficiently obvious that there's probably something here preventing its use) is to VPN into the LAN, RDP or VNC to the Windows box, and be done with it.

I'd not look to extend management outside the OpenVMS LAN segment, regardless of whether it's feasible or whether the TLS works here.

And yes, then there's the whole what-runs-on-what question with Windows. Try it. Let us know. The data server forwarding (yesh; yet another platform-specific and client-server protocol in VMS) is intended to provide this.

Why is the question odd? Beyond the sequencing and the Windows stuff, if you even need to do this quorum reset, then this cluster is in need of some work or mayhap a quorum node. I'm going to guess either a misconfigured cluster or a two-node configuration. Details on that would be interesting, too.
MichelleP_1
Advisor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

Steve, WAN support was added in Availability Manager V3.0-2. It adds a 3rd component, the "Data Server" that is run on a system in the same LAN as the hosts running the Data Collector. Then Data Analyzer can connect to a Data Server over the WAN.

I've been using this version (well, the 3.0-1 & 3.0-2 beta versions) for at least a year and a half, but fortunately haven't had to do any "fixes" over the WAN, especially not quorum adjustment.

Clyde, your configuration sounds right to me (without getting into what Andy pointed out re: Windows versions, but I wouldn't expect an issue there). As Hoff suggests, give it a try. You could try one of the other "fixes", like adjusting a quota on a process, and watch the Data Server log.

I was thrilled when they added the WAN support. In many environments the system manager does not have direct access to a Windows system that is on the same LAN as the VMS systems. This allows the system manager to run the analyzer on their desktop. Personally, I run the Data Server on a VMS system management node.
Bart Zorn_1
Trusted Contributor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

I have been running the beta versions of AM 3.0. The Server runs on a DS15 which sits on a dedicated VLAN for all the OpenVMS nodes, and the analyzer runs either on a DS15 workstation or on a Windows XP laptop far away from the datacenters. I have done many fixes this way, including adjusting process quota, killing runaway processes and adjusting cluster quorum.

Great tool!

Unfortunately, I lost my contract before V3.0-2 came out.

Regards,

Bart
Shubhabrata Bose
New Member
Solution

Re: Using Availability Manager on Windows to adjust VMScluster quorum

Hi Clyde_10,

Here are my inputs:
- OpenVMS IA64 cluster with collector on each node
- Windows 2003 server num1 on same level 2 LAN as cluster nodes, with server and analyzer installed.
- Windows 2003 server num2 on a different IP segment from the cluster nodes, with analyzer installed.

Users could use either Windows 2003 server num1 or num2 to adjust quorum.
>>> Quorum adjustment is possible from whichever kit acts as the Analyzer ("num2" in this case), because only the Analyzer contains the GUI screens for adjusting quorum, whereas the Server runs only as a terminal.

Thanks,
Shubh,
AM Engineering,
Peter Zeiszler
Trusted Contributor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

I would definitely make sure to control WHO can log in to num1 and num2 so that you don't get mysterious quorum adjustments.
MichelleP_1
Advisor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

With the server running, one could run the analyzer on any workstation (provided they had the proper key), not just your two Windows servers. So (as I'm sure you have anyway) make sure you use the combination of keys and the Data Collector password to provide or restrict Read/Write and Read-only access.
Clyde 10
Occasional Advisor

Re: Using Availability Manager on Windows to adjust VMScluster quorum


I wish to clarify my environment and needs a little more here...

- OpenVMS IA64 dual-site cluster with two IA64 nodes at each site, with voting set up as follows:
Site A (secondary site) - each node has one vote
Site B (primary site) - one node has one vote, one node has TWO votes
Expected votes = 5
Quorum calc: (expected votes + 2)/2, rounded down = 3
Scenarios:
- Site B will run when the sites cannot reach each other, and A will hang. Load will run at B, without intervention.
- Site B will run if A is down. Load will run at B, without intervention.
- Site A will hang if B is down. NEED INTERVENTION HERE. <----
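
The vote arithmetic behind these scenarios can be checked with a short sketch. It assumes the usual OpenVMS rule, quorum = (EXPECTED_VOTES + 2) / 2 truncated to an integer; the function and variable names are made up for illustration and are not part of any OpenVMS or Availability Manager interface.

```python
def quorum(expected_votes: int) -> int:
    """Quorum under the assumed OpenVMS rule: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2


EXPECTED_VOTES = 5

# Votes per site as described in the post (names are illustrative only).
SITE_VOTES = {
    "A": 1 + 1,  # secondary site: two nodes, one vote each
    "B": 1 + 2,  # primary site: one node with one vote, one node with two
}


def site_keeps_running(site: str) -> bool:
    """True if the site alone still holds at least quorum."""
    return SITE_VOTES[site] >= quorum(EXPECTED_VOTES)


for site, votes in SITE_VOTES.items():
    state = "runs" if site_keeps_running(site) else "hangs"
    print(f"Site {site} alone: {votes} votes, quorum {quorum(EXPECTED_VOTES)}, {state}")
```

With these numbers Site B alone meets quorum and Site A alone does not, which matches the three scenarios listed.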

Other than the OpenVMS cluster, there are no other OpenVMS servers to rely on for Availability Manager. So supported Windows solutions/capabilities have to be explored and weighed to decide on a go-forward solution to bring up Site A under duress.

Each OpenVMS node will run the Availability Manager collector.

Method 1:
- Windows server num1 on same level 2 LAN as cluster nodes, with Availability Manager Server and Availability Manager Analyzer installed. User would log into Windows server num1 and go into the Analyzer. User could then adjust quorum on the cluster, since the analyzer would communicate with the OpenVMS node collector code directly.

Method 2:
- Windows server num2 on different IP segment (not level 2 bridged to same network as cluster nodes), with only the Availability Manager Analyzer installed. User would log into Windows server num2 and go into the analyzer, which would communicate with the server process on Windows server num1, and then be able to adjust quorum on the cluster. The details of this assumption are that the command to adjust quorum is entered on the analyzer GUI, gets passed via TCP/IP transport to the server process on Windows server num1, and then is passed to the Availability Manager code running on the cluster in order to free the cluster at site A to run.

Users could use either Windows 2003 server num1 or num2 to adjust quorum.

I believe what was stated here says it will work. The version of Windows to use is fuzzy, and I need to get official confirmation/blessing of 2003 or 2008 support.

Thanks to all.
Hoff
Honored Contributor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

If you'd prefer, you can throw all the complexity out here. Simplify this configuration and just use the MP console. Get rid of Windows. Of AMDS. Of Availability Manager. Of the set-up and the network protocols. All of it. Gone.

Connect to the MP console, and invoke the IPC handler by entering a ^P on the console, and cancel the quorum hang.

Done.

http://labs.hoffmanlabs.com/node/1195
Clyde 10
Occasional Advisor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

That is our fallback if needed. HP Cluster support said that the recommended/supported way is to use Availability Manager. The biggest reason is that if you don't recover the cluster fast enough, it causes the node to exit the cluster. This has happened in the past during our DR tests, so that is why we went with Availability Manager as the solution, especially after support told us this was the recommended way.

When you hit Ctrl-P on the console, you have to recover quorum within a certain time. With Availability Manager this is not a concern, since it doesn't use the console to adjust quorum. It does actually take longer using Availability Manager, but there is no chance of an issue caused by not executing the recovery quickly enough, as can happen on the console. From an Operations point of view it is acceptable for the recovery to take a little longer, but a node exiting the cluster would take too long to recover from.
Andy Bustamante
Honored Contributor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

Your procedure for using AM needs to cleanly define at what point this option is selected, and the coordination between the two sites. You don't want both sites running individually if the links are down, for example.

You may also have a situation where only site B is active but one node fails.

An alternate configuration may be to have a num1 at site A and a num2 at site B, both running the AM server and analyzer modules. Install the client at authorized workstations.

Andy Bustamante

If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
John Gillings
Honored Contributor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

Clyde,

>The biggest reason is that if you don't
>recover the cluster fast enough it causes
>the node to exit the cluster.

I suspect this is a misunderstanding. The CLUEXIT bugcheck occurs when surviving cluster nodes lose connection to a node, which then attempts to reconnect to the cluster after more than RECNXINTERVAL seconds. The node attempting to reconnect is considered to have been missing too long, and is not permitted to rejoin. Default is 20 seconds, which can be increased for multi-site clusters, but rarely to more than 60 seconds.

CLUEXITs don't necessarily happen if quorum is lost. All nodes freeze until quorum is regained. During the freeze, nodes cannot detect that another node has been lost, so RECNXINTERVAL isn't relevant. Of course, the node whose loss resulted in quorum loss won't be allowed to reconnect, but that has nothing to do with the time to recovery.

The nodes which were cluster members at the time of the freeze will all be members when quorum is regained, regardless of the time interval from the freeze. The node which caused the quorum loss will CLUEXIT if it takes more than RECNXINTERVAL seconds to regain quorum and reconnect, but then if it had a connection you wouldn't need to adjust quorum, because the votes would become available again.
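
As a rough illustration of the rule described above, the sketch below models only the reconnect decision: a node that has been out of contact for longer than RECNXINTERVAL is refused and takes a CLUEXIT. This is a simplified toy model, not OpenVMS connection-manager code; the function name and return strings are hypothetical, and the 20-second default is the figure quoted above.

```python
RECNXINTERVAL = 20  # seconds; the default quoted in this post


def reconnect_outcome(seconds_out_of_contact: float,
                      recnxinterval: float = RECNXINTERVAL) -> str:
    """Toy model: what happens when a lost node tries to reconnect.

    Only the interval check is modeled. A frozen (inquorate) cluster is not
    modeled here, since the surviving nodes are not timing anyone out then.
    """
    if seconds_out_of_contact > recnxinterval:
        return "CLUEXIT"  # missing too long, not permitted to rejoin
    return "rejoins"


print(reconnect_outcome(10))                    # within the default interval
print(reconnect_outcome(45))                    # beyond the default interval
print(reconnect_outcome(45, recnxinterval=60))  # interval raised for multi-site
```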

If I've misunderstood what you mean, please explain the sequence of events you think you're dealing with.
A crucible of informative mistakes
Clyde 10
Occasional Advisor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

There are two sites, and when SITE B is down, dead and never to be heard from again, then SITE A, with two nodes totalling only 2 votes, will not have enough votes to have quorum and run.

At SITE A, if the user connects to the MP console and invokes the IPC handler by entering a ^P on the console, that will start the RECNXINTERVAL timer running. So unless the user acts promptly to adjust the QUORUM, the CLUEXIT looms, and the horror of two sites both not processing could result. Using AM apparently will neatly adjust the QUORUM without diverting the codestream out of the OpenVMS cluster code, which is keeping a wary eye on the timer settings and its appropriate next response.

Thanks.
Hoff
Honored Contributor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

If the ^P cancellation within the interval is too much for the folks (ok, certainly and entirely your call), then you could boot up your local Mr Big Votes simh VMS box on the LAN. Your very own VAX quorum bomb.
Clyde 10
Occasional Advisor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

Using simh on PC, to simulate VAX hardware for OpenVMS to run on, would provide votes to boost the quorum, but in our case OpenVMS/VAX could not cluster with OpenVMS/I64. Otherwise, this would be an interesting option to look into.

Thank you.
Hoff
Honored Contributor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

Quoting the SPD: "VAX and Integrity servers can exist in the same cluster ONLY for temporary migration purposes."

I'd think that booting a quorum bomb would count as a temporary solution.
Volker Halle
Honored Contributor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

Clyde,

OpenVMS VAX can cluster with OpenVMS I64 just fine. Just don't use V5.5-2 or lower; those versions will crash with CLUSWVER 'Software version incompatible with existing VMScluster' when trying to join the cluster.

For just providing a cluster vote, you don't need many shared resources between the OpenVMS I64 systems and the 'virtual VAX'.
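
For sizing such a vote-only node, here is a minimal sketch of the arithmetic. It assumes the usual OpenVMS rule (quorum = (EXPECTED_VOTES + 2) / 2, truncated) and, importantly, that the cluster's expected votes stays at 5 after the extra node joins; in practice that depends on the VOTES and EXPECTED_VOTES settings of the joining node. The helper names are hypothetical.

```python
def quorum(expected_votes: int) -> int:
    """Quorum under the assumed OpenVMS rule: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2


def extra_votes_needed(surviving_votes: int, expected_votes: int) -> int:
    """Votes a helper node must contribute for the survivors to reach quorum.

    Assumes expected_votes does not change when the helper node joins.
    """
    return max(0, quorum(expected_votes) - surviving_votes)


# Site A alone holds 2 votes against expected votes of 5 (figures from this thread).
print(extra_votes_needed(surviving_votes=2, expected_votes=5))
```

Under those assumptions a single vote from the emulated node would be enough to let Site A's two nodes reach quorum.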

We have been running a CHARON-VAX in our cluster, and a PersonalAlpha as well, for more than 5 years. Works fine.

Volker.
Clyde 10
Occasional Advisor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

Hoff: Thanks for noting that.
Yes, then simh/VAX emulation running OpenVMS is a solution for freeing a hung cluster.

Thanks to all.
Andy Bustamante
Honored Contributor

Re: Using Availability Manager on Windows to adjust VMScluster quorum

FYI, you may also consider Charon Alpha (http://www.stromasys.ch/products/) should your requirements force a fully supported product.

Regarding clustering VAX and IA64, in the words of a stealthy VMS Wizard: it's not going to be tested, but we'd have to put checks in to keep it from happening.

If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net