Long Distance Cluster
Respected Contributor

I'm currently testing a two-node cluster with the nodes separated by about 1200 miles. Each node is a DS25 running OpenVMS 7.3-2, with gigE and MSA1000 storage. There is no quorum disk or quorum system (2 votes local, 1 vote remote).

The local node is "production" (developers actually, since this is just testing right now) and the remote node is a DR node. There are no users and bare minimum processes on the remote node.

I'm currently shadowing two volumes between the two nodes.

I'm trying to squeeze every bit of performance out of the cluster and would welcome any suggestions from people who've implemented long distance clusters.

One question I currently have is regarding LOCKDIRWT. I currently have LOCKDIRWT set to 1 on both nodes. I'm thinking of setting LOCKDIRWT to 0 on the remote node. Opinions?
Honored Contributor

Re: Long Distance Cluster

Wander over to DECUServe, and take a read through 2646.* in the HARDWARE_HELP notes conference. It's very similar to your question.

If the remote is for recovery only, I'd tend to keep the locks and the votes on the primary node. Given typical activity, the primary is where the locks will likely end up anyway, though there are some enhancements in V8.3 (LOCKRMWT) that help reduce any lock flailing that might arise. (I'd probably only tweak LOCKDIRWT if I found the locks were being remotely mastered with any regularity.)

As for votes, there's no point in having a two and one voting configuration -- make it one and zero.
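To put numbers on that, here is a minimal sketch. It assumes the usual OpenVMS rule quorum = (EXPECTED_VOTES + 2) / 2 with integer truncation, and that EXPECTED_VOTES is set to the sum of all votes. The function names are just for illustration.

```python
def quorum(expected_votes: int) -> int:
    """Quorum as derived from EXPECTED_VOTES (integer division truncates)."""
    return (expected_votes + 2) // 2


def survives(votes_remaining: int, expected_votes: int) -> bool:
    """True if the surviving nodes still hold enough votes to keep running."""
    return votes_remaining >= quorum(expected_votes)


# (local votes, remote votes); assumption: EXPECTED_VOTES = local + remote
configs = {"2 and 1": (2, 1), "1 and 0": (1, 0)}

for name, (local, remote) in configs.items():
    total = local + remote
    print(
        f"{name}: quorum={quorum(total)}, "
        f"lose remote -> {'runs' if survives(local, total) else 'hangs'}, "
        f"lose local -> {'runs' if survives(remote, total) else 'hangs'}"
    )
```

In both configurations the primary keeps running if the remote node drops out, and the remote node hangs if the primary is lost, so the extra votes buy nothing.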
Robert Gezelter
Honored Contributor

Re: Long Distance Cluster


I would also ensure that the scratch space (SYS$SCRATCH) is pointed to a local device. Doing scratch operations across a wide area link is sure to cause a performance problem. I had some examples of that in my OpenVMS Technical Journal article, "Inheritance-based Environments for Standalone OpenVMS Systems and OpenVMS Clusters". You can get a reprint of the paper from the OpenVMS WWW site or my firm's site (http://www.rlgsc.com/publications/vmstechjournal/inheritance.html).

Since this is a DR situation, I would also pre-configure and test the alternate site's ability to continue operation. I would pre-configure the various settings in an alternate system root to minimize the handwork that is needed in the event of an emergency.

- Bob Gezelter, http://www.rlgsc.com
Martin Hughes
Regular Advisor

Re: Long Distance Cluster

Personally I would have LOCKDIRWT at 1 on both nodes, and use PE1 to stop locks moving off the primary node. Set PE1 to whatever you see as appropriate. The advantage is that PE1 is dynamic, so you can swap this around if the need arises (such as DR testing for instance).
For the fashion of Minas Tirith was such that it was built on seven levels, each delved into a hill, and about each was set a wall, and in each wall was a gate. (J.R.R. Tolkien). Quote stolen from VAX/VMS IDSM 5.2
Art Wiens
Respected Contributor

Re: Long Distance Cluster

"use PE1 to" ...

Can someone elaborate (or provide a pointer) on these "special" parameters? The System Management Utilities manual only says:

PE1, PE2, PE3, PE4, PE5, PE6 are reserved for HP use only. These parameters are for cluster algorithms and their usages can change from release to release. HP recommends using the default values for these special parameters.

Defaults are 0 for all 6.

Just curious,
Respected Contributor

Re: Long Distance Cluster

I think the simplistic definition for PE1 is that any lock tree with more locks than PE1 value is not remastered. I'm no longer sure where this was documented. Sorry.
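As a toy illustration of that simplistic definition only (this is not how the lock manager is actually implemented, and PE1 is undocumented), the rule can be written as a single comparison. Treating the default of 0 as "no size limit" is an assumption on my part, and negative values are left out because the description above doesn't cover them.

```python
def may_remaster(lock_count: int, pe1: int) -> bool:
    """Toy model: may a lock tree with lock_count locks be remastered?

    Follows the simplistic description in this post, nothing more.
    """
    if pe1 < 0:
        # Not covered by the description above, so this sketch refuses to guess.
        raise ValueError("negative PE1 is not modelled here")
    if pe1 == 0:
        # Assumption: the default value of 0 applies no size limit.
        return True
    # Trees with more locks than PE1 stay where they are.
    return lock_count <= pe1


print(may_remaster(50, 100))   # small tree, allowed to move
print(may_remaster(500, 100))  # large tree, stays put
```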
John Gillings
Honored Contributor

Re: Long Distance Cluster


Stating the very obvious... you're well over the maximum supported distance for a cluster (last time I looked it was 800 km, roughly 500 miles).

If your primary objective is performance, then LOCKDIRWT *should* be irrelevant. All activity related to a particular resource should be isolated to one lobe of the cluster. Otherwise round trip locking activity will slow things down significantly. You don't want to block lock tree migration because if you somehow end up with a tree at the wrong end, it won't ever find its way back! I'd be leaving LOCKDIRWT at 1 on both nodes.
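For a feel of what each round trip costs at this distance, here is a back-of-the-envelope sketch. It assumes signals travel through fibre at roughly 200,000 km/s (about two thirds of the speed of light) and that the cable follows the straight-line 1200 miles, so the result is a lower bound. Real routes, switches and protocol overhead only add to it.

```python
MILES_TO_KM = 1.609344
FIBER_KM_PER_S = 200_000.0  # assumption: roughly 2/3 of c in glass


def min_round_trip_ms(distance_miles: float,
                      km_per_s: float = FIBER_KM_PER_S) -> float:
    """Lower bound on round-trip time from propagation delay alone."""
    distance_km = distance_miles * MILES_TO_KM
    return 2 * distance_km / km_per_s * 1000.0


rtt = min_round_trip_ms(1200)
print(f"minimum round trip: {rtt:.1f} ms")
# A process that waits for each remote lock request before issuing the next
# cannot do better than this many requests per second:
print(f"serialized remote requests per second: {1000.0 / rtt:.0f}")
```

That works out to a little over 19 ms per round trip, or around 50 strictly serialized remote lock operations per second per process, which is why keeping lock mastering local matters so much.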

That may require some drastic measures, like not mounting volumes across the wire. Your shadowsets are an obvious exception, but you should also make sure that no processes on the remote end touch them, and ALL mounts and shadow operations are initiated from the production end. Similarly all production activity should be started on the production end. This should help isolate the locking.

Run MONITOR DLOCK at both ends. Look for any remote locking TO the DR end, try to work out what is causing it and, if possible eliminate it, especially if you're seeing processes in RWSCS state. Use SDA to confirm that locks are mastered where you want them. More generally, make sure you understand the locking behaviour of your application.

You need to think very carefully about the sequence for failing over to the other end, and the fail back (how to make sure the lock trees move back?). How are you going to deal with quorum issues? Are there any circumstances where you fail the application over WITHOUT losing the primary node? If you want to maintain performance you will almost certainly have to sacrifice some recovery time.

Don't paint yourself into a corner thinking about only one node at each site. Make sure whatever you do will scale to multiple nodes. Think in terms of limiting the amount of lock traffic on the LONG wire, while allowing capacity to expand locally.

re: Art's question about PE1 - yes it's a reserved parameter. It's been used as a hack fix for lock tree thrashing, as Edgar described. It wasn't documented because it's always been a short term workaround, awaiting a better, more permanent fix. We now have that with LOCKRMWT in V8.3.
A crucible of informative mistakes