Re: Using LIB$BBSSI and BBCCI for locking

John McL · ‎05-04-2009

I'd like to move certain code from using VMS locks to using LIB$BBSSI and LIB$BBCCI. (In this case the locking operates on the same node, not cluster-wide on our 4-node cluster.) I have an old copy of the internals book for VAX/VMS V5.2, and it might not be entirely accurate for Alpha or IA64, so I want to throw some questions to the forum.
(a) What are the overheads?
(b) What's the performance gain over normal Locks (quantified, not subjective e.g. about 100 x faster)?
(c) Any gotchas I should be aware of?

We run several hundred images and have a large number of users, so contention can be a real issue.

John Gillings · ‎05-04-2009

Hi John,

I assume you're talking about implementing your own spin locks?

LIB$BBSSI works the same on all architectures, but the underlying instructions are different.

Performance gains or losses depend on contention. If you have very little contention and short critical regions (that is, most requests for the lock are granted immediately and locks are only held for a short time), then spinlocks can be exceptionally fast. But if you have high levels of contention and/or long critical regions, then performance can be terrible, with waiting processes burning CPU. The more processes that join the mix, the worse things get.
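[Editor's sketch] To make the trade-off concrete, here is roughly what a busy-wait bit lock looks like. This is a minimal sketch, assuming C11 `atomic_flag` as a portable stand-in for the bit that LIB$BBSSI and LIB$BBCCI would operate on; it does not call the VMS routines, and the names `bitlock_t`, `bitlock_try`, `bitlock_acquire` and `bitlock_release` are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One lock bit. atomic_flag is a stand-in (assumption) for the bit a
   LIB$BBSSI / LIB$BBCCI pair would set and clear. */
typedef struct {
    atomic_flag bit;
} bitlock_t;

#define BITLOCK_INIT { ATOMIC_FLAG_INIT }

/* Single attempt: true if the bit was clear and is now ours. */
bool bitlock_try(bitlock_t *l)
{
    assert(l != NULL);
    return !atomic_flag_test_and_set_explicit(&l->bit, memory_order_acquire);
}

/* Busy-wait until the bit is ours. Returns the number of failed attempts,
   which is CPU time burned under contention. */
unsigned long bitlock_acquire(bitlock_t *l)
{
    unsigned long spins = 0;
    while (!bitlock_try(l))
        spins++;
    return spins;
}

/* Clear the bit so the next spinner can get in. */
void bitlock_release(bitlock_t *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->bit, memory_order_release);
}
```

With no contention `bitlock_acquire` returns 0 after a single interlocked operation. With a long critical region every waiter sits in that `while` loop at full CPU, which is the "terrible" case described above.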

It's impossible to give you a simple number as it's entirely dependent on load and contention.

Gotchas? There are lots and lots of them. Just looking at one - priority equalisation. Spinlocks don't work well with processes at different priorities. On uniprocessors this is fatal, as a higher priority requesting process will starve out a lower priority process holding the lock. Deadlock and dead system. On multiprocessors this isn't necessarily fatal, unless you get a low priority process holding the lock and N higher priority processes requesting it (where N is the number of available processors), but you can still end up with strange behaviour.

To work around this you should equalise priorities while requesting the lock, so there's your first overhead. Two calls to $SETPRI for each lock request. If you decide to skip this, assuming all your processes will always be at the same priority, you'd better make sure it's clearly and LOUDLY documented, or someone somewhere down the track will start a batch job at priority 3 and everything will break!

NUMA can also do odd things, causing asymmetries.

Typically synchronisation mechanisms are layered, so that the lowest level, "busy wait" mechanisms are used only for very short duration locking of data structures that implement higher level mechanisms like semaphores or VMS style locks. You can use this principle to build your own mechanisms, but you'll soon discover you're just reimplementing the lock manager.

This is non-trivial stuff... The worst issue is you won't necessarily know if there's some tiny timing window waiting to catch you at the least opportune time.

I'd want to see some very strong evidence that there is a real problem with the lock manager before delving into lower level synchronisation mechanisms. What do you think can be improved?

If you really want to do this, I'd recommend building a layer of code that implements an "ideal" locking API for your application, without revealing the underlying mechanism.

Implement it first using the lock manager, for simplicity and robustness. Once that's working, implement a version using spin locks and see if you get any measurable improvement.
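[Editor's sketch] One possible shape for such a layer, assuming a small table of function pointers so the backend can be swapped without touching callers. All names here (`app_lock`, `app_lock_ops`, `SPIN_BACKEND`, and so on) are hypothetical. Only a busy-wait backend is shown; a lock manager backend would fill the same table with wrappers around the $ENQ/$DEQ services and is left out because it cannot run off VMS.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct app_lock app_lock;

/* Backend interface: the only thing that changes between mechanisms. */
typedef struct {
    bool (*acquire)(app_lock *lk);   /* true on success */
    void (*release)(app_lock *lk);
} app_lock_ops;

struct app_lock {
    const app_lock_ops *ops;      /* chosen backend */
    atomic_flag bit;              /* state used by the spin backend only */
    unsigned long acquisitions;   /* simple instrumentation hook */
};

/* Busy-wait backend (stand-in for a LIB$BBSSI / LIB$BBCCI version). */
static bool spin_acquire(app_lock *lk)
{
    while (atomic_flag_test_and_set_explicit(&lk->bit, memory_order_acquire))
        ;   /* spin */
    return true;
}

static void spin_release(app_lock *lk)
{
    atomic_flag_clear_explicit(&lk->bit, memory_order_release);
}

const app_lock_ops SPIN_BACKEND = { spin_acquire, spin_release };

/* Application-facing API: callers never see which backend is in use. */
void app_lock_init(app_lock *lk, const app_lock_ops *ops)
{
    assert(lk != NULL && ops != NULL);
    lk->ops = ops;
    atomic_flag_clear(&lk->bit);
    lk->acquisitions = 0;
}

bool app_lock_acquire(app_lock *lk)
{
    bool ok = lk->ops->acquire(lk);
    if (ok)
        lk->acquisitions++;   /* updated while holding the lock */
    return ok;
}

void app_lock_release(app_lock *lk)
{
    lk->ops->release(lk);
}
```

Because callers only use `app_lock_acquire` and `app_lock_release`, the two implementations can be measured against each other under the same load before committing to either.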

A crucible of informative mistakes

Hoff · ‎05-05-2009

ITRC glitched again; this is a second attempt to post this.

I'm with John G. here.

Though there are no details on the complexity of the application, this switch-over is usually and likely a large project in an application with several hundred images and with masses of active users, and you'll need to ensure you're removing the right roadblocks.

Some idea that locks are the limiting issue here would be a requirement, and a look at moving toward sharding or toward finer granularity of the locks would (also) be in order; at re-architecting the locks required and the lock sequences. In particular, make sure you don't actually have a critical-path problem here; a case where you have critical code sequences. (qv: Amdahl's Law, et al. http://labs.hoffmanlabs.com/node/900 and http://labs.hoffmanlabs.com/node/638 among others.)
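[Editor's sketch] For readers who have not met Amdahl's Law: if a fraction s of the work is serialized (for example, done while holding one lock), the best speedup on n CPUs is 1 / (s + (1 - s) / n). A tiny helper, with the function name `amdahl_speedup` invented for illustration:

```c
#include <assert.h>

/* Upper bound on speedup when `serial_fraction` of the work cannot run
   in parallel and the rest is spread evenly over `cpus` processors. */
double amdahl_speedup(double serial_fraction, unsigned cpus)
{
    assert(serial_fraction >= 0.0 && serial_fraction <= 1.0);
    assert(cpus > 0);
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cpus);
}
```

With 10% of the work serialized, four CPUs give at most about a 3.1x speedup, and no number of CPUs can exceed 10x. Making the lock itself cheaper does nothing for that bound; only shrinking the serialized portion does.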

In addition to the priority inversion deadlocks John G mentioned, there are more direct deadlocks that can (also) arise here and you'll want or need to code deadlock scans for those. (The lock manager does these scans for you.)

Various applications use sequences of lock acquisition and conversions and releases, and use locks as notification "doorbells" in various designs; features that aren't available with bitlocks. You'll need to find any of those, and figure out how to implement the notifications.

BBCCI and BBSSI also involve the memory controller on some of the boxes; you'll need to be careful around the controller granularity and the bitlock memory locations here as you can end up with subtle lock contention. (With OpenVMS on Alpha or Itanium, you are hitting far fewer instructions than with the lock management calls, though you're getting a memory barrier or a memory fence; these calls are lighter-weight.)

This migration to bitlocks also means you're single-host from here on out, or that you are now rolling your own distributed synchronization. Or both.

For an application structural change this fundamental, I'd likely want to look at the whole design of the application, and instrument the current environment. (This work is a sizable chunk of a full platform port, in practical terms.) If a locking rewrite is on the table, then the whole design of the application is (also) on the table. And I'd also look at where I wanted to end up longer-term, whether that's an application locking layer, or a redesign of how the data rolls and roils through the application environment.

The abstraction layering John G. mentions is a classic OpenVMS application design. That's entirely reasonable here and I might well look to go further here, given the scale of the changes that are involved.

Robert Gezelter · ‎05-05-2009

John,

I have to agree with John and Hoff: Be careful. There is much potential for a large cost with debatable gain.

The first questions that I would ask are:

- Is a lot of time being spent processing locks?
- Is there a lot of contention?

If the delays are being caused by contention, then the gains by changing mechanisms are limited. The solution to contention is not to change the locking mechanism, but to take a careful look at what is protected by what lock and break that into different locks. This was seen in the changes in recent TCP/IP Services releases relating to IOLOCK8. At the user level, the concept is the same.

Performance monitoring using T4 or similar tools to gather statistics is paramount as a first step. If the performance monitoring shows that locking is an issue, the sequence of steps is:

- Tune Lock Manager performance
- Consider the use of Dedicated Lock Manager (a CPU in a multiprocessor dedicated to running the Lock Manager).
- Review the relationships between Lock Manager resources and whether this is creating contention needlessly
- Only then consider whether one should use low level spin-lock mechanisms

The above sequence also corresponds to an approximation of the cost and risks associated with each set of measures. Tuning is lowest risk and least expensive; a full re-structuring of the code and debugging of spin lock mechanisms can be expensive and very demanding.

- Bob Gezelter, http://www.rlgsc.com

John McL · ‎05-05-2009

John, Hoff and Bob, thanks for your comments and I'll certainly reflect on them.

Maybe if I tell you a little more about the situation you'll better understand where I'm coming from and why I'm considering bitlocks to replace some, but certainly not all, of our locking.

We have a 4-node Alpha cluster whose CPUs all run at about 100% for about 6 hours during evening batch processing. The ENQ/DEQ rate for much of this time seems to average about 50,000 per second and MPSYNCH on one processor (or maybe one node, can't recall right now) is around 90%.

There are instances where the locking is trivial - e.g. to assign space in a table in a global section (obviously on one node) - so I'm investigating whether situations like this would be better as bitlocks rather than bouncing lock information around the cluster.

One issue not mentioned yet is the release of locks should processes die, but that's something we can handle through our process monitoring tools that identify dead processes and release resources.

Yes, John G, I was planning on having this in functions that other code calls rather than scattered across and/or duplicated in several images. That's the only sensible way to do it both for maintenance and tweaking internal monitoring code.

John McL · ‎05-05-2009

I forgot to add ... re-architecting the application is not possible in the short term because it's just too big. I also doubt if the cost/effort of the change could be justified in the long-term. I'm looking to just plug in a new faster module and then switch certain components to use this module rather than the old one.

John Gillings · ‎05-05-2009

John,

If you're already getting high levels of MPSYNCH using locks, chances are it would only get WORSE with spin locks covering wider critical regions. Why? At the moment, a process that is waiting for a lock request is not consuming CPU. The MPSYNCH you see is the time spent spinning on OpenVMS spinlocks, waiting to access the lock structures. Spinning for the whole duration of the lock request will be much worse.

Look at the granularity of the locks, and try to subdivide the objects of contention. Reduce MPSYNCH and increase parallelism.

Another thing to consider... if these are all batch jobs, what would happen if you ran them sequentially? That might eliminate the contention altogether. It's entirely possible you will complete the sequence faster than running them all in parallel.

Hoff · ‎05-05-2009

Fire up DECset PCA or analogous, and see where the applications are actually spending more of the time. Find your critical (slowest) code paths.

Only knowingly replicate "dumb"; don't blindly do so.

And don't blindly replicate an older application design.

There have been cases I've worked where it was far faster to load the whole data store into memory and run with it; disks and files are a convenience for restricted virtual and physical memory, after all. Stuff was designed prior to 64-bit addressing, and when a couple of gigabytes was Big Physical Memory.

Ensure you've properly segmented your cluster and your host-local locks, too. If your global sections are host-local, then embed the host name or such into the lock resource name. Keep the locks and lock trees local.
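[Editor's sketch] Building a host-specific resource name could look something like this. The helper name `make_local_resnam` and the `PREFIX_NODE` layout are assumptions for illustration; on a real system the node name would come from the system (for instance via $GETSYI), which is not shown here. The 31-character cap reflects the resource name limit for $ENQ.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define RESNAM_MAX 31   /* maximum length of a lock manager resource name */

/* Writes "<prefix>_<node>" into out. Returns the name length, or -1 if
   the node name is empty or the result would not fit. */
int make_local_resnam(char *out, size_t outsz,
                      const char *prefix, const char *node)
{
    assert(out != NULL && prefix != NULL && node != NULL);
    if (strlen(node) == 0)
        return -1;
    int n = snprintf(out, outsz, "%s_%s", prefix, node);
    if (n < 0 || (size_t)n >= outsz || n > RESNAM_MAX)
        return -1;
    return n;
}
```

Processes on NODEA and NODEB then lock `APP_TABLE_NODEA` and `APP_TABLE_NODEB` respectively, so neither node waits on the other's host-local global section.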

I'd look to spend time increasing the scope of what is locked or reducing the critical path code (once that is known); looking to tweak the current model. Before I started a locking rewrite.

Then look to get rid of the allocation of space if you can. Or reduce the number of times the application needs to go after it. This could be using sharded or cached allocation of storage, or going to interlocked queues and lookaside lists of allocated or deallocated, or going after bigger hunks.

There are tools around beyond PCA, such as the LCK extension in SDA, and DECamds/AvailMan that can be useful, too.

And do the due diligence involved with tuning; look for overloaded disk spindles and such.

John McL · ‎05-05-2009

How I miss good telephone conversations with knowledgeable Digital TSC people!

The trivial instance that I mentioned involves looking through a table of 1000 entries. Since space is assigned once to each process the potential for contention is minor until the system is heavily loaded, but Lock Manager always sends its information around the cluster.

Modifying the granularity sounds useful but there could be significant work in modifying the code and in testing. If we merely split something into smaller portions it might be necessary to lock and access multiple portions until the desired item is found.

I'm already considering our options for reducing that batch processing load and trying to identify the costs and benefits of each.

One point you've not commented on is whether 50,000 ENQ/DEQ operations per second is high, normal or low when running a whole heap of batch jobs.

John Gillings · ‎05-05-2009

>50,000 ENQ/DEQ operations per second is
>high, normal or low when running a whole
>heap of batch jobs.

On a VAX it might be an issue, but on an Itanium it's no big deal, especially if the locking activity is local. See MONITOR DLOCK.

On the other hand, are they really ENQ/DEQ? If a single process deals with the same resource multiple times, you might consider an ENQ NL at the start, then use lock conversions to synchronize. DEQ when you've completely finished with the resource.

>but Lock Manager always sends its
>information around the cluster.

Not true. You need to check the resource name cluster wide, but once you have a lock on a locally mastered resource, there is no further external activity.

Keeping all the interested processes on a single node should keep the resource local. If the resource is a global section, then that should already be true.

John McL · ‎05-05-2009

Hoff, John

I have to say that I like these forums for the input from multiple sources and the speed of response but they're not great from a "business security" aspect and I don't want to go into too many specifics about the applications.

This consideration of locking is just one of several avenues that I'm looking at regards performance. We'll go from Alpha to IA64 one day but going there with highly efficient code makes good sense.

Hoff, the application and its data files are just too big to load into memory. We did consider installing images /RESIDENT but that comes with its own set of problems (e.g. replacing them, memory fragmentation).

You both talk of ensuring that a resource name is recognised as unique to a node when we want the locking to really only apply locally. Do I understand you to mean that all the activity for the lock for the resource in question is seen by Lock Manager to be only on one node and therefore that the sending of data between nodes for this lock is very minor?

This could be something worth checking and if code modifications are required they should be minor.

Hein van den Heuvel · ‎05-05-2009

John,

It sounds like the lock activity you intend to target is just a fraction of the real lock volume.

50,000 locks per second is piddly... for a GS1280 with GHz CPUs. A 300 MHz 4-CPU ES40, on the other hand, might be stressed by it.

What is the box? How many CPUs/How fast?
Are you using a dedicated lock manager already?

What OpenVMS version?

Did you check with ANALYZE/SYSTEM and SYS$EXAMPLES:SPL.COM where MPSYNCH time might be burned for real, rather than speculating?

What are those batch jobs doing? RMS? Indexed files or scanning through sequential files? Oracle? Do they compete for shared resources?

IMHO you need to dive a lot deeper into figuring out what the problem is before contemplating a solution.
Unless you know a lot more than you are letting on so far, this LIB$BBSSI stuff sounds like a fun but random approach. So far it sounds much like a suggestion to put 2 platinum-tipped spark plugs in a car to make it run faster... without knowing how many cylinders there are. But hey, platinum is some mighty fine raw material. That oughta help!

Grins,
Hein van den Heuvel ( at gmail dot com )
HvdH Performance Consulting

John McL · ‎05-05-2009

Thanks Hein, I'll come back to your comments later. For now though, a question or two to John and Hoff.

What determines whether a lock is seen as node-specific or cluster-wide? The flag LCK$M_SYSTEM looks more like a SOGW type of thing.

If I have the same lock name being used in code on all nodes will the lock name be regarded as cluster-wide even though my application in each case is really node-specific?

John Gillings · ‎05-05-2009

John,

>What determines whether a lock is seen as
>node-specific or cluster-wide?

Make sure you understand the difference between the LOCK and the RESOURCE. It all comes down to naming and usage.

Resources are inherently, and always cluster wide. When you request a lock against a resource, the first step is to find which node is mastering the resource. That's a directory lookup. If it's not local, or new, a request always goes out to the cluster. If it's a new resource a master is decided, usually the local node. A locally mastered resource gets a local lock. A remote resource gets a local lock, and a "proxy" lock on the remote node.

If all the interest in a particular resource is on a single node, then all the locking activity will be local.

>If I have the same lock name being used in
>code on all nodes will the lock name be
>regarded as cluster-wide even though my
>application in each case is really node-
>specific?

Yes, but it's a RESOURCE name. Locks don't have names.

I think Hoff was suggesting you add the local node name to the resource name to make sure you don't have multiple nodes declaring the same name for what are actually different resources (ie: local global sections). Having an apparently single resource would "work", but you'd be unnecessarily single threading access to all global sections across all nodes in the cluster.

If you're concerned about inter-node lock traffic within the cluster, simply starting all interested applications on one node is a good start toward eliminating it.

Worst case is establishing the resource on one node, then doing all the accesses from another. Usually lock tree migration will move locks to the node with the most activity, but that can cause trouble if the activity moves around (less of an issue in the latest versions of OpenVMS, there's been a lot of work to make sure migrations behave well).

Jonathan Cronin · ‎05-06-2009

With regard to Hoff's suggestion to include the host in the resource name, you should keep in mind that you can also use parent locks to create "namespaces" to avoid collisions on resource names.

Volker Halle · ‎05-06-2009

John,

regarding lock activity and resource trees, the LCK SDA extension may provide some useful insight:

$ ANAL/SYS
SDA> LCK SHOW ACTIVE

Volker.

Hoff · ‎05-06-2009

Here's some reading on locking and lock trees and lock resource names and lock states:

http://labs.hoffmanlabs.com/node/492

And as for contact with a support organization as a sounding board, there are still folks around that know this sort of stuff and that are in the business of fielding these sorts of calls. HP very likely still offers this service, as do other entities.

Jon Pinkley · ‎05-06-2009

John McL,

This question is somewhat like asking if it is better to use RMS indexed files or SYS$IO_PERFORM. It depends.

Now to your specific questions:

(a) What are the overheads?

The overhead of a call to SYS$ENQ will under almost all circumstances be higher than a call to either LIB$BBSSI or LIB$BBCCI (assuming the memory being modified is isolated from other use). But comparing the two is like comparing a routine to update an account balance with an add instruction. The point being that SYS$ENQ does much more than LIB$BBSSI. LIB$BBxxI is a very low level operation compared to what SYS$ENQ does.

(b) What's the performance gain over normal Locks (quantified, not subjective e.g. about 100 x faster)?

In my opinion, this is an unanswerable question, since LIB$BBxxI isn't a locking solution by itself, and we have no idea how you intend to implement your locking protocol. In fact you have told us very little about the real requirements. Specifically, what do you plan to do if the "acquisition of the lock bit" was not successful? Yes, you can just keep trying, but that doesn't scale when there is high contention. And if you want any semblance of FIFO, you will need to implement some sort of queuing. Now your "simple" locking starts becoming not so simple.
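[Editor's sketch] To show how quickly "simple" grows, here is about the smallest FIFO scheme there is: a ticket lock. This is a swapped-in textbook technique using C11 atomics, not something LIB$BBSSI provides, and the names (`ticket_lock`, `ticket_take`, and so on) are invented. It still busy-waits, so it fixes ordering but not the CPU burn.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    atomic_uint next_ticket;   /* next number to hand out */
    atomic_uint now_serving;   /* ticket currently allowed in */
} ticket_lock;

void ticket_init(ticket_lock *t)
{
    assert(t != NULL);
    atomic_init(&t->next_ticket, 0);
    atomic_init(&t->now_serving, 0);
}

/* Take a place in line; tickets are handed out in arrival order. */
unsigned ticket_take(ticket_lock *t)
{
    return atomic_fetch_add_explicit(&t->next_ticket, 1, memory_order_relaxed);
}

/* True once every earlier ticket holder has released. */
bool ticket_is_my_turn(ticket_lock *t, unsigned my_ticket)
{
    return atomic_load_explicit(&t->now_serving, memory_order_acquire) == my_ticket;
}

/* Blocking acquire: take a ticket, then spin until it is served. */
void ticket_acquire(ticket_lock *t)
{
    unsigned my_ticket = ticket_take(t);
    while (!ticket_is_my_turn(t, my_ticket))
        ;   /* spin */
}

/* Admit the next ticket holder. */
void ticket_release(ticket_lock *t)
{
    atomic_fetch_add_explicit(&t->now_serving, 1, memory_order_release);
}
```

Note what is still missing compared with the lock manager: no blocking wait, no lock modes, no deadlock detection, and no cleanup if the holder dies before calling `ticket_release`.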

(c) Any gotchas I should be aware of?

Others have covered this. The biggest gotcha is that implementing a locking protocol that will work under varying conditions appears to be easy until you do it. Then you will start to rediscover all the potential problems.

>>>-------------------------------------------------------------------
There are instances where the locking is trivial - e.g. to assign space in a table in a global section (obviously on one node) - so I'm investigating whether situations like this would be better as bitlocks rather than bouncing lock information around the cluster.
...
The trivial instance that I mentioned involves looking through a table of 1000 entries. Since space is assigned once to each process the potential for contention is minor until the system is heavily loaded, but Lock Manager always sends its information around the cluster.
<<<-------------------------------------------------------------------

Are you really worried about optimizing something that is happening once per process? The cost of the initial "expensive" $ENQ that specifies the resource name is trivial compared to the cost of creating a process.

If space is assigned once to each process, and it is a fixed size, then why not just forget locking and use the process index as an index into the global section. You are guaranteed that two processes won't have the same process index, and you could write the IPID into the process specific portion so you could detect if the process that wrote the stuff in the entry was still around.
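[Editor's sketch] The lock-free layout suggested above might look like this. It is a sketch under assumptions: the table is an ordinary array standing in for the global section, the process index and process id are passed in as plain integers (on VMS they would come from the system, not shown), and the names `slot_t`, `claim_slot` and `slot_owned_by` are hypothetical. The table has to be at least as large as the highest possible process index for this to work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SLOTS 1000   /* assumed: matches the 1000-entry table mentioned */

typedef struct {
    uint32_t owner_id;   /* id of the process that wrote this slot; 0 = unused */
    char     data[60];   /* per-process payload (size is illustrative) */
} slot_t;

/* Each process writes only the slot selected by its own process index, so
   no two live processes ever touch the same entry and no lock is needed. */
slot_t *claim_slot(slot_t *table, size_t nslots,
                   uint32_t proc_index, uint32_t proc_id)
{
    assert(table != NULL);
    if (proc_index >= nslots)
        return NULL;             /* table too small for this index */
    table[proc_index].owner_id = proc_id;
    return &table[proc_index];
}

/* Lets a monitor check whether a slot's contents belong to a given process
   or are left over from an earlier process that reused the same index. */
int slot_owned_by(const slot_t *table, size_t nslots,
                  uint32_t proc_index, uint32_t proc_id)
{
    assert(table != NULL);
    return proc_index < nslots && table[proc_index].owner_id == proc_id;
}
```

The stored id is what makes stale entries detectable: a new process that inherits an index simply overwrites the slot with its own id.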

I have to agree with John Gillings, Hoff, Hein and others. Make sure that locking is really the problem before deciding that getting rid of $ENQ will solve your problems.

If you have a copy of the VAX/VMS 5.2 IDSM, perhaps you have a copy of "VAXCluster Principles" by Roy G. Davis. If so, read chapter 6 on the Distributed Lock Manager. That is one of the best descriptions of the Lock Manager I am aware of. It's a bit dated, but the principles of the lock manager are still basically the same as described there or in the 5.2 IDSM. There have been enhancements to functionality and optimizations in cluster messaging, lock remastering, etc. with newer versions, but the basic steps are the same.

Jon

it depends

John McL · ‎05-06-2009

(If this appears multiple times please don't blame me. Each time I hit "submit" the webpage went blank. I've waited a while between re-posts but perhaps they are queueing up somewhere.)

What started out as an idea to move some trivial (short-term) locking to bit-locks has grown into something else.

I've checked the lock in question and for historical reasons it was clusterwide, but never got changed when there was no longer a need to be. I was correct in believing that the lock was causing inter-node traffic but I hadn't considered that processes on other machines might be getting blocked because their own local copies of the data structure in question were getting locked when they didn't need to be.

We'll now modify it to refer to a node-specific resource name and should see less inter-node traffic, lower MPSYNCH and processes on other nodes continuing when they would otherwise be blocked.

Just in case anyone is still considering bit-locking, what happens if the locking process aborts? How do you clear the lock? The Lock Manager will sort that out for you.

Hein and Jon, I don't think there's much point in responding to your questions (but don't worry, I'll still award points).

Bearing in mind that these forums should be a resource for anyone seeking solutions I'll keep this open for about 24 hours in case anyone wants to add anything to it.

Volker Halle · ‎05-06-2009

John,

> Just in case anyone is still considering bit-locking, what happens if the locking process aborts? How do you clear the lock?

Just an analogy: if you consider OpenVMS SPINLOCKS, the system will crash with CPUSPINWAIT, if some component locks a spinlock and goes away without unlocking it.

So you also need to design timeouts and error handling into this mechanism.
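[Editor's sketch] One way to leave room for both a timeout and dead-owner cleanup is to store the holder's id in the lock word instead of a single bit, and to bound the spin. This is a minimal sketch with C11 compare-and-swap, not LIB$BBSSI; the names (`owned_lock`, `owned_lock_break`, and so on) are invented, and deciding that an owner is really dead is left to an external monitor such as the one John McL describes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LOCK_FREE_ID 0u   /* assumed: 0 is never a valid process id */

typedef struct {
    _Atomic uint32_t owner;   /* LOCK_FREE_ID, or the id of the holder */
} owned_lock;

void owned_lock_init(owned_lock *l)
{
    assert(l != NULL);
    atomic_init(&l->owner, LOCK_FREE_ID);
}

/* Bounded spin: gives up after max_spins attempts instead of hanging
   forever behind a holder that has gone away. */
bool owned_lock_acquire(owned_lock *l, uint32_t my_id, unsigned long max_spins)
{
    assert(my_id != LOCK_FREE_ID);
    for (unsigned long i = 0; i < max_spins; i++) {
        uint32_t expected = LOCK_FREE_ID;
        if (atomic_compare_exchange_strong_explicit(
                &l->owner, &expected, my_id,
                memory_order_acquire, memory_order_relaxed))
            return true;
    }
    return false;   /* caller decides: retry, report, or ask the monitor */
}

/* Frees the lock only if `holder_id` is the recorded owner. */
static bool clear_if_owner(owned_lock *l, uint32_t holder_id)
{
    uint32_t expected = holder_id;
    return atomic_compare_exchange_strong_explicit(
        &l->owner, &expected, LOCK_FREE_ID,
        memory_order_release, memory_order_relaxed);
}

bool owned_lock_release(owned_lock *l, uint32_t my_id)
{
    return clear_if_owner(l, my_id);
}

/* For a monitor that has already established that dead_id no longer exists. */
bool owned_lock_break(owned_lock *l, uint32_t dead_id)
{
    return clear_if_owner(l, dead_id);
}
```

Breaking the lock only releases it; whatever the dead process left half-updated in the protected structure still has to be repaired by the application.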

Volker.

Hein van den Heuvel · ‎05-06-2009

>> Hein and Jon, I don't think there's much point in responding to your questions

Correct. If you only want answers to the questions you raised in the base topic, then there is absolutely no reason to try to answer our questions.

What is the problem you are trying to solve?
a) learn all about LIB$BBSSI
b) make the system better.

If the answer is b) then I believe our questions are rather pertinent. You may very well be correct that the low level interlocks are the ultimate bottleneck. However, you have provided not a single morsel of pertinent data to support that notion.
It may be very obvious to you, but we don't have much to go on, trying to help you with 'the real problem'.

So kindly humor me and give me no points, but 2 lines of answer as to
- whether the real problem is with the large number of concurrent users as per the original problem description, or those 6 hours of batch jobs as per later.
- an indication of what those users/jobs might be doing. More towards real-time, in-memory cell phone message routing and accounting, or perhaps more towards RMS/Oracle based OLTP-ish or reporting jobs?
In the first case I have good hope for your approach. In the latter case, I have no hope you will make a measurable difference.

Kindly explain a little more about:
"We have a 4-node Alpha cluster that during evening processing of batch jobs has its CPU's all running at about 100% for about 6 hours while processing batch jobs. The ENQ/DEQ rate for much of this time seems to average about 50,000 per second and MPSYNCH on one processor (or maybe one node, can't recall right now) is around 90%."

You indicate near 100% CPU use. What MODE?
All USER? EXEC? KERNEL?

Cheers,
Hein.

Keith Parris · ‎05-07-2009

First thing I'd do is see where the lock requests are coming from, to see if it's even your own locks that are generating the high rates.

SDA> LCK SHOW ACTIVE will list the most-active lock trees in descending order by lock rate, and you can gain clues as to what a lock is used for by its resource name. (VMS lock usage is described in an appendix of the Internals & Data Structures book.)

Christian Moser produced LCK SHOW ACTIVE after seeing the output from my LOCK_ACTV*.COM tool, available from the V6 Freeware CD directory [KP_LOCKTOOLS]. I prefer my tool, as it provides the same info but on a cluster-wide instead of per-node basis.

I've seen drastic (order of magnitude) reductions in RMS lock rates by adding RMS global buffers.

More info on locking and RMS global buffers may be found at http://www2.openvms.org/kparris/

Robert Gezelter · ‎05-07-2009

John,

Please pull back several steps to do the science. As John, Hoff, Hein, Jon, Jonathan, and I have noted there are a variety of potential causes for a high MPSYNCH, not to mention potentially excessive locking.

For example, in some releases of TCP/IP Services for OpenVMS, there were contention issues relating to IOLOCK8. The answer was to upgrade TCP/IP, not retool the applications.

If I may, the way to present research into this is that tuning, patching, and upgrading (hopefully covering all of my bases) are far less risky steps than major restructuring of the application.

Indeed, if the problem is fixable outside of the application, there is a very good chance that the problem will remain AFTER the application is re-tooled.

Theoretically, the problem could have many other potential causes.

My recommendation is to do the research into the performance data, and/or get assistance from someone who has the expertise to dissect the various potential causes and guide a plan of attack [Disclosure: My firm does provide services in this area, as do Hein, Hoff, and others].

In some cases I have dealt with, the correction for long running time has been depressingly mundane, and could be accomplished with orders of magnitude less work. If nothing else, eliminating the simply resolvable causes eliminates potential political repercussions.

- Bob Gezelter, http://www.rlgsc.com

John McL · ‎05-07-2009

Let me put this in perspective for you. I've come into an environment with a major application that's about 15 years old and has grown to over 3.5 million lines of Cobol and about 1 million lines of C. There's over 6000 source code modules (or include files) and over 650 executables. We have a large customer base, operate 24 * 7, and run a huge number of reporting jobs overnight in batch, in fact I think we drove the push to 7-digit entry numbers in the queue manager.

I'm looking at improving things across the board. Sorting out what looked like trivial locking (the most used of our own locks) is picking low-hanging fruit compared to the kinds of changes, perhaps even architectural, that are probably needed to (a) improve I/O throughput, (b) reduce the number of batch processes and (c) reduce the number of image activations. How much of (a), (b) and (c) we will do is still an open question.

The volume of files written during report generation involves a lot of file creation (a quick check of some stats suggests 100,000 to the busiest disk), which of course brings its own set of locking issues within RMS and nasties with .DIR sizes. We're already starting to deal with that.

Our code is quite modular and some changes will be straightforward. (I was surprised to find that the lock I've referred to is specified in 6 places rather than just one.) Changes to write reports to other disks or in more efficient ways might not be so simple. We'll be looking at ways of improving efficiency, preferably using methods that cause minimum modification to existing code because it's not just the code changes per se that need to be done but a huge amount of subsequent testing.

There seems to be a perception among respondents to my question that any changes can be made quickly and easily. That's rarely the case with big software applications and large companies with at least some in-house software development, and it's certainly not the case here.

Be assured that other performance issues are being investigated but as I said, this locking issue is low-hanging fruit, and as the discussion in this thread has revealed, some improvements can be made.

Jon Pinkley · ‎05-07-2009

Reading John McL's response (May 7, 2009 04:00:45 GMT) it appears that he has found at least one cause for lock contention and excessive internode lock traffic. Specifically the use of a single lock resource to protect node specific resources. He now plans to have a lock resource name for each node instead of a single global resource name.

A resource name per node should help for several reasons:

It allows the resource tree to be mastered on the node that has the object being protected. After the resource block is created, and as long as there is at least a single lock on the resource, subsequent locks on that resource that are requested from that node will be handled locally, with no internode messages needed, not even a directory lookup to determine what node is mastering the resource. For this reason, it would be a good idea to have a process on the node take a NULL lock on the resource and hibernate forever. This will ensure that the resource (RSB) doesn't get deleted, and will eliminate the internode traffic associated with the resource name, assuming all lock requests for the resource name are originating on the local node.

It will increase the granularity of locking (finer granularity), thus reduce contention for the resource names. In the global section example, there is no need to block access to a global section on nodeB when nodeA's global section is being modified. And this should be a relatively simple change, just adding the node name to the resource name.

It will eliminate activity based lock remastering of the resource trees, since all access will be local on each tree.

I would expect a big effect from that single change if internode locking of the resource was the cause of the problem. The only additional thing I would add is to have a process that keeps a NULL lock on the local resource on each node. That could be done with a "system" lock, but doing that requires privileged code and it is not supported. The cost of a dormant detached process is so small, that is what I would recommend, e.g. the same initialization routine that creates the global section should start a detached process running a program that takes a NULL lock on the resource name related to the global section and then a repeat forever hiber(). The loop is to protect against spurious wakes sent to the hibernating process.

In addition to the chapter 6 of the "VAXCluster Principles" book by Roy G. Davis, another good resource is Digital Technical Journal Number 5, September 1987 pages 29-44 "The VAX/VMS Distributed Lock Manager". This is an article describing the goals and the design of the DLM, and how it was designed to scale in a VMS Cluster. Unfortunately, I don't think this ever existed in postscript or pdf, although there may be scanned copies. It is one of the few "keepers" I have.

If you still have high MPSYNCH, and you determine it is lock manager related, and you have a large number of CPUs per node, then you should consider using a dedicated lock manager, as that is exactly the problem it was designed to address.

Jon
