Re: Using LIB$BBSSI and BBCCI for locking

Hein van den Heuvel · ‎05-06-2009

>> Hein and Jon, I don't think there's much point in responding to your questions

Correct. If you only want answers to the questions you raised in the base topic, then there is absolutely no reason to try to answer our questions.

What is the problem you are trying to solve?
a) learn all about LIB$BBSSI
b) make the system better.

If the answer is b) then I believe our questions are rather pertinent. You may very well be correct that the low level interlocks are the ultimate bottleneck. However, you have provided not a single morsel of pertinent data to support that notion.
It may be very obvious to you, but we don't have much to go on, trying to help you with 'the real problem'.

So kindly humor me and give me no points, but 2 lines of answer as to
- whether the real problem is with the large number of concurrent users, as per the original problem description, or with those 6-hour batch jobs, as per the later posts.
- an indication of what those users/jobs might be doing. More towards real-time, in-memory cell phone message routing and accounting, or perhaps more towards RMS/Oracle based OLTP-ish or reporting jobs?
In the first case I have good hope for your approach. In the latter case, I have no hope you will make a measurable difference.

Kindly explain a little more about:
"We have a 4-node Alpha cluster that during evening processing of batch jobs has its CPU's all running at about 100% for about 6 hours while processing batch jobs. The ENQ/DEQ rate for much of this time seems to average about 50,000 per second and MPSYNCH on one processor (or maybe one node, can't recall right now) is around 90%."

You indicate near 100% CPU use. What MODE?
All USER? EXEC? KERNEL?

Cheers,
Hein.

Keith Parris · ‎05-07-2009

First thing I'd do is see where the lock requests are coming from, to see if it's even your own locks that are generating the high rates.

SDA> LCK SHOW ACTIVE will list the most-active lock trees in descending order by lock rate, and you can gain clues as to what a lock is used for by its resource name. (VMS lock usage is described in an appendix of the Internals & Data Structures book.)

Christian Moser produced LCK SHOW ACTIVE after seeing the output from my LOCK_ACTV*.COM tool, available from the V6 Freeware CD directory [KP_LOCKTOOLS]. I prefer my tool, as it provides the same info but on a cluster-wide instead of per-node basis.

I've seen drastic (order of magnitude) reductions in RMS lock rates by adding RMS global buffers.

More info on locking and RMS global buffers may be found at http://www2.openvms.org/kparris/

Robert Gezelter · ‎05-07-2009

John,

Please pull back several steps to do the science. As John, Hoff, Hein, Jon, Jonathan, and I have noted there are a variety of potential causes for a high MPSYNCH, not to mention potentially excessive locking.

For example, in some releases of OpenVMS TCP/IP, there were contention issues relating to IOLOCK8. The answer was to upgrade TCP/IP, not retool the applications.

If I may, the way to present research into this is that tuning, patching, and upgrading (hopefully covering all of my bases) are far less risky steps than major restructuring of the application.

Indeed, if the problem is fixable outside of the application, there is a very good chance that the problem will remain AFTER the application is re-tooled.

Theoretically, the problem could have many other potential causes.

My recommendation is to do the research into the performance data, and/or get assistance from someone who has the expertise to dissect the various potential causes and guide a plan of attack. [Disclosure: My firm does provide services in this area, as do Hein, Hoff, and others.]

In some cases I have dealt with, the correction for long running time has been depressingly mundane, and could be accomplished with orders of magnitude less work. If nothing else, eliminating the simply resolvable causes eliminates potential political repercussions.

- Bob Gezelter, http://www.rlgsc.com

John McL · ‎05-07-2009

Let me put this in perspective for you. I've come into an environment with a major application that's about 15 years old and has grown to over 3.5 million lines of Cobol and about 1 million lines of C. There's over 6000 source code modules (or include files) and over 650 executables. We have a large customer base, operate 24 * 7, and run a huge number of reporting jobs overnight in batch, in fact I think we drove the push to 7-digit entry numbers in the queue manager.

I'm looking at improving things across the board. Sorting out what looked like trivial locking (the most used of our own locks) is picking low-hanging fruit compared to the kinds of changes, perhaps even architectural, that are probably needed to (a) improve I/O throughput, (b) reduce the number of batch processes and (c) reduce the number of image activations. How much of (a), (b) and (c) we will do is still an open question.

The volume of files written during report generation involves a lot of file creation (a quick check of some stats suggests 100,000 to the busiest disk), which of course brings its own set of locking issues within RMS and nasties with .DIR sizes. We're already starting to deal with that.

Our code is quite modular and some changes will be straightforward. (I was surprised to find that the lock I've referred to is specified in 6 places rather than just one.) Changes to write reports to other disks or in more efficient ways might not be so simple. We'll be looking at ways of improving efficiency, preferably using methods that cause minimum modification to existing code, because it's not just the code changes per se that need to be done but a huge amount of subsequent testing.

There seems to be a perception among respondents to my question that any changes can be made quickly and easily. That's rarely the case with big software applications and large companies with at least some in-house software development, and it's certainly not the case here.

Be assured that other performance issues are being investigated but as I said, this locking issue is low-hanging fruit, and as the discussion in this thread has revealed, some improvements can be made.

Jon Pinkley · ‎05-07-2009

Reading John McL's response (May 7, 2009 04:00:45 GMT), it appears that he has found at least one cause for lock contention and excessive internode lock traffic: specifically, the use of a single lock resource to protect node-specific resources. He now plans to have a lock resource name for each node instead of a single global resource name.

A resource name per node should help for several reasons:

It allows the resource tree to be mastered on the node that has the object being protected. After the resource block is created, and as long as there is at least a single lock on the resource, subsequent locks on that resource that are requested from that node will be handled locally, with no internode messages needed, not even a directory lookup to determine what node is mastering the resource. For this reason, it would be a good idea to have a process on the node take a NULL lock on the resource and hibernate forever. This will ensure that the resource (RSB) doesn't get deleted, and will eliminate the internode traffic associated with the resource name, assuming all lock requests for the resource name are originating on the local node.

It will increase the granularity of locking (finer granularity), thus reduce contention for the resource names. In the global section example, there is no need to block access to a global section on nodeB when nodeA's global section is being modified. And this should be a relatively simple change, just adding the node name to the resource name.

It will eliminate activity based lock remastering of the resource trees, since all access will be local on each tree.

I would expect a big effect from that single change if internode locking of the resource was the cause of the problem. The only additional thing I would add is to have a process that keeps a NULL lock on the local resource on each node. That could be done with a "system" lock, but doing that requires privileged code and it is not supported. The cost of a dormant detached process is so small, that is what I would recommend, e.g. the same initialization routine that creates the global section should start a detached process running a program that takes a NULL lock on the resource name related to the global section and then a repeat forever hiber(). The loop is to protect against spurious wakes sent to the hibernating process.
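The per-node naming change described above can be sketched as follows. This is only an illustration in Python, not code from the application under discussion: the helper name `node_resource_name`, the base name `APP_GBLSEC`, and the node names are all hypothetical. The one platform fact it relies on is that a resource name passed to $ENQ is limited to 31 bytes.

```python
MAX_RESNAM_LEN = 31  # $ENQ accepts resource names of 1 to 31 bytes


def node_resource_name(base: str, node: str) -> str:
    """Build a node-specific lock resource name such as APP_GBLSEC_NODEA.

    Both arguments are hypothetical examples; on a real system the node
    name would be obtained from the system itself (e.g. via $GETSYI).
    """
    name = f"{base}_{node.strip().upper()}"
    if len(name.encode("ascii")) > MAX_RESNAM_LEN:
        # Refuse rather than truncate: silent truncation could cut off the
        # node suffix and make two nodes share one resource name again.
        raise ValueError(f"resource name too long: {name!r}")
    return name


if __name__ == "__main__":
    for node in ("NODEA", "NODEB"):
        print(node_resource_name("APP_GBLSEC", node))
```

Each node would then queue its lock requests (and its placeholder NULL lock) against its own name, so an update to nodeA's global section never has to block, or send messages to, nodeB.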

In addition to the chapter 6 of the "VAXCluster Principles" book by Roy G. Davis, another good resource is Digital Technical Journal Number 5, September 1987 pages 29-44 "The VAX/VMS Distributed Lock Manager". This is an article describing the goals and the design of the DLM, and how it was designed to scale in a VMS Cluster. Unfortunately, I don't think this ever existed in postscript or pdf, although there may be scanned copies. It is one of the few "keepers" I have.

If you still have high MPSYNCH, and you determine it is lock manager related, and you have a large number of CPUs per node, then you should consider using a dedicated lock manager, as that is exactly the problem it was designed to address.

Jon

it depends

Robert Gezelter · ‎05-07-2009

John,

I actually meant the opposite (i.e., code changes would be more complex than anticipated).

It may well be mentioning the obvious, but the changes on the "straightforward" list often have significant impact at low risk and even lower effort.

I presume that disk shadowing is in use. The version of OpenVMS has not been mentioned (and I've lost track of my notes as to which version of OpenVMS Shadowing supports, or will support, adding a RAM disk as a member of a shadow set), but either a RAM disk member or an attached RAM disk (external zero-latency storage device) may help.

Also, are all images pre-installed? This can be a significant simplifier of activations.

On the RMS side, what are the default extensions of the files that are being created? Also, do the files need to be created in the same directory, or only be accessible via the same logical name? Directory scattering (such as is done by MX and others) is a useful trick to spread directory activity around.
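As a rough illustration of directory scattering, the sketch below maps each file name to one of a fixed number of subdirectories using a stable hash. It is written in Python purely for illustration; the function name, the `[REPORTS.Dnn]` layout, and the bucket count are assumptions, not the scheme MX or any particular application uses.

```python
import zlib


def scattered_directory(filename: str, buckets: int = 16) -> str:
    """Pick one of `buckets` subdirectories from a stable hash of the name.

    The [REPORTS.Dnn] layout is a hypothetical example, not a convention
    taken from MX or from the application discussed in this thread.
    """
    if buckets < 1:
        raise ValueError("buckets must be at least 1")
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    index = zlib.crc32(filename.upper().encode("utf-8")) % buckets
    return f"[REPORTS.D{index:02d}]"


if __name__ == "__main__":
    for name in ("REPORT_0001.LIS", "REPORT_0002.LIS", "REPORT_0003.LIS"):
        print(scattered_directory(name), name)
```

Because the mapping depends only on the file name, a reader can recompute the directory without a lookup, and a search-list logical name covering all the subdirectories would keep the files reachable under a single name.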

I am sure that some of the above have occurred to you. Certainly, one of the limitations of a public forum is the amount of information available to responders. Some of the comments are also intended to provide information for others who follow this thread, now and in the future.

- Bob Gezelter, http://www.rlgsc.com

John McL · ‎05-07-2009

Hi Bob,

We're on VMS 8.3, we use shadowing, we use RAM disk for some things, all images are installed (but not resident). I'll shortly be floating ideas about merging certain images that create reports or how batch processes might do more work rather than exist only briefly. We also need to review whether some of our major report generating programs can be made to run faster.

While I've been responding here I've also been making changes to the code we run at the end of each process so that it doesn't write thousands of individual files per day but composite files, one per node per hour.
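One way to picture the "one composite file per node per hour" scheme is a naming function that every process on a node agrees on, so that all end-of-process records written in the same hour are appended to the same file. The sketch below is an assumption-laden illustration in Python; the function name, the name format, and the `.LOG` type are hypothetical and not taken from the actual change.

```python
from datetime import datetime


def composite_file_name(node: str, when: datetime) -> str:
    """Name of the one composite file a node appends to during a given hour."""
    # Minutes and seconds are deliberately dropped, so every process that
    # finishes within the same hour on the same node gets the same name.
    return f"{node.upper()}_{when:%Y%m%d_%H}.LOG"


if __name__ == "__main__":
    print(composite_file_name("nodea", datetime(2009, 5, 7, 14, 30)))
```

This replaces thousands of file creations per day with at most 24 per node, at the cost of needing shared-append access to the composite file.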

There's a lot we might do and investigations are at an early stage. Our target is to reduce the overheads of process creation and image activation, and to improve I/O performance. Addressing the most heavily used of our own locks was just a small part of a much bigger picture.

Robert Gezelter · ‎05-08-2009

John,

As always, the context becomes clearer as more details emerge.

The devil is always in the details. As noted previously, public fora (e.g., ITRC) are very useful, but not a substitute for actual consulting expertise.

In terms of basic science, I would not presume that the MPSYNCH et al are the result of the application's internal locks. Even if they are locks on behalf of the application, they may not be THOSE locks.

Consider the benefits of INSTALLing more than just the file. Caution is advised, but the payoffs can be VERY large if an image is activated many, many times. Ditto for reorganizing application images so that commonly used sub-components are used from permanently loaded, shareable, resident libraries (this eliminates the work of relocating the user infrastructure on each initiation; it adds up).

"Thousands of files per day" is not a lot. My email gateway (a MicroVAX 3100/38) easily processes many more emails per day, creating three files per message without a problem. If the file creation is a problem, it may be a symptom, not a cause. One possibility that often gets overlooked is directory extension. Directories must be contiguous, and thus if they are expanding relatively rapidly, they cause significant overhead as they are re-allocated and migrated. They can be pre-allocated, which resolves that problem. There are also other RMS and file system optimizations that can be done.

Also consider whether some optimizations can be hurting. If a disk is in effect an output-only device, it may pay to disable caching, as there may be better uses for the memory (the file system caches and RMS buffering scheme will prevent many unneeded writes and reads).

I hope that the above is helpful.

- Bob Gezelter, http://www.rlgsc.com

Hoff · ‎05-08-2009

No one here is suggesting blindly making changes.

Clustering may well be causing you some problems; the sharing of disks and resources isn't a panacea. There are cases where partitioning can help here, whether with the locks and lock resource names, or with adding and segmenting disk spindles or file sharing.

Investigate the whole environment. Baseline it, too. This is a systemic process, and probably one underway.

If you think bitlocks are the way to go here, and you've already prepared the baseline (for comparison with the results) and the instrumentation, then go for it and make the changes.

Personally, I'd be looking at the whole environment, as part of figuring where I wanted to end up. Sometimes the (actual) low-hanging fruit in a big application can be quite surprising. Sometimes tossing faster boxes at the problem can be reasonable (the going price of a used AlphaServer ES45 box? Around US$500. An rx2600? Around US$300), while in other cases finding and removing a critical resource bottleneck can have a disproportionate payback.

That's a decent-sized source pool you're working with. Not the biggest around, but big enough to make this project and this review a rather larger project. And of the scale where tools such as DECset PCA and such can and will help; there's no DTrace and no Instruments around, though.

John McL · ‎05-10-2009

I am fed up with the attitude of people responding to this thread. I do not expect to have to write "War and Peace" to describe our environment, nor to tell more about how we use our systems than I would like to, when I ask a question to this forum.

Looking back I can't see that anyone bothered to ask whether this was part of a tuning exercise or whether locking was just being reviewed. Try it next time because it makes for a more civil discussion.

(In fairness I note that this is the first time I've experienced such haranguing from this forum. I do however hope it's the last.)

Points will be assigned according to whether you stuck to the subject or tried to push the thread into a different area, or demanded answers to questions that you didn't show were relevant.
