
Mark Corcoran
Frequent Advisor

Way to identify which process has a lock granted in EXclusive mode?

Hi, I'm trying to produce a workaround for a situation whereby a process "forgets" to unlock a log file that many other processes also use (other processes then hang whilst waiting to write to the log file - this relates to my other posting about WCBs).

Essentially, the workaround is to identify a process which has got the log file open, determine (using various criteria) that the file is not legitimately open, and $FORCEX the offending process.

I can PIPE the output of SHOW DEVICE /FILES, to search for the log file in question.

Although there aren't generally a large number of files open on the volume, this seemed to me to be an unnecessarily laborious and possibly I/O intensive method of finding the offending process.

This is because the process - along with (supposedly) all others - co-operates on access to the file by using a named lock resource, which they all (again, supposedly) $ENQW before attempting to open the file, write to it, close it, and then $DEQ the lock resource afterwards.

I had thought it would be a simple case of using SDA to do SHOW LOCK /NAME=resource_name (admittedly, it would still require the output to be post-processed to extract the PID).

However, it seems that some processes are using $DEQ to convert the lock to a NULL-mode lock.

Consequently, I can get any number of copies of the lock reported, before I actually find the one that says "Granted at EX" rather than "Granted at NL".

Whilst SDA allows a /STATUS on the SHOW LOCK command, all (or virtually all) of the lock copies have exactly the same bits set in the status.

What would be ideal is to specify /FLAGS= - this would allow me to specify /NOFLAGS=CONVERT (or /FLAGS=NOCONVERT if you prefer). Unfortunately, SDA doesn't appear to have this option. Does anyone know a suitable way of restricting the search criteria that's not obvious from HELP?


[I was loath to process the output of all known copies of locks with the same resource name, because I did notice a sizeable pause at some point in the output.

I'm guessing that this is because the SHOW LOCK /NAME= command is effectively the equivalent of having an SQL database, with an unindexed table, and doing SELECT * FROM table WHERE RESOURCENAME=lock_being_sought

...i.e. it has to traverse the entire table (or in this case, the VMS lock database) to find locks with matching resource names, and that the database is rather full of named lock resources.

[To be honest, I'm not even sure how to find out how many named resources there are in the lock database - SDA> SHOW LOCK /SUMM gives me plenty of figures, but the scant details in the SDA manual don't really help interpret it.

I'm not even sure in hindsight how I would determine the PID from the information that SHOW LOCK /NAME= returns - maybe I /am/ better off sticking with SHOW DEV /FILES?]]

Mark
Hein van den Heuvel
Honored Contributor

Re: Way to identify which process has a lock granted in EXclusive mode?

Just stick with SHOW DEV/FILES.
It is crude, brute force, but effective.

The only catch would be if the waiters are waiting for the application lock, not the file lock. Because then the file _might_ be closed, but the application lock still held.

Another not-too-hard-to-code solution could be calling SYS$GETLKI (get lucky) in a loop. Unfortunately it does not have any selection criteria. You'll have to look at each lock returned.

You may need to call GETLKI in KERNEL mode, if you need to learn about the FILE LOCK.

Also... do you need to worry about the lock being held on a different cluster member?

Hein.

Mark Corcoran
Frequent Advisor

Re: Way to identify which process has a lock granted in EXclusive mode?

>The only catch would be if the waiters are waiting for the application lock, not the file lock. Because then the file _might_ be closed, but the application lock still held.

As far as I've been made aware, the code should be using $ENQW everywhere, so it should be relying on the application lock, rather than a badly behaved app locking/unlocking the file without reference to the application lock.


>Also... do you need to worry about the lock being held on a different cluster member?
Apparently not.

I've never really had to look too deeply into locks before, so I would guess that cluster-wide locking is possible, but I'm not sure how this works.

Whilst things work, I don't need to get my hands dirty and increase my knowledge (don't have the time, for one thing).

Only when it goes wrong, and I'm determined to get to the root of the problem, do I start learning more than would otherwise be necessary.

Depending on what the developers have found about this potential bug, it might be possible to identify the circumstances under which it occurs, and hence make the workaround a lot simpler.

Fingers Xed!
Jess Goodman
Esteemed Contributor

Re: Way to identify which process has a lock granted in EXclusive mode?

Unless you are running on IA64 you could just use Edward Heinrich's FILES_INFO program, available at
http://www.tmk.com/ftp/vms-freeware/fileserv/files_info.zip

Just pass it the name of the log file, and you will get:

FILE: _$1$DGA101:[LOG]DUMMY.LOG;1
Total access count of 1, XQP access 1, writers 1, size 10/147

PID      USERNAME        READS   WRITES ACCESS CHARACTERISTICS
-------- ------------ -------- -------- ----------------------
77B54CC7 SYSTEM              0       12 Write, Sequential, NoWriteShr

You must run it on every node where the log file might be open.
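
If the report is captured to a file, pulling the PID out is straightforward. The following is only a rough sketch in Python, assuming the output looks exactly like the sample above (an 8-digit hex PID, username, read count, write count, then the access characteristics); the function name `parse_files_info` is made up for illustration and real output may differ in layout.

```python
import re

# One accessor row: hex PID, username, reads, writes, access characteristics.
# The pattern is an assumption based solely on the sample report shown above.
ROW = re.compile(r"^([0-9A-F]{8})\s+(\S+)\s+(\d+)\s+(\d+)\s+(.*\S)\s*$")

def parse_files_info(report):
    """Return one dict per process row found in a FILES_INFO-style report."""
    rows = []
    for line in report.splitlines():
        match = ROW.match(line.strip())
        if match:
            pid, user, reads, writes, access = match.groups()
            rows.append({
                "pid": pid,
                "username": user,
                "reads": int(reads),
                "writes": int(writes),
                "access": [item.strip() for item in access.split(",")],
            })
    return rows

SAMPLE = """FILE: _$1$DGA101:[LOG]DUMMY.LOG;1
Total access count of 1, XQP access 1, writers 1, size 10/147

PID USERNAME READS WRITES ACCESS CHARACTERISTICS
-------- ------------ -------- -------- ----------------------
77B54CC7 SYSTEM 0 12 Write, Sequential, NoWriteShr
"""

writers = [r["pid"] for r in parse_files_info(SAMPLE) if "Write" in r["access"]]
```

The header and ruler lines are skipped simply because they do not start with eight hex digits followed by a username and two numbers.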
I have one, but it's personal.
Hoff
Honored Contributor

Re: Way to identify which process has a lock granted in EXclusive mode?

DECamds (and its follow-on Availability Manager tool) have this mechanism, and it's intuitive and trivial to use.

Once you go through the somewhat unintuitive installation and configuration process, that is.
John Gillings
Honored Contributor

Re: Way to identify which process has a lock granted in EXclusive mode?

Mark,

I'm a bit confused. Is the blocking lock your application lock, or an RMS lock? If RMS is it a record lock or the whole file?

The application lock case should be fairly simple, just $ENQ yourself against the lock, then $GETLKI to find the blocking lock. For RMS, you may be able to do something similar with a ROP=WAT option, perhaps even from DCL with READ/WAIT?
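
To illustrate why the NL-mode copies are not the ones of interest, here is a small Python model of the filtering step. It is not VMS code: `LockRecord` is a hypothetical stand-in for whatever per-lock information $GETLKI returns, and the PIDs are invented. The compatibility table is the standard one for the six lock modes.

```python
from dataclasses import dataclass

# For each requested mode, the set of granted modes it can coexist with.
COMPATIBLE = {
    "NL": {"NL", "CR", "CW", "PR", "PW", "EX"},
    "CR": {"NL", "CR", "CW", "PR", "PW"},
    "CW": {"NL", "CR", "CW"},
    "PR": {"NL", "CR", "PR"},
    "PW": {"NL", "CR"},
    "EX": {"NL"},
}

@dataclass
class LockRecord:
    """Hypothetical record for one lock on the resource (not a real VMS structure)."""
    pid: str
    granted_mode: str

def blocking_locks(records, requested_mode="EX"):
    """Return the granted locks that are incompatible with the requested mode."""
    allowed = COMPATIBLE[requested_mode]
    return [r for r in records if r.granted_mode not in allowed]

# Invented data: two processes parked at NL, one still holding EX.
locks = [
    LockRecord("2020011A", "NL"),
    LockRecord("2020011B", "EX"),
    LockRecord("2020011C", "NL"),
]
holders = [r.pid for r in blocking_locks(locks, "EX")]
```

Any number of NL-mode locks drop out of the result, because NL is compatible with every mode; only the EX holder is left as a candidate for further checks.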

If you can go back a step to the application design and work on the locking mechanism, perhaps implement a blocking AST? Maybe some kind of timeout: if the BLAST fires after you've held the lock for "too long", just kill the process?
A crucible of informative mistakes
David B Sneddon
Honored Contributor

Re: Way to identify which process has a lock granted in EXclusive mode?

Mark,

If it is RMS file/record locks then use google
to find "DBS-SCANLOCKS" which I have used for a
while now to locate processes that are locking
files...

Dave
Mark Corcoran
Frequent Advisor

Re: Way to identify which process has a lock granted in EXclusive mode?

Thanks to the others for their suggestions regarding software which will help analyse this; unfortunately, Sarbanes-Oxley controls would initially prevent these being installed (and I need to go to another team to get approval on it anyway).

John, in answer to your question:

>I'm a bit confused. Is the blocking lock your application lock, or an RMS lock? If RMS is it a record lock or the whole file?

It's an application lock. Let's for argument's sake call the resource name MIRROR_ON_THE_WALL.

One process $ENQWs a lock request for exclusive mode for this resource, and (for whatever reason - maybe it is stuck looping around doing nothing, waiting for something that will never happen, bug in the code where there's no call to $DEQ, or there is a call to it but under some circumstances the logic path avoids this bit of code) never releases it.

Each process supposedly does a $ENQW for MIRROR_ON_THE_WALL, then when the lock is granted, calls LIB$GET_LUN, Fortran OPEN, Fortran WRITE, Fortran CLOSE, LIB$FREE_LUN and $DEQ.

I would presume there will be RMS locks associated with the Fortran OPEN, but the code in hanging processes doesn't (read: shouldn't) get that far because it is still waiting on the $ENQW for MIRROR_ON_THE_WALL.
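
The sequence described above (take the lock, open, write, close, release) can be modelled loosely in Python to show where a missed release comes from. This is only an analogy, assuming `threading.Lock` as a stand-in for the EX-mode application lock; it is not the Fortran handler and says nothing about how that code is actually written.

```python
import threading

# Stand-in for the named lock resource. threading.Lock is only an analogy
# for an EX-mode $ENQW / $DEQ pair; it is not a VMS API.
log_lock = threading.Lock()

def append_log(path, message):
    """Take the lock, append one line to the log, and always release the lock."""
    log_lock.acquire()                  # analogous to $ENQW at EX mode
    try:
        with open(path, "a") as log:    # OPEN ... CLOSE
            log.write(message + "\n")   # WRITE
    finally:
        log_lock.release()              # analogous to $DEQ, reached on every path
```

The point of the `try`/`finally` is that the release happens even if the open or write fails. A handler whose error path returns before the release would leave every other caller waiting, which matches the symptom described.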

[The developers have indicated that there is a generic function that does this (it is an error handler), although looking through the CMS libraries, there seems to be umpteen copies of the handler in different modules.

I'm not sure whether or not they are all still in use and behave exactly the same.

Therein lies the problem in copying useful code into different modules/projects, rather than sticking it in one place...]



>The application lock case should be fairly simple, just $ENQ yourself against the lock, then $GETLKI to find the blocking lock.

As I mentioned, I haven't previously had to look at locking at this kind of level, but I have been doing development on VMS for almost 20yrs, so this won't be a problem once I've read the details of these 2 particular system services (I have the manuals in hard and soft copy form, don't worry!).

In all honesty, any solution I implement as an automatic "workaround" will require it to go thru Sarbanes-Oxley audit controls, but developing it myself might be easier than the pain of getting another team to approve third party (to the company's point of view, rather than to that team) software first of all.


>For RMS, you may be able to do something similar with a ROP=WAT option, perhaps even from DCL with READ/WAIT?
I've not seen the /WAIT qualifier for READ before, and it's not listed in the help library on our system. Is this only available from a particular version?


>If you can go back a step to the application design and work on the locking mechanism, perhaps implement a blocking AST?

Alas, development of the application was outsourced a long time ago, and ££££ is payable for any change, which may take a long time to be delivered (I'm not sure that the developers in the outsource company are hardcore VMSers; probably quite adept at the various languages that the application is written in, but it's not a place one would tend to associate with VMS systems to have worked on for years before winning an outsourcing contract).

They have given more of an update on the bug that they think they've found; they have seen it cause "a problem", but not the same problem as we are seeing.

Thinking about it from their description (a new version of the log file is created, rather than the existing one being appended to), it sounds to me like some processes are actually not using the application lock at all, and thus:

a) create a new version of the file if the file is already locked open by an offending process
b) hang on getting access to the existing file if it is already locked open.


Mark
David B Sneddon
Honored Contributor

Re: Way to identify which process has a lock granted in EXclusive mode?

Mark,

In the DBS-SCANLOCKS package there is a program
called GETLKI, it is fairly small and there
would be very little involved if you were to
look at that code then "develop" your own
version...

Dave
Ian Miller.
Honored Contributor

Re: Way to identify which process has a lock granted in EXclusive mode?

For many reasons, getting Availability Manager/AMDS installed would be a good idea, so you should start the process to gain approval.

Download from
http://h71000.www7.hp.com/openvms/products/availman/index.html

or you will find it on the VMS CDs.
____________________
Purely Personal Opinion