Operating System - OpenVMS

Excessive number of processes in MUTEX

 
Lennart Breitholtz
Occasional Contributor

We are running an AlphaServer ES47 with 16 GB and 4 CPUs. It is connected via
the SAN to an EVA5000. Currently we are running OpenVMS 7.3-2 with all
applicable patches. No DECnet, just TCP/IP. We have an interactive login
limit of 820 users and a total of some 920 processes.

The main task of the machine is to run a billing application and a few
support systems. The billing application is connected to an Ingres DB
running on a Tru64 system via Ingres' client-server implementation.

Lately we have been experiencing performance problems. According to
MONITOR and AUTOGEN there is no resource shortage on the machine:
no queue for the CPU or the disks. The only "strange" behaviour we notice
is a high proportion of kernel mode, e.g. 110% kernel vs 64% user vs 144% idle.
Also, starting images sometimes takes quite a long time.

We notice that we have a lot of processes in MWAIT, more specifically either
owning the LNM$AQ_MUTEX or waiting for it. The LNM prefix seems to
indicate something to do with logical names, of which we have quite a lot:
a "show logical/tab=* *" shows us around 75 000 logical names.

Can anyone give us a hint what we should look for? Could the LNM mutex be
something to pursue?

Regards
Lennart Breitholtz
6 REPLIES
Karl Rohwedder
Honored Contributor

Re: Excessive number of processes in MUTEX

Lennart,

you may use the LNM extension to SDA to check which application translates which logical names, and how often.

$ ANA/SYS
SDA> LNM ! gives you some help

regards Kalle

P.S. You don't run a COBOL application, do you? (COB$5644 logical)
Lennart Breitholtz
Occasional Contributor

Re: Excessive number of processes in MUTEX

Hi Karl.

Thank you for replying.

I do have a COBOL application running
on the node. What is (COB$5644 logical)?

Regards
Lennart Breitholtz
Heuser-Hofmann
Frequent Advisor

Re: Excessive number of processes in MUTEX

Karl Rohwedder
Honored Contributor

Re: Excessive number of processes in MUTEX

There were some debug lines left over, so that every subroutine call leads to a redefinition of said logical. In the meantime some COBRTL patches have become available, which seem to address this problem.
Before that, a call to HP could get you a new COBRTL.

regards Kalle
Hein van den Heuvel
Honored Contributor

Re: Excessive number of processes in MUTEX

The online (not live) version of that Bruden presentation is a little like yelling fire! It basically just says "If you are using COBOL then beware of COB$5644 and contact support".
Enough to get you worried, not enough details to take the worry away or substantiate it.

I'd say... don't worry if you did not notice a marked increase in kernel mode, or even MPsync, or in the extreme case, possibly the mutex wait reported here.

The stronger 'proof' that this is affecting you could come from T4 or even good old MONITOR IO, looking for the "Log Name Translation Rate". Ideally you have a historic reference for that; if not, you would like it to be a small multiple (10?) of the file open rate. (I know... there are reasons other than opens to use logicals, but it is not unlikely those go hand in hand, like image activation.)
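
As a rough illustration of that rule of thumb, here is a minimal Python sketch. It assumes you have already read the two rates off MONITOR IO or T4 yourself; the function names, the sample numbers, and the default multiple of 10 (taken from the "(10?)" guess above) are all just assumptions for the example, not anything OpenVMS provides.

```python
def translation_ratio(lnm_rate: float, open_rate: float) -> float:
    """Logical name translations per file open (both rates per second)."""
    if open_rate <= 0:
        # No opens at all: any translation activity is "infinitely" high.
        return float("inf") if lnm_rate > 0 else 0.0
    return lnm_rate / open_rate


def looks_excessive(lnm_rate: float, open_rate: float,
                    max_multiple: float = 10.0) -> bool:
    """True if translations exceed the assumed 'small multiple' of opens."""
    return translation_ratio(lnm_rate, open_rate) > max_multiple


if __name__ == "__main__":
    # Hypothetical sample readings, not taken from the system in this thread.
    for lnm, opens in [(150.0, 20.0), (5000.0, 20.0)]:
        print(lnm, opens, translation_ratio(lnm, opens),
              looks_excessive(lnm, opens))
```

A historic baseline from your own system is still the better yardstick; the fixed multiple is only a fallback when you have none.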

The ultimate tell-tale would be using..
$ANALYZE/SYST
SDA> lnm load
LNM$DEBUG load status = 00000001
SDA> lnm start trace
Tracing started...
SDA> lnm stop trace
Tracing stopped...
SDA> lnm show trace
Logical Name Trace Information:
:


Karl, I think the problem is excessive lookups, not redefinition, but still!

Slightly edited details from John R below.

Hope this helps someone, somewhere, somehow,
Hein van den Heuvel (at gmail dot com)
HvdH Performance Consulting

" The problem is due to a fix in the COB$CALLED RTL routine.
In many places a COBOL "routine" needs to know whether it was called or whether it was just entered by falling through from an earlier paragraph. The COB$CALLED RTL routine is a very small routine that returns a YES/NO for the "am I called?" question. It turns out that for V8.2 (Alpha and I64), a bug was fixed in COB$CALLED. While the fix was correct, the engineer at the time put in a getenv("COB$5644") call so he could disable his fix dynamically for some last-minute testing. The getenv() call was not removed before the final check-in. So this previously small routine, which can be called THOUSANDS of times in a COBOL application, is now doing a logical name translation each time (THOUSANDS of times for certain COBOL applications).

The ECOs are for Alpha V8.2 and V8.3; I64 V8.2, V8.2-1, and V8.3. I64 V8.3-1H1 contains the fix. Anything prior to V8.2 on Alpha does not have the bug. "
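
To make the mechanism concrete, here is a small Python model of the pattern described above: a tiny, frequently called routine that also does a name lookup on every call. This is not the COBRTL code. The class and function names are made up for the illustration, os.environ stands in for the logical name translation, and what the COB$5644 switch actually changed is not described here, so the sketch simply ignores the lookup result.

```python
import os


class CountingEnv:
    """Wraps os.environ.get and counts how often it is called."""

    def __init__(self):
        self.lookups = 0

    def get(self, name):
        self.lookups += 1
        return os.environ.get(name)


def called_with_leftover_lookup(env, was_called):
    # Hypothetical stand-in for the affected routine: one lookup per call.
    env.get("COB$5644")  # result ignored in this sketch
    return was_called


def called_without_lookup(env, was_called):
    # Hypothetical stand-in for the routine without the leftover lookup.
    return was_called


def run(routine, env, calls):
    """Call the routine `calls` times and return the number of lookups made."""
    for _ in range(calls):
        routine(env, True)
    return env.lookups


if __name__ == "__main__":
    print(run(called_with_leftover_lookup, CountingEnv(), 10000))
    print(run(called_without_lookup, CountingEnv(), 10000))
```

The point is only the scaling: the lookup count grows one-for-one with the call count, and on OpenVMS each of those lookups is a logical name translation that contends for the logical name mutex.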
Heuser-Hofmann
Frequent Advisor

Re: Excessive number of processes in MUTEX

"The online (not live) version of that Bruden presentation is a little like yelling fire! It basically just says "If you are using COBOL then beware of COB$5644 and contact support".
Enough to get you worried, not enough details to take the worry away or substantiate it."

When Guy produced his nice presentation no patch was available via anon ftp.

Eberhard