Operating System - OpenVMS

ANA/SYS show lock/sum

SOLVED
Wim Van den Wyngaert
Honored Contributor

ANA/SYS show lock/sum

Shows "no quota for operation" and "proposed new manager declined". On the subject of remastering.
On my cluster they have substantial values. What do they indicate ?

Wim
21 REPLIES
John Gillings
Honored Contributor

Re: ANA/SYS show lock/sum

Wim,
Could you post more detail? Perhaps a cut-and-paste of the actual output?
A crucible of informative mistakes
Wim Van den Wyngaert
Honored Contributor

Re: ANA/SYS show lock/sum

Of course, John. Here is the complete output.
It's on a GS160 with two CPUs.

SDA> show lock/sum

Lock Manager Summary Information:
---------------------------------
Lock Manager Flags:
Mode (LCKMGR_MODE) is automatic
Dedicated Lock Manager is disabled

Lock Manager Poolzone:
Poolzone Region Address: FFFFFFFF.82121D80
Packet Size: 00000100 (256.)
Number of Pages: 0000031A (794.)
Maximum Number of Pages: 00030C18 (199704.)
Free Page Count: 00000E2D (3629.)
Hits: 1F179CF1 (521641201.)
Misses: 00000A8F (2703.)
Poolzone Expansions: 00000D91 (3473.)
Allocation Failures: 00000000 (0.)
Allocation not from 1st Page: 00000000 (0.)
Empty Pages: 00000000 (0.)

Lock Manager per-CPU Performance Counters:
------------------------------------------
Counters \ CPU Id 0 1 Total
------------------------- ---------------------- ------------
LCKRQ Cache 0 0 0
LKB delete pending Cache 0 0 0
RSB delete pending Cache 0 0 0
LKB Cache 88 133 221
RSB Cache 168 31 199
LKB Allocations (cache) 1915358602 1882315377 3797673979
RSB Allocations (cache) 1913942197 1879929018 3793871215
New Lock Requests (local) 1395072696 1851452214 3246524910
New Lock Requests (in) 347138952 177934 347316886
New Lock Requests (out) 204537089 353679 204890768
Conversion Requests (loc) 1409289697 1074326271 2483615968
Conversion Requests (in) 171158606 120790 171279396
Conversion Requests (out) 122445710 49424582 171870292
Dequeue Requests (local) 1487526856 1751015839 3238542695
Dequeue Requests (in) 335314266 120338 335434604
Dequeue Requests (out) 64356609 144203032 208559641
$ENQ Requests that Wait 99824189 31134302 130958491

Lock Manager per-CPU Performance Counters:
------------------------------------------
$ENQ Requests not Queued 42218212 14093770 56311982
Blocking ASTs (local) 19126467 4552927 23679394
Blocking ASTs (in) 42419234 28399 42447633
Blocking ASTs (out) 12284461 11014835 23299296
Directory Functions (in) 565661475 290018 565951493
Directory Functions (out) 904697639 339876935 1244574574

Lock Manager Performance Counters:
----------------------------------
Deadlock Counters:
Deadlock Searches 271
Deadlock Found 0
Deadlock Messages (in) 34
Deadlock Messages (out) 14

Lock Remaster Counters:
Tree moved to this node 1807521
Tree moved to another node 1953666
Tree moved due to higher Activity 1953666
Tree moved due to higher LOCKDIRWT 0
Tree moved due to Single Node Locks 2191087
No Quota for Operation 342123
Proposed New Manager Declined 56504
Operations completed 3704671
Remaster Messages Sent 9452665
Remaster Messages Received 9185923
Remaster Rebuild Messages Sent 185382
Remaster Rebuild Messages Received 381606

Lock Manager Performance Counters:
----------------------------------

2-Phase Commit Counters:
Requests Sent 362830
Requests Received 1070093
Ready Messages Sent 1062465
Ready Messages Received 357783
ACK Messages Sent 357783
ACK Messages Received 1062465
Cancel Messages Sent 0
Cancel Messages Received 0
SDA>

Wim
Volker Halle
Honored Contributor
Solution

Re: ANA/SYS show lock/sum

Wim,

PMS$GL_RM_QUOTA_WAIT indicates the number of times that no remastering quota was available on the local node (CLUB$L_RM_QUOTA in the CLUB structure). The default quota is 5; it limits the number of remastering operations that can be in progress on the local node at any one time.

PMS$GL_RM_REQ_NAK counts the number of times the remote node (the proposed new master) declined to accept a remastering request for a resource tree (e.g. resource not found, shutdown in progress).
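
If you only want to watch these two cells rather than the full summary, here is a minimal SDA sketch (it assumes the PMS$GL_ cells resolve from the base image symbol table; on some versions you may first need an SDA READ command to load additional symbols):

$ ANALYZE/SYSTEM
SDA> EXAMINE PMS$GL_RM_QUOTA_WAIT
SDA> EXAMINE PMS$GL_RM_REQ_NAK
SDA> EXIT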

Volker.
Hein van den Heuvel
Honored Contributor

Re: ANA/SYS show lock/sum

That seems like an awful lot of lock remasters. Let's focus on that first and then see if there are still enough 'no quota' and 'declined' messages left over to worry about. Is this lock remastering rate as you expect / by design, or is it 'just happening'? Over how much time were those stats collected? Is this an application balanced over a cluster? Is there any way to steer access to certain files / databases to a particular member? It must be a somewhat recent VMS version to run on the GS160, but maybe a later one improves the lock remastering algorithm somewhat? Is it time to play with LOCKDIRWT?

Cheers,
Hein.
Wim Van den Wyngaert
Honored Contributor

Re: ANA/SYS show lock/sum

The node was up for about 200 days when this data was captured.
I tried to find out which tree was moving (with the script I posted, even without the wait), but the only one I found was rightslist.dat. All the other remasterings are invisible. That's why I hope HP will create a show remastering/int= command.

I think the remastering must have something to do with DSM/MUMPS, which is running in cluster mode. Another cluster running only Sybase servers shows far fewer remasterings.

Wim
Hein van den Heuvel
Honored Contributor

Re: ANA/SYS show lock/sum


>> I tried to find out which tree was moving (with the script I posted, even without the wait) but the only one I found was rightslist.dat.


Ah, I find that very interesting and can't help wondering how many resources were wasted on that. I call it wasted because this file will be 99.99% read-only for most customers. (Of course the system Wim is focusing on would be the exception to the rule :-). I wonder whether the (relatively) new concurrent-read lock usage by RMS when using global buffers would influence this.
Is this system using global buffers for rightslist? I find that advisable for many sites irrespective of the CR locking.
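
To answer that question, here is a quick sketch of how one could check (and, if desired, set) global buffers on rightslist from DCL; the count of 100 is purely an illustrative value, and a new count only takes effect the next time the file is opened:

$! show the current global buffer count (listed under the file attributes)
$ DIRECTORY/FULL SYS$SYSTEM:RIGHTSLIST.DAT
$! enable global buffers - 100 is just an example value
$ SET FILE/GLOBAL_BUFFER=100 SYS$SYSTEM:RIGHTSLIST.DAT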

Or how about... (am I really saying this?...) creating a per-node copy of rightslist and promising to put a fresh one in place (CONVERT/SHARE) after any/every update?
Yeah, I know... ugly, but let's just say we'd do this for educational / analysis purposes and not for sustained use in production?

Groets,
Hein.
Wim Van den Wyngaert
Honored Contributor

Re: ANA/SYS show lock/sum

Hein,

I don't think the rightslist is the bottleneck. It moves at most every 8 seconds (on average not even every minute), while I'm seeing the remaster counters increase by 50 per minute. I also see lots of packets being exchanged (about 300 per minute), while the rightslist lock list should be rather small.

Wim
Cass Witkowski
Trusted Contributor

Re: ANA/SYS show lock/sum

What is the value of the SYSGEN parameter PE1?

We have set it to -1 to turn off lock remastering with DSM. It is a dynamic parameter.

You may want to run the routine LKMSTA in your DSM environment and look at Lockman Writes. This shows contention between nodes in the cluster trying to write to the same DSM data block. Generally, when a routine is updating data in DSM, it locks that node, so if this happens often across the nodes in the cluster you may see the lock trees being remastered back and forth.

You can either mount that DSM volume local to one node and use DDP to update that volume from the other nodes, or rewrite the application to reduce the contention.
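
For anyone who wants to try this, a minimal sketch of checking and changing PE1 on a running node with SYSGEN (PE1 is dynamic, so WRITE ACTIVE is enough and no reboot is needed; to survive a reboot the change would also have to go into MODPARAMS.DAT or be written to CURRENT):

$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> USE ACTIVE
SYSGEN> SHOW PE1
SYSGEN> SET PE1 -1
SYSGEN> WRITE ACTIVE
SYSGEN> EXIT

This only changes the node you run it on; on a cluster you would repeat it on each member (and you can change it back the same way).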
Wim Van den Wyngaert
Honored Contributor

Re: ANA/SYS show lock/sum

Cass,

I did some monitoring during a busy interval of 45 minutes. The monitor interval in DSM was 60 seconds.

The average Lockman Writes rate was 0.74 per second, with a maximum of 2.2.

The total DEQ as well as ENQ rate was about 27 each. The maximum was about 270 each!

About 2000 tree moves were reported (exactly 0.74 per second, just like Lockman Writes), which required about 10000 packets to be exchanged (so 5 packets per move).

Conclusion: each Lockman Write results in a tree move???
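
For reference, the arithmetic behind that conclusion (assuming the DSM figures are per-second rates, which the match suggests):

2000 tree moves / (45 min * 60 s) = 2000 / 2700 = ~0.74 moves per second (= the Lockman Writes rate)
10000 packets / 2000 moves = 5 packets per move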

Wim
Wim Van den Wyngaert
Honored Contributor

Re: ANA/SYS show lock/sum

Lockman Writes:

"The number of times that the Write Demon, in response to contention for a block held at exclusive write access and subsequently modified in cache, needed to physically write a block to disk before releasing access to it."

Could it be that DSM uses the lock tree move mechanism for doing this (i.e. each Lockman Write requires a lock tree move)?

Wim
Volker Halle
Honored Contributor

Re: ANA/SYS show lock/sum

Wim,

No, I don't think the application is aware of lock remastering; it's happening 'behind the scenes'.

But each Lockman action may just produce enough lock activity in that resource tree to cause it to be moved?!

Volker.
Wim Van den Wyngaert
Honored Contributor

Re: ANA/SYS show lock/sum

Volker,

Could be, but it would be a miracle if each time the activity were on another node.

Wim
Volker Halle
Honored Contributor

Re: ANA/SYS show lock/sum

Wim,

I don't know the DSM internals, but wouldn't the description given by Cass lead to exactly this kind of behaviour?

If one node has the data block cached, the other node has to do something (with locking) to cause it to be written to disk, so that it can then read it itself. Then this node would have the block in cache, and the other node would have to do something to obtain the data block...

Something similar can happen with RMS global buffers if you are writing a lot to the same file from multiple nodes.

Volker.
Wim Van den Wyngaert
Honored Contributor

Re: ANA/SYS show lock/sum

"before releasing it". May be the tree simply moves because node A is wanting a lock and B releases the lock. Then a remaster "sole interest" is done.

Will do more monitoring ...

Wim
Cass Witkowski
Trusted Contributor

Re: ANA/SYS show lock/sum

DSM does not know about lock remastering. It does use OpenVMS locking for its locking.

I would look at either setting the SYSGEN parameter PE1 to -1 to turn off lock remastering, or locally mounting that DSM volume set.

You can use the ANASYS routines in DSM to show the cache contention.
Wim Van den Wyngaert
Honored Contributor

Re: ANA/SYS show lock/sum

Cass,

Thanks for helping me onto the right track.
The lock isn't visible via my scripts because the lock tree only exists for a fraction of a second.

But if I set PE1, it could harm other things on the cluster, so I will accept the tree moves in case of contention. The average is 0.74 per second, but that average is driven by peaks.
Local mounting is not an option because of high availability.

Wim
Cass Witkowski
Trusted Contributor

Re: ANA/SYS show lock/sum

Wim

We have PE1 set to -1 at over 100 sites. It is also a dynamic parameter, so you can change it and change it back.

Cass
Wim Van den Wyngaert
Honored Contributor

Re: ANA/SYS show lock/sum

Cass,

There could be big lock trees that NEED to move because the activity is on the other node. If the lock tree is not moved, a lot of overhead could be created. I have seen an average ENQ rate of 8000; if the tree isn't moved, that could lead to a serious slowdown.

I just need more commands to analyze what is happening. Right now a log of every remaster would be great.
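
In the absence of such a command, here is a rough sketch of the kind of logging I have in mind: a small DCL procedure that writes timestamped SHOW LOCK/SUMMARY snapshots so the remaster counter deltas can be read off afterwards (the file names, the one-minute interval and the 60-sample limit are just placeholders):

$! REMASTER_LOG.COM - take a timestamped lock manager snapshot once a minute
$ COUNT = 0
$ LOOP:
$ WRITE SYS$OUTPUT "---- ''F$TIME()' ----"
$ ANALYZE/SYSTEM
SHOW LOCK/SUMMARY
EXIT
$ COUNT = COUNT + 1
$ IF COUNT .GE. 60 THEN EXIT
$ WAIT 00:01:00
$ GOTO LOOP

Run it with the output captured, e.g. $ @REMASTER_LOG/OUTPUT=REMASTER_COUNTERS.LOG (or submit it as a batch job and keep the log file); the interesting numbers are the deltas in the "Lock Remaster Counters" section between snapshots.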

Wim
Jan van den Ende
Honored Contributor

Re: ANA/SYS show lock/sum

Wim,


>> I have seen an average enq rate of 8000.

One of our applications has a number of application-manager functions that run for 10 - 25 minutes, during which the ENQ/DEQ rate is between 100K and 250K.

Its database is a collection of multi-keyed (4-11 keys) RMS files, totalling ca. 3 GB.
Still, we have no complaints about the performance of other users, not even users of that same app.

fwiw,

Proost.

Have one on me.

jpe

Don't rust yours pelled jacker to fine doll missed aches.
Wim Van den Wyngaert
Honored Contributor

Re: ANA/SYS show lock/sum

Jan,

But would you set PE1 to e.g. 1 on your system?

Wim
Jan van den Ende
Honored Contributor

Re: ANA/SYS show lock/sum

Wim,

No.

No problems seen, nor any seemingly approaching => not the least reason to even consider tampering.

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.