PE1 values and identifying which lock tree remasters.

Thomas Ritter · ‎10-29-2006

We have a lock tree of about 1.3 million locks. In our four-node cluster this tree can sometimes move in rapid succession from one node to another. In the worst case we incur RDB stalls. We are certain which application is causing the problem.
We have pegged PE1 at about 1,000,000 locks, but then remote locking overhead causes other problems. We are starting to incur CPU saturation on the master node and the cluster begins to degrade.

We are directing the application people to correct their code. The problem is a combination of RDB global buffers and an application which keeps widening its search criteria when records are not found. Looking for a few records among millions is not a good combination when RDB global buffering is enabled.

The question is: how can one use SDA to precisely identify the lock tree which moves around and ultimately link this to a database?

VMS 7.3-2 and RDB V7.1-441, four node DT cluster.

Jean-François Piéronne · ‎10-29-2006

Thomas,

a few years ago we had the same problem, and even worse: closing the database crashed the system...
What we did was use a very large buffer size and a large page size. This drastically reduced the number of locks in the tree.

JF

Jean-François Piéronne · ‎10-29-2006

I forgot,

we also opened the database on only one node and started to use row cache. The other nodes of the cluster use remote access.

JF

Volker Halle · ‎10-29-2006

Thomas,

the LCK$SDA extension allows you to directly view the most active resource trees, including the name of the node that is currently mastering the tree and the resource name information:

SDA> LCK SHOW ACTIVE

If you repeat this command from time to time, when the resource tree is moving around, you should be able to see the (master) node name changing.
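
As a rough illustration of that sampling approach, here is a minimal Python sketch that counts how often the master node changes across a series of observations. It assumes you have noted by hand, for one resource tree, the time and the master node name each time you ran the command; the `count_remasters` function, the node names and the sample data are all hypothetical and nothing here parses real SDA output.

```python
from typing import List, Tuple


def count_remasters(samples: List[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Return (time, old_master, new_master) for each observed change of master node."""
    moves = []
    # Compare each sample with the one before it.
    for (_, prev_node), (when, node) in zip(samples, samples[1:]):
        if node != prev_node:
            moves.append((when, prev_node, node))
    return moves


# Hypothetical observations: (time noted, master node shown) for one resource tree.
samples = [
    ("10:00", "NODEA"),
    ("10:05", "NODEA"),
    ("10:10", "NODEB"),
    ("10:15", "NODEC"),
    ("10:20", "NODEC"),
]

moves = count_remasters(samples)
for when, old, new in moves:
    print(f"{when}: {old} -> {new}")
print(f"{len(moves)} observed moves in {len(samples)} samples")
```

Keep in mind that sampling can only undercount: a tree that moves away and back between two samples shows no change at all.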

On V7.3-2 there should also be SYS$EXAMPLES:RDB$SDA.C and .COM - these would allow you to build an RDB$SDA extension. It will allow you to display active DBs:

SDA> RDB SHOW ACTIVE_DB

If you really want to find out when and how often these lock trees get remastered, you might be able to obtain this information using CNX tracing:

SDA> CNX LOAD
SDA> CNX START TRACE/FUNCTION=REMASTER
...
SDA> CNX STOP TRACE
SDA> CNX SHO TRACE
SDA> CNX UNLOAD

You might need to experiment with different /FUNC and /FAC parameters to obtain the desired information. The resource/lock names will not be shown, but will be in the trace buffers.

Volker.

Thomas Ritter · ‎10-30-2006

Jean, I like your idea of row caching and remote access. Do you have any numbers describing the task? This is the first time I have heard that row caching and remote access might be doable. Row caching has been ignored as a solution because cluster-wide access is required.

Volker Halle · ‎10-30-2006

Thomas,

I have played with SDA and CNX tracing a little bit and here is how you can actually trace resource trees moving (tested on V8.2):

$ ANAL/SYS
SDA> CNX LOAD
SDA> CNX START TRACE/FAC=LCK/FUNC=(RM_Req,RM_Complete)
...
SDA> CNX STOP TRACE
SDA> CNX SHOW TRACE/FULL

When done, use SDA> CNX UNLOAD to unload the trace code.

A node giving up a resource tree will send (Tx) an RM_Req message to the remote node. Once the rebuild has happened, the remote node will reply (Rx) with an RM_Complete message.

SDA> CNX SHOW TRACE/FULL will print the root resource name.
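
Once you have the trace, a tally per root resource name may help to show which tree moves most often. The Python sketch below is only an illustration under stated assumptions: it supposes you have transcribed each trace record by hand into a (direction, message, root resource name) tuple. The tuple layout, the `trees_by_moves` function and the TREE_1/TREE_2 names are hypothetical; it does not read the real trace format.

```python
from collections import Counter


def trees_by_moves(events):
    """Count RM_Req messages sent (Tx) per root resource name, most frequent first."""
    counts = Counter(
        resource
        for direction, message, resource in events
        if direction == "Tx" and message == "RM_Req"
    )
    return counts.most_common()


# Hypothetical records, transcribed by hand: (direction, message, root resource name).
events = [
    ("Tx", "RM_Req", "TREE_1"),
    ("Rx", "RM_Complete", "TREE_1"),
    ("Tx", "RM_Req", "TREE_2"),
    ("Rx", "RM_Complete", "TREE_2"),
    ("Tx", "RM_Req", "TREE_1"),
    ("Rx", "RM_Complete", "TREE_1"),
]

print(trees_by_moves(events))  # [('TREE_1', 2), ('TREE_2', 1)]
```

Only the Tx RM_Req records are counted, so the result reflects the trees this node gave up while tracing was active.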

Volker.

Jean-François Piéronne · ‎10-30-2006

Thomas,

You may find some articles about row cache at:
http://www.oracle.com/technology/products/rdb/index.html
or on MetaLink.

The main drawback is that the database can be open on only one node. This is why the other nodes use remote access (DECnet or TCP/IP).

But access to Rdb data through row cache may be much, much faster than through the standard cache. You can expect a 90% cut in lock activity if you can cache the most active data this way.

We have some programs which are 3 times faster when the data are cached in row cache instead of in global buffers.
You will probably have to do some experiments
and carefully check your RMU statistics and access profile (no exclusive transactions and no sequential access, for example).

JF

comarow · ‎10-30-2006

This is a bit of a side note, but are you using a dedicated lock management CPU?
In situations such as yours it can often help.

If you are getting a lot of lock tree remastering, especially with RDB, that is exactly the reason PE1 was written. Generally, you need to decide specifically where you want the mastering to take place, and set your LOCKDIRWT accordingly. You seem to know the application.

You can't have that many huge databases.

There's no substitute for application design.

Incidentally, and also a tangent: regular file maintenance will dramatically improve locking behavior.

John Gillings · ‎11-15-2006

Thomas,
When (if?) you can get your systems to V8.2, please make sure you look at the new SYSGEN parameter LOCKRMWT. It can be used for much finer control of lock remastering between nodes than was available with PE1.

From the New Features manual

Lock Re-mastering Improvements
* Provides more control over lock re-master decision making with the new LOCKRMWT system parameter
* Remote activity thresholds necessary to move a tree are now computed based on local activity rates
* Provides greater control of application performance within an OpenVMS cluster
* Reduces the possibility of lock trees thrashing between nodes in an OpenVMS

A crucible of informative mistakes

Volker Halle · ‎11-15-2006

John,

When (if?) you can get your systems to V8.2, please make sure you look at the new SYSGEN parameter LOCKRMWT.

Slight correction: this parameter is only available with OpenVMS V8.3

Volker.
