Operating System - OpenVMS

DLM & RMS Multistreaming

Ruslan R. Laishev
Super Advisor

DLM & RMS Multistreaming

Hi there!

When a file is opened in multistream mode, RMS uses the DLM to coordinate access to the global buffer area.

Is there any hope that this "feature" will be fixed in a future RMS ECO?

Thanks.
13 REPLIES
Hein van den Heuvel
Honored Contributor

Re: DLM & RMS Multistreaming

I'm sorry, but I'm having a hard time understanding your question, and I suspect it is even harder for other readers. What is the problem?

I can make a few general remarks in this area, though, that might just help.

The RMS global buffer code expects and thus requires bucket locking to be active.

RMS locking is turned on if there is any type of shared write access (FAB$B_SHR = SHRPUT or SHRUPD or SHRDEL), or if there is write access allowing other readers (FAB$B_FAC = PUT or UPD or DEL or TRN, and FAB$B_SHR = SHRGET), or if MSE is used.

If RMS locking would not normally be used, say when accessing a read-only file, it can always be turned on by specifying MSE access (FAB$B_SHR = SHRGET & MSE).
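
In C terms, forcing locking on for a read-only open might look roughly like this (a sketch only, not compiled here; the file name is a placeholder and the mask names should be checked against your fab.h):

#include <rms.h>
#include <starlet.h>

static char fname[] = "DATA.DAT";               /* placeholder file name */

int open_readonly_with_locking(struct FAB *fab)
{
    *fab = cc$rms_fab;                          /* start from the FAB prototype  */
    fab->fab$l_fna = fname;
    fab->fab$b_fns = sizeof(fname) - 1;
    fab->fab$b_fac = FAB$M_GET;                 /* read-only access              */
    fab->fab$b_shr = FAB$M_SHRGET | FAB$M_MSE;  /* shared read + multistream:
                                                   turns RMS locking on even for
                                                   a read-only file              */
    return sys$open(fab);
}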

The reason RMS ignores a request for global buffers when read-only access is requested is that 20+ years ago it was believed that the locking overhead would be worse than the caching win. Personally I have never been convinced of this, and with the VMS 7.2+ improvements in the global buffer code it is even less likely to be true.

While VMS Support and engineering have been requested to explain this every now and then, I don't believe they ever received a specific request to 'fix it'. Do NOT hold your breath for this to change. Not in the next ECO, not ever.

I hope this helps some. If not, please articulate more clearly what you believe RMS should fix, and more importantly please explain what problem you really are trying to solve.

If you believe something should change in RMS, then please escalate through the official support channels, with a clear business justification.

Good Luck,
Hein. (RMS Engineering 1991 - 1994)
Ruslan R. Laishev
Super Advisor

Re: DLM & RMS Multistreaming

Hello!

Thanks, Hein, for working through my English.

For demonstration of the problem:

http://starlet.deltatelecom.ru/~laishev/lck.c
http://starlet.deltatelecom.ru/~laishev/lck1.c

I hope that my C is better than my English.

A high cluster lock rate when MSE is in use was escalated to DEC support some time ago. Here is the answer from the DEC engineer:

-------------------------
I may have discovered the cause of the performance issue you are seeing. I have certainly discovered the cause of the
high Distributed Lock (DLOCK) rates. If the performance problem is due to the high DLOCK rates, then we're home free.

The problem is in using the FAB$V_MSE option in the FAB$B_SHR field. I shared the code you provided with engineering,
and one of the engineers was gracious enough to point out the following sentence in the documentation for the MSE option.

>>>>>>>>>>>>>>>>>>>>>>>>>>>
...Selecting the FAB$V_MSE options turns on locking to coordinate access to buffers.
<<<<<<<<<<<<<<<<<<<<<<<<<<<

I modified the code you sent to declare the FAB in the child thread, as well as do the open there, and did NOT set the
MSE option. This allowed me to run your program without seeing any growth at all in the distributed locking rates.
-------------------------
Hein van den Heuvel
Honored Contributor

Re: DLM & RMS Multistreaming

Right, so this is pretty much in line with my earlier reply. Read it again carefully.

The MSE option allows RMS to share buffers between multiple streams in a single process. Of course, if you share data structures between streams, then that access must be coordinated.
RMS does NOT have, and will not have, a low-cost locking/sharing option. It either uses no locking, or goes all out with the standard, cluster-aware, VMS lock manager... only using EXCLUSIVE locks on older VMS versions.
You would then see DLOCK if the lock tree owner happened to start on the wrong node. Possibly you can battle that by allowing/forcing lock migration or careful per-node access planning.

The only change RMS made in this space in recent versions is to start using CONCURRENT READ locks under certain circumstances. I'll try to re-reply here with details such as file organizations and VMS versions/patch levels for this.

In the classic read-only case it is very well possible that it is more efficient to re-read the same data into multiple, per-stream, program buffers than to attempt to share them. The VMS VCC or XFC, or even the controller/disk cache, will likely make the IOs quick. Yes, RMS could perhaps be smarter about this, but it is not, and considering the current development plans / maturity this will not change.

Of course if you can show VMS Engineering how it can make multi-million dollar future sales if this were to be implemented... but there is no point discussing this in this forum here, not with support.

Is the target of those ZZZ000 files an indexed file or a simple sequential file? If it is just a sequential file, and you want to speed up access, the knobs to turn are:
RAH = 1, MBC = 64 (127?), MBF = 4 (8?)
This may well give a 3x performance boost.
(The default is RAH=0,MBC=16,MBF=1)
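
In C, those same knobs on the RAB before the $CONNECT would look roughly like this (a sketch only; the field names are from rab.h):

#include <rms.h>

/* Sketch: tune a record stream for fast sequential reads.
   Call on the RAB before sys$connect().                   */
void tune_sequential_rab(struct RAB *rab)
{
    rab->rab$l_rop |= RAB$M_RAH;   /* read-ahead (RAH = 1)              */
    rab->rab$b_mbc  = 64;          /* multi-block count: 64 (up to 127) */
    rab->rab$b_mbf  = 4;           /* multi-buffer count: 4 (try 8)     */
}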

hth,
Hein.
Ruslan R. Laishev
Super Advisor

Re: DLM & RMS Multistreaming

[quote]
Is the target of those ZZZ000 files an indexed file or a simple sequential file?
[/quote]
Sequential file.

[quote]
If it is just a sequential file, and you want to speed up access, the knobs to turn are:
RAH = 1, MBC = 64 (127?), MBF = 4 (8?)
This may well give a 3x performance boost.
(The default is RAH=0,MBC=16,MBF=1)
[/quote]
Will these settings remove the high lock rate problem? Please don't forget that the primary cause of the slow file access is cluster locks.

I have an application where a single file is read by 30-60 threads, and every thread reads the file from start to end while processing a request. The request rate is 20-30 per second.
Your advice will require much larger quotas than the global buffers case. :(
Ruslan R. Laishev
Super Advisor

Re: DLM & RMS Multistreaming

[quote]
The MSE option allows RMS to share buffer between multiple streams in a single process. Of course, if you share data structures between streams, then that access must be coordinated.
[/quote]
Yes.

[quote]
RMS does NOT have, and will not have, a low-cost locking/sharing option.
[/quote]
WHY? Hein, what about mutexes, semaphores and other low-cost IPC mechanisms?
Hein van den Heuvel
Honored Contributor

Re: DLM & RMS Multistreaming

> Will these settings remove the high lock rate problem?
> Please don't forget that the primary cause of the slow file access is cluster locks.

How can I forget? I never even knew!
You never told us about the real problem nor the real target! We finally learned it was a sequential file. We still have to guess whether it is 50 KB, 50 MB or 50 GB in size.
We still have to guess whether you can force access from one node (no DLOCK needed) or whether you must be ready for concurrent access from more than one node. I believe we know the application itself only reads, and does not allow for shared write.

> I have an application where a single file is read by 30-60 threads,
> and every thread reads the file from start to end while processing a request.

And you think RMS is not smart enough, huh?
Maybe, just maybe, this application can do something smarter?

> The request rate is 20-30 per second.

Sorry, again a translation problem.
Are you talking locks/second? io/sec? file scans/sec?

> Your advice will require much larger quotas than the global buffers case. :(

And is there a 'quota problem'? We have yet to learn about that. Yes, the global buffers have the potential to offer full file sharing with just a single copy in memory... if you can hold the whole file in memory. You never gave an indication as to how big the file was, or how much/little memory is available.
My suggestion is based on the expectation that you cannot cache the whole file and I am just making the reading go faster. By NOT having MSE, and NOT having global buffers, and using multiple large local buffers with read-ahead, it will read optimally... at the price of a set of buffers per stream. Is that a problem?

Once you disabled record locking, as I believe you did with NQL + RRL + NLK, I also happen to believe that large buffers might re-allow you to try GBC + MSE, because you will have only bucket locks, and fewer of them! The default is 16 blocks/buffer and you can make it 127 blocks, thus reducing your lock rate by a factor of almost 8. As I said, I need to check back whether concurrent read has become available for sequential files in global buffers, but I doubt it.
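
Roughly, in C, the combination I mean (a sketch only, not compiled; RAB$M_NQL only exists on recent versions, so check your rab.h, and the GBC value here is just an example):

#include <rms.h>

/* Sketch: global buffers + MSE on the FAB, no record locking on the
   stream.  Apply before sys$open()/sys$connect().                   */
void set_shared_readonly_options(struct FAB *fab, struct RAB *rab)
{
    fab->fab$b_shr  = FAB$M_SHRGET | FAB$M_MSE;  /* share buffers across streams       */
    fab->fab$w_gbc  = 512;                       /* enough global buffers for the file */

    rab->rab$l_fab  = fab;
    rab->rab$l_rop |= RAB$M_NLK      /* take no record locks                    */
                   |  RAB$M_RRL      /* read regardless of other locks          */
                   |  RAB$M_NQL;     /* no query lock on $GET (newer versions)  */
}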


Why was this not fixed? Because there was no clear, outspoken customer demand for it and there were plenty of other things to do!
The application you describe is NOT typical.
I believe you would be MUCH better off investigating CRMPSC to MAP the whole file into memory accessible by all threads. Just skip RMS completely, avoid all its locks, and avoid all the overhead of diving into exec mode, probing user arguments, probing user buffers, manipulating AST states and so on and so forth.
Just have a single application-level synchronization to indicate how far the data is valid.
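
A rough sketch of that approach in C (untested; the file name is a placeholder, and the SYS$CRMPSC argument details, in particular the zero page count to map the whole file, should be checked against the System Services Reference):

#include <rms.h>
#include <secdef.h>
#include <starlet.h>

static char fname[] = "DATA.DAT";               /* placeholder file name */

/* Sketch: open the file "user file open" style to get a channel in
   FAB$L_STV, then map it read-only into P0 space with SYS$CRMPSC.
   Every thread can then scan the mapped pages with no RMS calls and
   no DLM traffic at all.                                            */
int map_file_readonly(unsigned int retadr[2])
{
    struct FAB fab = cc$rms_fab;
    unsigned int inadr[2];
    int status;

    fab.fab$l_fna = fname;
    fab.fab$b_fns = sizeof(fname) - 1;
    fab.fab$b_fac = FAB$M_GET;
    fab.fab$b_shr = FAB$M_SHRGET;
    fab.fab$l_fop = FAB$M_UFO;                  /* user file open: no RMS access,
                                                   the channel comes back in
                                                   FAB$L_STV                    */
    status = sys$open(&fab);
    if (!(status & 1)) return status;

    inadr[0] = inadr[1] = 0x200;                /* any P0 address: with SEC$M_EXPREG
                                                   the system picks the real range  */
    status = sys$crmpsc((void *) inadr, (void *) retadr, 0,
                        SEC$M_EXPREG,           /* private section, read-only        */
                        0, 0, 0,
                        (unsigned short) fab.fab$l_stv,   /* channel from the open   */
                        0,                      /* page count 0: whole file (assumed) */
                        0, 0, 0);
    return status;                              /* retadr[0..1] bound the mapping    */
}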

Good luck,
Hein.



Ruslan R. Laishev
Super Advisor

Re: DLM & RMS Multistreaming

>How can I forget? I never even knew!
Sorry. You are right.

> You never told us about the real problem nor the real target! We finally learned it was a sequential file. We still have to guess whether it is 50 KB, 50 MB or 50 GB in size.
The file is about 384 blocks.

> We still have to guess whether you can force access from one node (no DLOCK needed) or whether you must be ready for concurrent access from more than one node.
... from several cluster members.

> I believe we know the application itself only reads, and does not allow for shared write.
Only read.

> I have an application where a single file is read by 30-60 threads,
> and every thread reads the file from start to end while processing a request.

> And you think RMS is not smart enough, huh?
I don't just think it; I can see it.

> Maybe, just maybe, this application can do something smarter?
Yes, if I could wean my clients/customers off plain-text files. I don't have enough power to do that at the moment. :-)

>> The request rate is 20-30 per second.

> Sorry, again a translation problem.
> Are you talking locks/second? io/sec? file scans/sec?
A "request" is a single run over the file from beginning to end.

> Your advice will require much larger quotas than the global buffers case. :(


> My suggestion is based on the expectation that you cannot cache the whole file and I am just making the reading go faster.
I expected that GBC would help.


> Once you disabled record locking, as I believe you did with NQL + RRL + NLK, I also happen to believe that large buffers might re-allow you to try GBC + MSE, because you will have only bucket locks, and fewer of them!
Very interesting....


> The default is 16 blocks/buffer and you can make it 127 blocks, thus reducing your lock rate by a factor of almost 8. As I said, I need to check back whether concurrent read has become available for sequential files in global buffers, but I doubt it.
Very interesting....

> Why was this not fixed?
Because "multistreaming with multiple FABs" has advantages over "multistreaming with multiple RABs" and doesn't cause the high lock rate, doesn't it? And if RMS "wants" to perform synchronization between multiple streams in a single process, it may/can/must use low-cost IPC.
Hein van den Heuvel
Honored Contributor

Re: DLM & RMS Multistreaming

hein> As I said, I need to check back whether concurrent read has
Hein> become available for sequential files in global buffers,

What version of VMS have you been testing with?
I see now that the concurrent read fix I alluded to was implemented in 7.3.

Check out the 7.3 New Release Manual chapter 5.12.1.1 "RMS Global Buffer Read-Mode Locking"
http://h71000.www7.hp.com/doc/73FINAL/6620/6620pro_005.html#rms_buff_topic

It sure sounds like this would fix your performance problem completely. Just use global buffers, with enough buffers to cache the file: so 384/16, rounded up to be safe, say 30
(do a SHOW RMS to check that MBC is 16; 7.3-1 bumps it to 32).
You can also take full control: first SYS$OPEN,
check the ALQ field, divide by MBC, round up and plug the result into GBC before doing the connect.
At sys$connect, RMS looks at FAB$W_IFI and FAB$W_GBC through the RAB$L_FAB pointer.
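
A sketch of that sequence in C (not compiled; the MBC of 16 is an assumption here, use whatever is actually in effect):

#include <rms.h>
#include <starlet.h>

/* Sketch: after SYS$OPEN, size the global buffer count from the file's
   allocation and the blocks-per-buffer value, plug it into FAB$W_GBC,
   and let SYS$CONNECT pick it up through RAB$L_FAB.                    */
int open_and_connect_with_gbc(struct FAB *fab, struct RAB *rab)
{
    int status = sys$open(fab);
    if (!(status & 1)) return status;

    rab->rab$b_mbc = 16;             /* blocks per buffer actually used (assumed) */
    fab->fab$w_gbc = (fab->fab$l_alq + rab->rab$b_mbc - 1)
                   / rab->rab$b_mbc; /* ALQ / MBC, rounded up                     */

    rab->rab$l_fab = fab;            /* $CONNECT finds IFI and GBC through this   */
    return sys$connect(rab);
}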

hein> I also happen to believe that large buffers might re-allow you
hein> to try GBC + MSE because you will have only bucket locks and fewer of them!
>Very interesting....

You are too kind. You could just say that my last remark made no sense. I was all wrong there (or at least until CR locks are active) because RMS will get and release the bucket lock around each record operation, so the number of records per bucket does not matter much.

So Try GBC + MSE under 7.3 (or better!)

Hein.




Ruslan R. Laishev
Super Advisor

Re: DLM & RMS Multistreaming

> What version of VMS have you been testing with?
> I see now that the concurrent read fix I alluded to was implemented in 7.3.
7.3-1/Alpha and 7.3/Alpha.

> You can also take full control: first SYS$OPEN, check the ALQ field, divide by MBC, round up and plug the result into GBC before doing the connect.
Did it.

>So Try GBC + MSE under 7.3 (or better!)
Sorry, but the first time we saw the problem was after upgrading from 7.2-1 to 7.3.


>Hein.
Many thanks for your time, Hein.
Hein van den Heuvel
Honored Contributor

Re: DLM & RMS Multistreaming


hein>So Try GBC + MSE under 7.3 (or better!)
> Sorry, but the first time we saw the problem
> was after upgrading from 7.2-1 to 7.3.

Well that is really annoying because that's when it was supposed to get better.
You really _did_ have global buffers on, huh?
Did you verify with say INSTALL LIST?
Or with ANAL/SYST... SHOW PROC xxx/RMS=(FAB,GBH)?
How about switching on RMS stats: SET FILE/STAT
Followed by MONI RMS (or my rms_stats freeware)

Anyway, this all sounded sufficiently suspicious that I forwarded this topic to RMS Engineering and here is the reply (thanks Bob!)
------ begin -----
I took a look at the discussion, as well as the customer's
example code. His LCK1.C program which is setting the
MSE bit in order to allow global buffer use looks like it is
indeed utilizing the new (V7.3) system level fork lock for
concurrent reads as expected. In executing his program,
I find that there are no locks being accessed once the
initial system level CR locks are obtained (using MONITOR
DLOCK/LOCK).

Obviously, this only occurs if global buffers are enabled on
the file since the system level fork lock for concurrent reads
is only implemented on files with global buffers. Without the
global buffers enabled, I see the expected lock conversions
(5000+ per "monitor lock" sample interval on my system).

His other settings in addition to MSE sharing and global buffers
set to a large enough value to contain the file (nql, rrl, nlk) are
necessary as well. Without these settings, we see the typical
enq/deq of the query lock.

Is the customer certain that he has performed his testing with
global buffers set to a nonzero value? I had to manually set
the file since the program itself does not specify a GBC.

As far as I can tell, the new V7.3 features are working as designed
and locking is virtually eliminated with the No Query Lock and
system level concurrent read fork lock. I actually performed my
testing on an Opal release (X9VW) using his lck1.c program, but
I am not aware of any problems with the V7.3-V7.3-1 releases
that would prevent this.
------ end ------



Ruslan R. Laishev
Super Advisor

Re: DLM & RMS Multistreaming

Hi Hein, Bob.

Can you explain to me what the difference is in lock usage between "multi-FAB" and "multi-RAB" access to a shared file?

Thanks.

Hein van den Heuvel
Honored Contributor

Re: DLM & RMS Multistreaming


Multi-FAB sharing within a single process is exactly like multiple processes sharing the same file. There will be no awareness that it is in fact the same program. So RMS will map the global buffers (and the optional statistics section) multiple times, and so on.

Multi-RAB sharing allows RMS to share the same local and global buffers (and BDBs and so on) but have independent RABs and IRABs, to be able to retain independent context (notably for sequential $GET, update and delete operations).

As an added twist, SHR=GET + FAC=GET would normally NOT use global sections, but toss in MSE and it will (even if there is really only a single stream).
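
To illustrate the multi-RAB case in C (a sketch, not compiled; one open, two connects, so both streams share the same buffers; the file name is a placeholder):

#include <rms.h>
#include <starlet.h>

static char fname[] = "DATA.DAT";               /* placeholder file name */

static struct FAB fab;
static struct RAB rab_a, rab_b;

/* Sketch: one SYS$OPEN, two SYS$CONNECTs on the same FAB.  Both streams
   share the FAB's local/global buffers; with two FABs each open would
   get its own buffer set, just like two separate processes.            */
int open_two_streams(void)
{
    int status;

    fab = cc$rms_fab;
    fab.fab$l_fna = fname;
    fab.fab$b_fns = sizeof(fname) - 1;
    fab.fab$b_fac = FAB$M_GET;
    fab.fab$b_shr = FAB$M_SHRGET | FAB$M_MSE;   /* MSE: let the streams share buffers */

    status = sys$open(&fab);
    if (!(status & 1)) return status;

    rab_a = cc$rms_rab;  rab_a.rab$l_fab = &fab;
    rab_b = cc$rms_rab;  rab_b.rab$l_fab = &fab;

    status = sys$connect(&rab_a);               /* stream 1 */
    if (!(status & 1)) return status;
    return sys$connect(&rab_b);                 /* stream 2, same buffers */
}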

Note: While I'd love to see a resolution posted here eventually, I believe we are now (back) at a point in the discussion where formal support will need to be engaged, if a problem remains.

In other words: this is all the 'free advice' you are going to get from me, Bob, ...

Cheers,

Hein.

Ruslan R. Laishev
Super Advisor

Re: DLM & RMS Multistreaming

I see another interesting effect:
http://starlet.deltatel.ru/~laishev/lck.c - this program shows DIRIO:0 and BUFIO:0 counters after the first thread has run over the file. That looks as expected.

But http://starlet.deltatel.ru/~laishev/lck1.c consistently shows non-zero DIRIO. The test plain-text file is about ~300 blocks, and the global buffer count is 512. What is a possible reason for it to perform direct I/O?

Thanks, Hein.