Operating System - OpenVMS

Re: How to optimize subroutine placement in DCL

 
Wim Van den Wyngaert
Honored Contributor

How to optimize subroutine placement in DCL

Test1 : 10000 gosub x. x is directly after the gosub code. Result : 3933 direct IO's.

Test2 : same, but x is 4200 lines of DCL further down the script. Result : 23947 IO's. CPU went up by 15% compared with test 1.

It didn't read the DCL procedure 10000 times. What is the algorithm DCL uses for caching / reading / searching ?
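A minimal sketch of how such a measurement could be set up, assuming the per-process direct I/O and CPU counters returned by F$GETJPI are the numbers being compared. The label and symbol names are illustrative and not taken from the actual procedure; for the Test2 layout, the filler lines would sit between the EXIT and the x: label.

```dcl
$! Sketch only: count direct I/Os and CPU time around 10000 GOSUBs.
$ dio_start = f$getjpi("","DIRIO")      ! direct I/O count so far
$ cpu_start = f$getjpi("","CPUTIM")     ! CPU time so far, in 10 ms ticks
$ count = 0
$ loop:
$   gosub x
$   count = count + 1
$   if count .lt. 10000 then goto loop
$ dio_used = f$getjpi("","DIRIO") - dio_start
$ cpu_used = f$getjpi("","CPUTIM") - cpu_start
$ write sys$output "Direct I/Os: ", dio_used
$ write sys$output "CPU ticks:   ", cpu_used
$ exit
$! Test2 variant (assumption): put the 4200 filler lines here.
$ x:
$   return
```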

Wim

33 REPLIES
Wim Van den Wyngaert
Honored Contributor

Re: How to optimize subroutine placement in DCL

x is directly after the gosub code.

must be
x is directly after the gosub call.

Wim
John Reagan
Respected Contributor

Re: How to optimize subroutine placement in DCL

Did those IOs turn into real disk reads or did XFC have them in cache?
Jess Goodman
Esteemed Contributor

Re: How to optimize subroutine placement in DCL

I'm not certain exactly what your question is, but I can confirm that in a very long DCL command procedure there is a very significant performance advantage to putting a GOSUB routine close to the corresponding GOSUB statements, even if you have to branch around the GOSUB routine with an ugly "GOTO label" before it and "label:" after it.

If the GOSUB routine is at the bottom of a long command procedure, DCL apparently re-reads the entire procedure with every GOSUB statement in order to find the GOSUB label. My guess is that DCL only caches x number of statement labels.
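A minimal sketch of the branch-around layout described above. The labels x and skip_x and the loop count are made up for illustration; only the placement of the routine relative to its callers matters.

```dcl
$! Sketch only: keep the GOSUB routine next to its callers and
$! branch around it so it is never entered by falling through.
$ goto skip_x
$ x:
$   return
$ skip_x:
$ count = 0
$ loop:
$   gosub x
$   count = count + 1
$   if count .lt. 10000 then goto loop
$ exit
```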
I have one, but it's personal.
Wim Van den Wyngaert
Honored Contributor

Re: How to optimize subroutine placement in DCL

The IO was satisfied by VIOC. Will test on Monday with XFC.

Wim
John Gillings
Honored Contributor

Re: How to optimize subroutine placement in DCL

Wim,

Stating the bleeding obvious...

If you want optimized code, use an optimizing compiler, NOT DCL.

DCL has some syntactic peculiarities that mean it isn't possible to just record the location of every label for easy jumping. Nature of the beast. Ever seen a dynamic label? It's perfectly legal syntax and works "mostly" as you'd expect. Also consider IF THEN ELSE scoping, SUBROUTINES, forward and backward references and potential duplicates. Again, this is the nature of an unstructured, untyped, interpreted language.

IMHO 4000 lines of DCL in a single procedure is insane. Restructure it into multiple procedures for better performance, and easier debugging (simple experiment, find an ENDIF statement after the first 100 or so lines, remove the leading "$" and see how long it takes your best DCL programmer to find the bug).

DCL is a fine language for some things, but when you find yourself banging up against its inherent limitations, it's time to reimplement your program in a more appropriate language.
A crucible of informative mistakes
Hein van den Heuvel
Honored Contributor

Re: How to optimize subroutine placement in DCL

>> What is the algorithm of the caching / reading / searching DCL ?

DCL remembers RMS RFA's for labels, so it can tell RMS to 'jump' to the right spot.

DCL uses RMS in Process IO context with a single buffer of an OpenVMS version dependent size. In my case (Alpha 8.3) that was 4096 bytes.

If the calling line and the ENTIRE call/gosub code executed are not in the same buffer, then read IOs will happen.

So just go see for yourself what it does!
Start up two sessions:
1:
$SET PROC/NAME=TEST
$@TEST

2:
$SET PROC/PRIV=CMKRNL
$ANALYZE/SYSTEM
SDA> SET PROC TEST
SDA> SHOW PROC/RMS=(PIO,NOIFB:3,RAB,BDBSUM)

Suggested test.com (verbatim!):

$ Loop:
$ Read/prompt="Go Far? " sys$command x
$ if x
$ then gosub far
$ eLse gosub near
$ endif
$ goto loop
$ near:
$ read/prompt="Return from near? " sys$command x
$ return
$ x=1 !The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
$ x=1 !The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
:

:
$ x=1 !The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
$far:
$read/prompt="Return from far? " sys$command x
$return


For bonus appreciation and learning, convert the above to an INDEXED file. (Spelling Magic involved here!).

$ conv/fdl="fil; org ind; key 0; dup yes; seg0_l 4" test.com test_idx.com
$ type test_idx.com
$ diff test.com test_idx.com

Now try again.

You'll see you get 2 buffers, each one bucket (default 2 blocks) big. But one of the buffers is used for the index root and has a cache value keeping it there. Thus you'll get the same, or more, IOs depending on the exact fill of the buckets.

Are we having fun yet?

Regards,
Hein
Hein van den Heuvel
Honored Contributor

Re: How to optimize subroutine placement in DCL

I wrote "verbatim!" to honor spaces and case, but I meant 'mostly verbatim' ;-)
(a little bit pregnant).

The part with:
:

:

is supposed to represent a bunch of those long lines, where "a bunch" is more than 50, so that we exceed 4000-ish bytes using 100 of those 35-letter pangrams + stuff.
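A minimal sketch of a generator that writes out the full test.com, filler included, so the long lines do not have to be pasted by hand. The count of 60 filler lines is an assumption (anything above 50 should do per the description above), and the logical name out and the label fill are made up for illustration.

```dcl
$! Sketch only: build test.com with enough filler to exceed one buffer.
$ open/write out test.com
$ write out "$ Loop:"
$ write out "$ Read/prompt=""Go Far? "" sys$command x"
$ write out "$ if x"
$ write out "$ then gosub far"
$ write out "$ eLse gosub near"
$ write out "$ endif"
$ write out "$ goto loop"
$ write out "$ near:"
$ write out "$ read/prompt=""Return from near? "" sys$command x"
$ write out "$ return"
$ n = 0
$ fill:
$   write out "$ x=1 !The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog."
$   n = n + 1
$   if n .lt. 60 then goto fill     ! 60 is an assumed filler count
$ write out "$far:"
$ write out "$read/prompt=""Return from far? "" sys$command x"
$ write out "$return"
$ close out
```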

Hein.
Wim Van den Wyngaert
Honored Contributor

Re: How to optimize subroutine placement in DCL

Same result with XFC (7.3). XFC prevents the IO's.

John G.,

The procedure monitors the systems and has been running for 8 years without major problems. But it has very strict debugging built in (1 warning and it aborts + restarts, saying in which part it aborted -> at most 100 lines of DCL are possible). The advantage is that ANY system manager can read/debug/modify it, in contrast with the Pascal and Fortran programs we also have. It also has the advantage that it doesn't create extra processes (I parse the output of ncl, tcpip, ...). BTW : I already found that subprocedures (@) and calls are more expensive than gosubs.

So, no way I will go to a programming language.

So, just trying to optimize my Fiat (or whatever small car you have in .au), not planning to buy a Mercedes.

Wim
Wim Van den Wyngaert
Honored Contributor

Re: How to optimize subroutine placement in DCL

Still don't get it.

Tried only 1 gosub. Directly after : 42 IO's. 4000 lines further : 56 IO's. The procedure is 248 blocks on disk. Did it read the entire procedure in about 10 IO's ?

The gosub only contained return. So I would expect that read by RFA only requires 1 IO. Then why did I get a punishment of 20000 IO's in the original test ? 2 IO's for each RFA read ?

And how to explain the 3933 IO's from test1 ? If it had to read something again, I would expect at least 10000 IO's.

Wim