Operating System - OpenVMS

What is the impact of accessing an extra file

 
SOLVED
Jan van den Ende
Honored Contributor

What is the impact of accessing an extra file

Does anybody have an idea about the impact of accessing an extra file vs. GOSUB-ing?

I am to assist in converting a free-text database (Basis+).
It has a query language that functions more or less like SQL (but it IS quite another beast!)

The problem: the trial runs for the pilot extrapolate to MONTHS of exporting! (I'm told that DUMPing and converting the output cannot generate data in a usable format because that loses important connections; for lack of info I just have to accept that.)
I have been looking into the export code (.PRC files, comparable to .SQL files), and PER processed "record" there are anywhere from 10 to over 100 activations of other .PRC files (all in all 20 different ones).

If the number of records goes over 100,000 for the pilot, I guess the activations might become relevant (??).
Does anybody have an educated guess whether I should tell them to consolidate the .PRC code into a single chunk, converting the separate-file activations to GOSUBs, or will there be very little gain because caching takes care of it?
Formulated another way: is the overhead of switching to code read from a separate (in-cache) file and returning comparable to GOSUB & RETURN, or is it significantly higher?

tia,

Jan
Don't rust yours pelled jacker to fine doll missed aches.
17 REPLIES
Hein van den Heuvel
Honored Contributor

Re: What is the impact of accessing an extra file


I think you are right. 10 - 100 file opens per 'record', even from cache, would take excessive time. Probably it is just CPU time, but maybe there is also a deaccess file-header update on close? I don't know anything about Basis+, but like you said, maybe they can 'gosub' within a script?
Maybe you can write a (Perl) script to transform the multiple nested scripts into a single inline one? But really what you need is a script that loops over the records, staying active. Does Basis have a concept of 'stored procedures', perhaps?

Here is a silly thought... how 'deep down' are those included .PRC files? RMS will have a directory cache with names and (DID) numbers to drill down quickly, and the XQP will in all likelihood have a filename-to-FID cache for the final lookup step (I never really studied how that works), but it would seem to me that the deeper the files sit (nested directories, search lists, size of the leaf directory), the longer the lookup takes.
For the purpose of a long-running convert, I might be tempted to experiment with moving those 10 - 100 frequently opened and closed files into an explicit $x$nnnn:[000000] (on a RAM disk?) location with no logical names or search list involved. At least it would be an interesting experiment! Let us know!
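
For example, something along these lines (the device, directory, and source names below are just made-up placeholders, untested):

$! Copy the frequently activated .PRC files to a flat, top-level directory
$ CREATE/DIRECTORY DKA100:[PRCHOT]
$ COPY BASIS_PROCS:*.PRC DKA100:[PRCHOT]
$! ...then have the export reference DKA100:[PRCHOT]name.PRC directly,
$! with no logical name or search list in the path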

Greetings,
Hein.
Hein van den Heuvel
Honored Contributor

Re: What is the impact of accessing an extra file


I just did a quick (& incomplete) test opening, closing, and reading a file down in a 6th-level subdirectory versus the same file in the root directory, and found no significant difference. With DCL it is less than 10%; with Perl, no measurable difference.
With DCL I see kernel/exec/super/user = 57/23/19/0.
With Perl that is 62/20/0/18 for the master file directory and 64/20/0/16 for the deep one, suggesting the master location is a little easier on the system: more user-mode work gets done.

Silly scripts below.
Hein.

--- perl ---

$f = "[.tmp.tmp.tmp.tmp.tmp.tmp]tmp-sub.com";
$f = "_dra0:[000000]a.com";
($sec,$min,$hour) = localtime(time);
$t -= $sec + 60 * ( $minute + 60 * $hour );
while ($i++ < 10000) {
open (X,"<$f") || die "($i) Failed to open $f";
while (){};
close (X);
}
($sec,$min,$hour) = localtime(time);
$t += $sec + 60 * ( $minute + 60 * $hour );
print "Elapsed time = $t seconds.\n"


---- dcl ----
$
$! Save the starting counters and time
$!
$ old_cputim = F$GETJPI ("","CPUTIM")
$ old_dirio = F$GETJPI ("","DIRIO")
$ old_bufio = F$GETJPI ("","BUFIO")
$ old_time = F$TIME ()
$!
$! Run the test
$!
$i=10000
$loop:
$! Pick one of the two targets; the other stays commented out
$!@dra0:[000000]a.com
$@[.tmp.tmp.tmp.tmp.tmp.tmp]tmp-sub.com
$ i=i-1
$ if i .gt. 0 then goto loop
$!
$! Save the end time
$!
$ end_cputim = F$GETJPI ("","CPUTIM")
$ end_dirio = F$GETJPI ("","DIRIO")
$ end_bufio = F$GETJPI ("","BUFIO")
$ END_TIME = F$TIME()
$
$ old_elapsed = ((F$EXT(12,2,old_time) * 60 + F$EXT(15,2,old_time)) * 60 -
+ F$EXT(18,2,old_time)) * 100 + F$EXT(21,2,old_time)
$ end_elapsed = ((F$EXT(12,2,end_time) * 60 + F$EXT(15,2,end_time)) * 60 -
+ F$EXT(18,2,end_time)) * 100 + F$EXT(21,2,end_time)
$ elapsed = end_elapsed - old_elapsed
$ if elapsed .lt. 0 then elapsed = elapsed + 24*60*60*100
$ WRITE sys$output -
f$fao ( "Dirio=!6UL Bufio=!6UL Cpu=!6UL Elapsed=!6UL ", -
end_dirio - old_dirio, end_bufio - old_bufio, -
end_cputim - old_cputim, elapsed )
$exit
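
If you also want the GOSUB side of Jan's question, the timed loop could be swapped for something like this (an untested sketch, with the same start/end counters around it as above):

$! GOSUB variant: the 'subroutine' stays inside this one procedure
$ i = 10000
$gosub_loop:
$ gosub do_nothing
$ i = i - 1
$ if i .gt. 0 then goto gosub_loop
$ goto done
$do_nothing:
$ return
$done: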


Uwe Zessin
Honored Contributor

Re: What is the impact of accessing an extra file

Hein,
you did run with XFC enabled, didn't you?
Did you flush any caches between tests?
.
Hein van den Heuvel
Honored Contributor

Re: What is the impact of accessing an extra file

Uwe, like I indicated, this was not a serious test. Serious benchmarks are work!

It was actually an old AlphaServer 1000 4/266 running 7.1, with VIOC set to the default 3 MB. No cache flushes, all files small and surely in memory. Uncontrolled, but minimal, other activity on the box.

You gave me an opportunity to add a line I forgot... the test itself did nothing (the DCL sub just did an EXIT, Perl just read that exit line) and it showed only minimal (CPU time) differences. If the core code did actual work, the file-activity change would look even smaller, relatively speaking.

I was just curious to know whether it would show a difference at all! If Jan is truly interested in pursuing this path, it gives him an initial test with which to set a baseline/expectation. But it sounds like he does not need a little tweak/speedup; he needs a different approach, and is struggling with limited Basis+ skills.

Cheers,
Hein.
Jan van den Ende
Honored Contributor

Re: What is the impact of accessing an extra file

Hein,

yes, my Basis+ skills are not really in daily practice. :-(
Your measurements do suggest, though, that the overhead of the many activations is not that big, and would not really justify the extra work of rewriting the procedure complex.

At our VMS SIG meeting this evening I did have some discussion about it with Willem Grooters, and we arrived (without the measurements, but with more background info) at about the same conclusion.

Some more background:
the database is about 4 GB in about 20 files (1 per table, except the biggest table, which is split over 2 files because Basis+ hard-codes a 2 GB file-size limit), and about 4 GB of index files, also 1 file per table.
System: Alpha 4100; 512 MB; 2 SCSI Ultrawide buses with 1500 rpm disks;
data files on one disk, index files on another, program files on a third, and output files on a fourth.
The machine is temporarily allocated to this project, the only system load being this job plus some sysmgt people looking at it.
CPU load < 10%.
I/O load 50-60 (and we have seen this machine at 500 sustained, on completely unrelated work).
Average disk I/O queue length on the busiest disk < 0.1.
Willem and I concluded that the system is mainly waiting because of I/O latency.
If we do the full export (not scheduled for the pilot, but it will have to be done eventually) there are 5 (or 6? I can't check now from home) separate jobs. They could probably run in parallel with hardly any interference.
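
Kicking them off would be no more than a handful of SUBMITs, roughly like this (queue and procedure names invented):

$! One batch job per independent export stream
$ SUBMIT /QUEUE=SYS$BATCH /LOG_FILE=EXPORT_1.LOG EXPORT_STREAM_1.COM
$ SUBMIT /QUEUE=SYS$BATCH /LOG_FILE=EXPORT_2.LOG EXPORT_STREAM_2.COM
$ SUBMIT /QUEUE=SYS$BATCH /LOG_FILE=EXPORT_3.LOG EXPORT_STREAM_3.COM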

We even (conceptually) have a 'solution':
get a system with enough memory to hold the database + index (maybe as a RAM disk?) + the DBMS working sets in memory.
I do however have an idea about the budget holder's reaction...
On the other hand, on a loan basis, for a prestige project...?

I'll keep you posted, but results may take some time.

Jan
Don't rust yours pelled jacker to fine doll missed aches.
Hein van den Heuvel
Honored Contributor

Re: What is the impact of accessing an extra file


The CPU/IO usage observations are valuable inputs for making progress. Like you said, you may be able to brute-force the issue by simply having multiple parallel streams. Always a good idea! Your first order of business should be to check whether the DB (Basis) has any form of async IO settings/slaves.

50 - 60 IOs/sec suggests synchronous IO, non-write-back-cached writes, or mostly random reads. You need to figure this out. What is the read/write mix?
Those disks, are they direct connect or behind a semi-intelligent controller?

If the problem is reads, then you need to figure out whether you can make a single read do more work (larger IO buffers), whether you can bring more order into the reads (pre-sort?), or whether you can have some sort of read-ahead (some controllers will do this).
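
If the reads go through plain RMS sequential access (an assumption; Basis+ may well do its own QIOs), larger multiblock/multibuffer defaults are a cheap thing to try before the run:

$! Larger multiblock and multibuffer counts for sequential access, this process only
$ SET RMS_DEFAULT /SEQUENTIAL /BLOCK_COUNT=127 /BUFFER_COUNT=4
$ SHOW RMS_DEFAULT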

If the problem is writes, then a good, relatively cheap solution might be to get an 'SWXCR/KZPSC' or some such, put the disks behind it, and enable write-back caching!

Heck, for the purpose of a (restartable) conversion it may even be acceptable to enable the SCSI disks' internal write-back caching, albeit unprotected! Just flip the bits on the 'control pages' with a tool like RZDISK or SCU!

If it is random reads that are killing you, and you can recognize smaller datasets, then indeed with a large (4 GB or more) machine you may be able to suck them into a memory disk (in minutes), do the convert, stop the DB, move the next chunk into memory, restart the DB, and convert the next chunk. Like you say, get someone to make an 8 GB ES40 or ES45 available for the good cause for a while, huh?!

Good luck!
Hein.
Martin P.J. Zinser
Honored Contributor

Re: What is the impact of accessing an extra file

Hello Jan,

since I/O seems to be the bottleneck, one other avenue you might think about is throwing more disks at the problem. Writing does not seem to be the problem, so I would have a look at the data files. Also, since these seem to be RMS indexed files, have they been maintained before you started this project? As Hein will be able to tell you, this can make a huuuuuge difference ;-)
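
If they are, the usual checkup-and-rebuild sequence would be something like this (file names invented):

$! See how far the file has degraded (bucket splits, fragmentation)
$ ANALYZE/RMS_FILE/FDL/OUTPUT=BIGTABLE.FDL BIGTABLE.DAT
$! Optimize the FDL from the analysis, then rebuild the file with it
$ EDIT/FDL/ANALYSIS=BIGTABLE.FDL/NOINTERACTIVE BIGTABLE.FDL
$ CONVERT/FDL=BIGTABLE.FDL/STATISTICS BIGTABLE.DAT BIGTABLE_NEW.DAT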

Greetings, Martin
Willem Grooters
Honored Contributor
Solution

Re: What is the impact of accessing an extra file

To explain the whole thing a bit from the programmer's perspective ;-)

What I understood from Jan is that the program reads a record, analyzes it, and then decides what to read next. That record, in turn, is processed the same way. This way the whole 'tree' is traversed - branch by branch, leaf by leaf. Pretty much recursive, formally speaking. And if this is done sequentially, count the seconds...

I have already suggested splitting up the extraction process into several parallel streams (as suggested in this thread already), or using a completely different way of processing (distributed); both would require programming. (OT: Jan, you know where to find me.)

Something just came to mind because of this: check your (virtual) memory. I don't think this is very problematic, but given the way the program works, there could well be a problem in that area as well.

Willem
Willem Grooters
OpenVMS Developer & System Manager
Jan van den Ende
Honored Contributor

Re: What is the impact of accessing an extra file

Hein, Uwe:

Willem's description of the control flow is essentially correct:
read a record, evaluate field(s) for entries in other table(s), read those. Like Willem wrote: travel every branch and every leaf, and then move to the next 'tree'.
Travelling one tree takes 2 - 3 minutes; about 100 000 trees in the pilot 'forest', > 4 000 000 in the whole 'country'.

Martin:
NO RMS-indexed files.
Basis+ essentially targets various Unix flavors, as well as VMS.
All files are sequential, fixed-length 2048.
ANY structure is defined & handled by the db management software.

btw, an omission in my previous stats list:
SHO MEM/CACH:
63 % read hit
1 % write hits
read count > 200 * write count

database journalling disabled

Hein:
disks directly connected.
the way the database was designed (way back when), all database files are located in one directory (one logical name), same for the index files.
-- come to think of it, maybe with a search list they can be spread over multiple disks?
Anyway, by the nature of the algorithm I don't think there is much gain to be expected.
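
(The search list itself would be trivial to set up, for example like this with invented disk and directory names; the files would still have to be distributed over those directories by hand:)

$! One logical name resolving to several directories, searched in order
$ DEFINE BASIS_DATA DISK$DATA1:[BASIS.DATA], DISK$DATA2:[BASIS.DATA]
$ SHOW LOGICAL BASIS_DATA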

btw, I did not yet explain the (really sad) reason for all this:

IF things are successful, this will end with another major VMS application being replaced by a *NIX db / Billyware RSI-generating frontend application.
And, judging by the expense (money and effort) already invested, the 'political prestige' involved, and previous experiences, whatever becomes available will probably be DEFINED to be successful. In the past it has already occurred that (to abuse an Asterix formulation): the opinions were divided. Management itself called it a success; the rest of the community did not think so.
Anyway, we still try to be as helpful as possible, or in the end WE might get blamed for not being able to deliver the necessary starting info, and so causing delay/failure/inconsistent data/any other blame you can think of.

Do I sound sad?
Maybe all this is giving me a sad mood, but hey, it's not the end of the world!

I'll keep you informed.
Don't rust yours pelled jacker to fine doll missed aches.