Operating System - OpenVMS

Re: About performance with RMS relative files...

 
SOLVED
Hoff
Honored Contributor

Re: About performance with RMS relative files...

Will it hurt or help? I don't know.

That determination would require some tests.

My guess: "not much".

Sitting here, I don't know how your box is tuned and I don't know how your disks are initialized (in some cases, having clusters aligned on 16x block boundaries can have a performance benefit within the processing of some of the storage controllers; there's a knee in the block alignment), and I don't know what sort of application I/O pattern is in use here. I run these tests regularly during I/O and performance investigations, looking for the "knee" in the throughput curve. In this range of blocks, I'd not expect a big effect either way. Try it. Check the data. Tweak the input. Try it. Ugly. Slow. Expensive.

Finding the knee is easiest with the I/O monitoring tools loaded. Otherwise you end up instrumenting the code itself -- which certainly isn't a bad thing -- to collect performance data. But that adds schedule time and adds cost.

Unless you are right on a performance boundary (one of those knees I mentioned earlier) or unless you're tossing massive amounts of data or unless your I/O load has changed substantially since the last tuning pass, small and incremental adjustments in RMS usually don't have a big payback once the initial system and file tuning passes are done. Application changes and hardware changes tend to have a bigger payback, and on-going tuning (save in these exceptional cases) gets you marginal results. (And an AlphaServer GS80 says "you've been here a while"...)

It is often difficult to work around administrative and managerial matters and roadblocks using technical means; technology isn't good at that class of problems. Put another way, you need either a prototype platform or a testing window on the production gear, and you need to have the tools available or you need to spend the time (and the budget) replicating what the tools can provide.

If you know the number of entries you're going to get each day, then going completely to virtual I/O with one (or two) blocks per entry (or whatever) would be the fastest I/O; you'll have most of RMS out of the way. It may or may not be the most efficient path in terms of caching and related, however.

I'm still not entirely certain what you're optimizing for here, too. I see both EVA load, and application speed. Small random I/Os are about the worst. Sequential I/Os are somewhat better when there's no contention, and caching or in-memory storage is the best. Assuming large quantities of data and using P2 space, for instance, and running it all out of application memory tends to reduce the I/O activity. This is where I toss the data journal remotely.

If there were a single "go fast" button for application I/O, it'd very probably already have been pushed for you...

Bob CI
Advisor

Re: About performance with RMS relative files...

>>>> I'm still not entirely certain what you're optimizing for here, too. I see both EVA load, and application speed. <<<<<<

Really, we want to adjust to our system manager's recommendations about smaller I/Os on the EVA, but without hurting the write rates of our hot relative files....

We want to decrease the file bucket size, but the bosses want to see how the numbers in the monitoring tools confirm that there isn't a risk of performance loss with this change.

What parameters must we check in MONITOR (RMS or whatever) to confirm this, apart from our own report programs????

For the bosses it is very important to confirm this, not only with our reports, but with the monitoring tools too.

Thanks!
Hein van den Heuvel
Honored Contributor

Re: About performance with RMS relative files...

Let me summarize this.
- folks do not know how to use existing tools
- folks refuse to add other tools
- tools must be used.
- real information, the measurements from the application, may not be used.

Something is gonna have to give. Probably you.
Unfair, but that's how the cookie tends to crumble.

Lower is probably better.
You will have to try it in a test.

Need a counter argument?
RMS will pre-read before the write.
So with a 12 block bucket and a 500 byte record size, you will do 1 read IO for each 12 writes. Now if you go all the way down to a 1 block bucket, you'll need 12 reads and 12 writes for the same amount of change.
While the XFC will mitigate this, and can possibly be tricked to completely hide the read IOs, it will still cost.
EVA Write IOs are fast... say 1ms.
EVA reads CAN be fast from cache, but may well be slower than writes.
For sake of the argument, let's say a read IO is as fast as a write (it is not).
The benefit of shorter writes is NOT proportional to the write size. Let's just say that a 12 block write might just be 2 times slower than a 1 block write, and a 4 block write 1.5 times slower. And for fun let's call a 12 block write 1ms.
Now do the math in an excel spreadsheet:
12: 1R + 12W = 13ms
4: 3(R/1.5) + 12(W/1.5) = 10ms
1: 12(R/2) + 12(W/2) = 12ms
... maybe.
Toss in a few slow reads and the 12 block size wins.
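
Purely to make that arithmetic easy to replay, here is the same back-of-envelope model in a few lines of Python. The 1ms write cost, the 1.5x/2x speed-up factors, and "reads as fast as writes" are the assumptions above, not measurements.

# Back-of-envelope model of the numbers above: 12 records (~500 bytes each,
# roughly one per block) inserted with bucket sizes of 12, 4, and 1 blocks.
# Assumptions: a 12-block write costs 1 ms, a 4-block IO is 1.5x faster,
# a 1-block IO is 2x faster, and a read costs the same as a write.

RECORDS = 12          # records written per cycle, about one per block
WRITE_12_MS = 1.0     # assumed cost of a 12-block write, in milliseconds

def cycle_cost_ms(bucket_blocks, speedup):
    """Total IO time to insert the 12 records with this bucket size."""
    io_ms = WRITE_12_MS / speedup        # cost of one read or write IO
    buckets_touched = RECORDS // bucket_blocks
    reads = buckets_touched              # RMS pre-reads each bucket once
    writes = RECORDS                     # one bucket write per $PUT
    return (reads + writes) * io_ms

for blocks, speedup in ((12, 1.0), (4, 1.5), (1, 2.0)):
    print(f"bucket size {blocks:2d}: {cycle_cost_ms(blocks, speedup):.1f} ms")
# bucket size 12: 13.0 ms, 4: 10.0 ms, 1: 12.0 ms ... maybe.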

You may just need a good performance consultant to break through all this.
I happen to know a few suitable ones, if you're interested (your system manager may disagree :-).

Cheers,
Hein van den Heuvel
HvdH Performance Consulting.

Hoff
Honored Contributor

Re: About performance with RMS relative files...

Hein missed one: quit. Seriously. If the place is as screwed up as it appears from your postings here, it's time to tune up your resume and look for different (and better) employment. That, or check your cares at the door and take pleasure in simply watching the organizational dysfunction and self-destructive behaviors. You're splitting the distance between these two career options here (I can tell from the way you're assigning the points) and that's usually on the impending cardiac path.

Bob CI
Advisor

Re: About performance with RMS relative files...

Hein, thanks for your explanations!!!!

It's the kind of thing that nobody teaches us!!! And I don't know of any documentation that explains it.

But I have a doubt about read and write cycles that you comment.

Must RMS read the bucket even though the record being inserted is new???

In my relative file there are never update operations, only insert operations, from sequence 1 up to sequence 2000000 (for instance), and in this ascending order.

I'm sorry if I don't know enough.....

Thanks, thanks...and thanks....
Hoff
Honored Contributor
Solution

Re: About performance with RMS relative files...

>> It's the kind of thing that nobody teaches us!!! And I don't know of any documentation that explains it.

Application and system performance and tuning is still an experimental regimen. The books (the good ones) will teach you how to run tests, and the books tell you to run the tests. The performance management manual in the OpenVMS documentation set tries to explain the general principles here, for instance. (As the OS crowd finds generic go-fast fixes and tweaks, they tend to implement those, too.)

Performance varies from one device to the next. From one hardware generation to the next. From one firmware revision to the next. From one application to the next. Sometimes massively. And you find surprises, such as the benefits of cluster alignments on some of the controllers.

Sharing and clustering can both slow the I/O rates, too.

>> But I have a doubt about read and write cycles that you comment.

Go try it. It's distinctly possible that Hein might be wrong here, but I'd not bet against him.

>> Must RMS read the bucket even though the record being inserted is new???

RMS has to figure out where to put the new data, and particularly whether it needs to extend the file. And (if you're sharing the file) coordinate access.

If you roll your own record management using direct virtual I/O, the XQP will still have to perform a few reads on your behalf to figure out where to put stuff. These reads usually happen out of cache. And you'll have to track file positions.

>> In my relative file there are never update operations, only insert operations, from sequence 1 up to sequence 2000000 (for instance), and in this ascending order.

Reduce the number of I/Os by going larger (caching and coalescing), or try tweaking and testing (requires embedded diagnostics and/or monitoring tools). Or go to faster hardware.

Two million records (and two million blocks) isn't all that much. Run the spiral transfer rate of your disks (or use the sustained rate on the particular EVA in this case) as limited by the FC SAN speed, and that'll tell you what the upper boundary here is.
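
As a sketch of that back-of-envelope bound: the 30 MB/s sustained rate below is only an assumed placeholder; substitute whatever your EVA and FC SAN actually deliver.

# Rough upper bound: time to move two million 512-byte blocks at a given
# sustained transfer rate. 30 MB/s is a placeholder, not a measurement.
BLOCKS = 2_000_000
BLOCK_BYTES = 512
SUSTAINED_MB_PER_S = 30.0   # assumed; plug in the measured EVA / SAN rate

total_mb = BLOCKS * BLOCK_BYTES / (1024 * 1024)
seconds = total_mb / SUSTAINED_MB_PER_S
print(f"{total_mb:.0f} MB at {SUSTAINED_MB_PER_S} MB/s is about {seconds:.0f} s, best case")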

Sharding is another option for I/O-limited cases; having even record I/O go to one spindle and odd to another, for instance.
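
A minimal sketch of that sharding idea follows; the device and file names are hypothetical, and in practice the two halves would sit on volumes served by different spindles or controllers.

# Toy illustration of even/odd record sharding: route each record number
# to one of two files. Device and file names are made up for the example.
def shard_path(record_number):
    if record_number % 2 == 0:
        return "DKA100:[DATA]EVEN.DAT"
    return "DKB200:[DATA]ODD.DAT"

streams = {}
for recno in range(1, 13):                 # records 1..12
    streams.setdefault(shard_path(recno), []).append(recno)

for path, recnos in sorted(streams.items()):
    print(path, "gets records", recnos)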

The basics of any performance effort involve first establishing a baseline and then running a series of tests. Tweak one thing within the application, run a test, and compare the results against the baseline. This is a very raw example of the empirical method. (With this sequence, the occasional read I/O while writing would be visible, for instance.)

For general application tuning (and there can be all manner of performance-limiting weirdnesses within the average application), something like the DECset PCA Performance and Coverage Analyzer is the right sort of tool. The system-wide tools are good, but not as good (or as easy) at spotting hot routines. And I've seen cases where the tool itself tosses reads -- such as getting the next index entry, for instance.

Hein van den Heuvel
Honored Contributor

Re: About performance with RMS relative files...

>> Hein, thanks for your explanations!!!!
It's the kind of thing that nobody teaches us!!!

There are/were OpenVMS RMS training courses.
Bruden is possibly the best offering today. But I'm sure many others, like Parsec and independent consultants, will do a fine job.
I'm sure even Patrick O'Malley is willing to dust off his offerings.
Far too few folks take or have taken those courses.

>> And I don't know of any documentation that explains it.

The Guide to OpenVMS File Applications makes an attempt.

>> But I have a doubt about read and write cycles that you comment.

Me too! :-).
It's just a guess to start working.
Write times may well be faster (write-back caching).
Read times... from the spindles... may well be slower.

>> Must RMS read the bucket even though the record being inserted is new???

YES. For pre-allocated relative files.
That's why my very first suggestion was

"You may want to review whether the application actual uses any of the relative file semantics"

But don't ask me... check for yourself.
$SET FILE/STAT
$MONI RMS/ITEM=CACH/FILE=...
(or better still, use my RMS_STATS when the system manager is not looking, or ANAL/SYSTEM... SET PROC ... SHOW PROC/RMS=FSB).

When watching rms stats, be sure to check the locks.
For the larger bucket sizes you'll see few ENQ/DEQ (1 per 12) and 2 bucket lock converts per record put. With a 1 block bucket size you'll see 12 times more ENQ+DEQ.

RMS Stats will NOT provide timing data, although the rates may give you a clue.
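
To make the lock arithmetic concrete, here is a small sketch using the counts quoted above -- one bucket-lock ENQ/DEQ pair per bucket touched plus two bucket lock converts per put -- and assuming roughly one 500-byte record per block with sequential inserts.

# Lock traffic for 12 sequential record inserts, per the counts above:
# one ENQ/DEQ pair per bucket touched, two bucket lock converts per $PUT.
RECORDS = 12

def lock_ops(bucket_blocks):
    buckets_touched = RECORDS // bucket_blocks
    enq_deq_pairs = buckets_touched      # one ENQ/DEQ pair per bucket
    converts = 2 * RECORDS               # two lock converts per record put
    return enq_deq_pairs, converts

for blocks in (12, 4, 1):
    pairs, converts = lock_ops(blocks)
    print(f"bucket size {blocks:2d}: {pairs:2d} ENQ/DEQ pairs, {converts} converts")
# 12 -> 1 pair, 4 -> 3 pairs, 1 -> 12 pairs: twelve times more ENQ+DEQ.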

>> In my relative file, there is never update operations, always are insert operations,

You may think that. And that is what it looks like. But in reality it is the other way around. There are no puts to relative files. Only updates. ($PUT+UIF)

So why on earth would the application use a relative file with its overhead?
1) possibly someone liked choosing the 'go slow' button...
2) probably to make programming seemingly easier.

>> from sequence 1 up to sequence 2000000 (for instance), and in this ascending order.

Relative files are initialized on creation.
Relative file records are placed in buckets.
They do not live in isolation.
So the system must calculate a target bucket, read that bucket, merge in the record, and write the bucket back, for any bucket that already exists.
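
A rough sketch of that bucket targeting is below. The one byte of per-record-cell overhead is an assumption for illustration only; the Guide to OpenVMS File Applications has the exact cell layout.

# Toy model of mapping a relative-file record number to its target bucket.
# Each record occupies a fixed-size cell; the +1 byte of overhead per cell
# is assumed here purely for illustration.
BLOCK_BYTES = 512

def target_bucket(record_number, record_size=500, bucket_blocks=12):
    cell_bytes = record_size + 1                      # assumed cell overhead
    cells_per_bucket = (bucket_blocks * BLOCK_BYTES) // cell_bytes
    return (record_number - 1) // cells_per_bucket    # 0-based bucket index

# Records 1..12 land in bucket 0; record 2,000,000 is far into the file.
print(target_bucket(1), target_bucket(12), target_bucket(2_000_000))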

>> I'm sorry if I don't know enough.....

So spend some time (and money?) and fix that!

Thanks, thanks...and thanks....

Cheers,
Hein.
Robert Gezelter
Honored Contributor

Re: About performance with RMS relative files...

Bob,

I was a bit busy for the past few days, but noticed that one aspect of this conversation appears to have been skipped: is it possible to switch the output-only file to a straight sequential file, with the conversion for later processing taking place offline?

Conversion of a sequential file to a relative file on a background basis may be a better option. The combination of multi-buffering, multi-blocking, and deferred write (with a close/re-open) can deal with high write rates.

I would also consider including Hein's comment about closing and re-opening the file, although I would go further and recommend saving the File ID and doing the re-open by File ID; it saves the directory lookup.

I would also suggest checking where on the EVA the "volume" is provisioned. In a traditional disk, I would be questioning whether arm movement is an issue. On an EVA, the question is more complex.

And with the obvious disclaimer that our firm offers services in this area, I echo the thought that calling in outside assistance may be a good way to eliminate a degree of uncertainty about how the various aspects interact.

- Bob Gezelter, http://www.rlgsc.com
Wim Van den Wyngaert
Honored Contributor

Re: About performance with RMS relative files...

I created a batch job that runs at high priority on a 4100 with 7.3 and dual HSG80s (so the IO completes when the controllers have received it).

It creates a relative file with 10,000 blocks and then writes 9999 records to it in DCL.

I started it with all kinds of settings for set rms/block=1|64/buf=1|16/rel
set rms/noglo
set file/cache=no
bucket size=1

and found no big difference (<2%) with any of the settings.

Opening the file /shared=read instead of /shared=write cuts CPU consumption by 10%, but wall time is 2% higher. No global buffers also makes it run 2% longer.

So, Bob's idea seems a good one.

Fwiw

Wim
Wim Van den Wyngaert
Honored Contributor

Re: About performance with RMS relative files...

Hmmm, did the same operation on a flat file.
Took almost the same time (CPU -15% and wall -2%) but 50% of the IOs were gone.

Wim