
Optimum cluster size for EVA ?

Guinaudeau
Frequent Advisor

Optimum cluster size for EVA ?

hi,

we will soon upgrade one VMS customer to an EVA (4100) with an ES45. that customer runs V8.3.

our application has a large proprietary database, where

-> files are several GB in size and pre-allocated, never extended

-> files are always open; IOs are issued with $QIOW and may or may not be aligned; some IOs are 512 bytes, larger IOs are up to 64K (the old limit); files will be aligned, IOs not always.

the IO load at this customer is heavy, so we are looking for optimal performance.

i read about cluster size and we would like to optimize the cluster size for the new drives located on the EVA.

=> the advice was modulo 4 ? is it still correct ? or is it better with modulo 16 ? i remember the presentation at bootcamp by steve hoffmann and his remark about mod 16.

=> i assume DISKPERF would be helpful if we can approximate a real load

=> i assume, since the app runs today on MSA, that we can use the MSA fibre-channel extension to get statistics, at least about IO size, possibly about alignment ?

any ideas on how to analyse this ?

i assume the issues are :

-> HW chain between drives / controllers / ...

-> XFC caching size

and what else ?

thanks in advance for any remark / comment / etc ...

louis
22 REPLIES
Jan van den Ende
Honored Contributor

Re: Optimum cluster size for EVA ?

Louis,

I think I was at that same session by Hoff :-)
Not only his, but also various performance sessions, and also answers here in ITRC by specialists such as Hein, Guy Peleg, Bruce Ellis, and several more, point in the same direction.
All stress 8 Kbyte, i.e. 16 blocks (or a multiple), as the transfer unit for SAN.
Do not forget to also make that the value for the various RMS block settings. The limit there is 127, so the biggest multiple of 16 is 112. And of course, the disk cluster size should also be a multiple of the transfer unit.
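
The arithmetic behind the 8 Kbyte / 16 block / 112 figures can be checked with a small sketch. This is only an illustration in Python; the helper name `largest_multiple_at_most` is made up for this example and is not part of any VMS tool.

```python
BLOCK_BYTES = 512        # size of one disk block
TRANSFER_BLOCKS = 16     # transfer unit recommended in this thread
RMS_LIMIT_BLOCKS = 127   # RMS limit quoted above


def largest_multiple_at_most(unit: int, limit: int) -> int:
    """Return the largest multiple of `unit` that does not exceed `limit`."""
    return (limit // unit) * unit


transfer_bytes = TRANSFER_BLOCKS * BLOCK_BYTES
best_rms_blocks = largest_multiple_at_most(TRANSFER_BLOCKS, RMS_LIMIT_BLOCKS)

print(transfer_bytes)    # 8192
print(best_rms_blocks)   # 112
```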

You did not specify WHICH database, but those usually also allow/require such settings. Make sure those harmonise with OS settings.

hth

Proost.

Have one on me.

jpe

Don't rust yours pelled jacker to fine doll missed aches.
Hein van den Heuvel
Honored Contributor

Re: Optimum cluster size for EVA ?

Summary... I don't think it is going to matter much at all, given all conditions you wrote (Thanks for a refreshingly well documented note!)

The modulo-4 was due to an EVA RAID-5 issue.
It could affect write speed.

Obviously, since the customer cares about performance RAID-5 will not be used right?

http://www.baarf.com/ - "Battle Against Any Raid Five" http://www.miracleas.com/BAARF/RAID5_versus_RAID10.txt

Does the application avoid the XFC cache? (QIO no-cache attribute ?)

If it uses XFC then SHOW MEM/CACHE/FULL will give a nice IO histogram.

If the system has been up for a while, then you may want to SET CACHE/RESET to clear those counters (not the data) and look again after an hour or a day.

If it uses XFC then you should opt for modulo 16 as it deals with 16 block cache lines.

Now XFC deals with VBNs, not LBNs, so the cluster size only plays a role if the application aligns or blocks its IOs based on the cluster size. RMS Indexed files are sensitive to this. Your application might not be. Still... it wouldn't hurt. You might as well stack the deck.
Clustersize defines first-block-in-file LBN alignment. No more, no less.
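
A rough sketch of what "first-block-in-file LBN alignment" means in practice. It assumes a contiguous file, VBNs starting at 1, and clusters that begin at LBN multiples of the cluster size (a simplification). All function names are hypothetical and only serve the illustration.

```python
def file_start_lbn(cluster_size: int, cluster_index: int) -> int:
    """First LBN of a file that begins at the given cluster (simplified model)."""
    return cluster_size * cluster_index


def lbn_of(vbn: int, start_lbn: int) -> int:
    """LBN of a virtual block in a contiguous file; VBNs start at 1."""
    return start_lbn + (vbn - 1)


def is_aligned(lbn: int, unit: int = 16) -> bool:
    """True when the LBN falls on a `unit`-block boundary."""
    return lbn % unit == 0


# Cluster size 1024 (a multiple of 16): VBN 1, 17, 33, ... are LBN-aligned.
start = file_start_lbn(1024, 7)
print(is_aligned(lbn_of(1, start)), is_aligned(lbn_of(17, start)))  # True True

# Cluster size 3: the file start itself is usually not on a 16-block boundary.
start = file_start_lbn(3, 7)
print(is_aligned(lbn_of(1, start)))  # False
```

With a cluster size that is a multiple of 16, an application that aligns its IOs on VBN 1, 17, 33, ... is automatically aligned at LBN level too. An application that does not align by VBN gains nothing from the cluster size.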

Since you mention large (and therefore few) pre-allocated files I would encourage a large cluster size to make it easier for all components.
A nice 'round' number like 1024 or 2048 perhaps?

maybe more later,

Hope this helps some,
Hein van den Heuvel (at gmail dot com)
HvdH Performance Consulting

Guinaudeau
Frequent Advisor

Re: Optimum cluster size for EVA ?

hi jan,

the DB does not use RMS, it is purely QIO.

It is an internal product of our company, very specialized for our SCADA system needs, with fast access (read or write) to different cyclic data for example. several blocks are written every 10 seconds, and the DB should also be read fast for display of timeline data over several minutes / hours / ...

I recollect from your reply and other threads (at least using MSA, which has also been used by some of our customers) the 127 (or 112) size for the cluster.

louis
Jan van den Ende
Honored Contributor

Re: Optimum cluster size for EVA ?

Louis,

SCADA?
May or may not be like what I have been dealing with (15 years ago, so things change, and memory fades :-( )
but THAT SCADA was operating a mainly in-memory DB, with on-disk data only as backing store.
_REAL_ fast, also in VAX days, but the active part of the DB HAD to fit in available memory.
_IF_ that applies here also, the disk-access speed and volume is still interesting, but being able to tailor the important parts for permanent residence is the REAL issue.
Nowadays memory is rather cheap, and VMS V8.x has extensive means to exploit that.

But... does that apply for your environment?

fwiw

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Guinaudeau
Frequent Advisor

Re: Optimum cluster size for EVA ?

clarification : the DB lies on RAID-0 and RAID-1 volumes (additionally, these are shadow set members, but that goes without saying with VMS ;-)
Guinaudeau
Frequent Advisor

Re: Optimum cluster size for EVA ?

SCADA :

you are right, in-memory is important for performance. the DB had some in-mem caching, but currently, the best caching (since it is larger than original caching SW allowed) is the XFC to read blocks (write caching is admittedly neglected, sorry for that, but our SW would need much work to ensure this)

=> we already obtain an 80-90% hit rate for reads in the large files i quoted, and that is great for performance, we use 1GB and more of main memory

louis

**************

note : my replies in this system are very slow, several minutes each time => is something slow on my side, or is it the forum machine itself => curious to read your reactions
Hein van den Heuvel
Honored Contributor

Re: Optimum cluster size for EVA ?

May we assume you already googled around and found documents like: http://h71028.www7.hp.com/ERC/downloads/5982-9140EN.pdf
"Virtual Array 3000 and 5000 configuration best practices"

>> but currently, the best caching (since it is larger than original caching SW allowed) is the XFC to read blocks

So maybe share a representative SHOW MEM/CACH/FULL output in an attachment at some point? Preferably not over days, but after being reset just before a typical production window, and displayed afterwards?

SET CACHE/RESET may sound scary, but don't worry, it just resets the top level counters, not the file details, and certainly does not affect the cache contents.

>> => we already obtain an 80-90% hit rate for reads in the large files i quoted, and that is great for performance, we use 1GB and more of main memory

That's typical, maybe even on the low side.

I have a (perl) script to turn the SHOW MEMO/CACH=(TOPQIO=x,VOLUME=y) output into a sorted CSV hot-file list for Excel. Nice way to see trends/problems... imho.

>> note : my replies in this system are very slow, several minutes each time =>

Are you referring to the ITRC?
The replies often SEEM slow, but it is really the confirmation that's slow. Be sure to use a second window and 'see' whether the reply is there already. Or just ^A, ^C to be sure to save all the reply text, and then hit the topic hotlink on top (here the underlined "Optimum cluster size for EVA ?") to check that the reply made it in.

hth,
Hein.
Guinaudeau
Frequent Advisor

Re: Optimum cluster size for EVA ?

hein,

1) do you prefer / think better a

$ SHOW MEM/CACH/FULL

or a

$ SHOW MEM/CACH=(TOPQIO=x,VOLUME=y)

i ran the first command, then did a reset, and will redo it in one hour.

2) your perl script will be of interest, for sure. do you attach in the thread ?

yours

louis

***************

thanks for the tip about the slow ITRC. that is what i meant, slow ITRC reaction.

Hein van den Heuvel
Honored Contributor

Re: Optimum cluster size for EVA ?

>> 1) do you prefer / think better a
$ SHOW MEM/CACH/FULL

That will be good enough to satisfy the curiosity of readers like myself, and perhaps generate some good suggestions.

>> $ SHOW MEM/CACH=(TOPQIO=x,VOLUME=y)

That generates a lot more data and needs more context to interpret it correctly.
It is very interesting, but it soon turns into 'work' to interpret. Sure, attach it as well. Maybe someone 'sees' something! I mentioned it more as a hint for yourself to look at it.

>> 2) your perl script will be of interest, for sure. do you attach in the thread ?

The script is not 'rocket science' but still represents some major work to get it reasonably robust. It is work (art?! ;-) in progress. I'll attach a basic version.

I expect to present a 1 hour session on describing how to obtain, format and interpret the SHOW MEM/CACHE data during the 2008 OpenVMS Boot Camp.
See you there?
http://h71000.www7.hp.com/symposium/index.html?jumpid=symposium

Hope this helps some more,
Hein van den Heuvel (at gmail dot com)
HvdH Performance Consulting
Guinaudeau
Frequent Advisor

Re: Optimum cluster size for EVA ?

here are the three files,

one at 16:30,

two one hour later at 17:30 to compare.

louis
Guinaudeau
Frequent Advisor

Re: Optimum cluster size for EVA ?

file # 1
Guinaudeau
Frequent Advisor

Re: Optimum cluster size for EVA ?

file # 2
Hein van den Heuvel
Honored Contributor

Re: Optimum cluster size for EVA ?

If you toss that XFC output into Excel:
Add a READ-IO column as READ - HIT
Sort by descending READ-IO
Add a READ-% column as READ-IO/SUM(READ-IO)
Take a top 5
for the day-time hour(s) and overall.
Then you get the full picture below (probably heavily distorted).

Reduced to just block-size and read-percent:

Overall (64% of all read IOs)
Size % reads
1 19%
127 16%
2 15%
64 11%
32 3%

One-hour (87% of all read IOs)
Size % reads
1 39%
2 37%
64 5%
127 3%
47 3%

Reminder: IF the files are pre-allocated, THEN the cluster size matters absolutely ZERO, as far as OpenVMS IO performance is concerned. That is, as long as those IOs have no (artificial) cluster alignment as suggested in the topic.

Now looking at the data, it seems that during the day, it matters even less than nothing, if that was possible.
The bulk of the IOs (almost 80%) is for a size where nothing matters: 1 or 2 blocks. There is no way to optimize those other than to make them go away: MORE XFC cache?

Looking overall there are those 127 block IOs to deal with. Probably not VMS Backup, as they seem to be cached. Possibly COPY or just application code using 32-bit RMS $READ with its maximum buffers size of 127.

That's an odd, prime value. No cluster size, other than a multiple of 127 will ever line up with that. That 127 will not line up with anything in the EVA, so the cluster size choice for that large component of all IO does not matter at all.
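
To make the 127-versus-16 point concrete: a small sketch counting how many back-to-back transfers start on a 16-block boundary. It assumes zero-based block offsets from an aligned file start and strictly sequential IOs; the function name `aligned_starts` is made up for the example.

```python
from math import gcd


def aligned_starts(io_blocks: int, unit: int, transfers: int) -> int:
    """Count how many of `transfers` back-to-back IOs start on a `unit`-block boundary.

    Offsets are zero-based and the first IO starts at offset 0.
    """
    return sum(1 for k in range(transfers) if (k * io_blocks) % unit == 0)


# 127 and 16 share no common factor, so starts only coincide every lcm blocks.
lcm_blocks = 127 * 16 // gcd(127, 16)

print(lcm_blocks)                     # 2032
print(aligned_starts(127, 16, 160))   # 10
print(aligned_starts(128, 16, 160))   # 160
```

Only one 127-block transfer in sixteen starts on a 16-block boundary, whatever the cluster size, whereas 128-block transfers would all line up.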

Finally there is a significant component of 16/32/64 block IOs. Possibly simple miscellaneous stuff like LOG FILES. (SHOW RMS ... default set to 64?) Possibly not happening on the disks we are concerned with. Possibly on files with fragmentation. Since nothing else matters, you might as well humor those and create a cluster size as a large multiple (127? :-) of 16.
Hence the 1024 I suggested in the first place.

Hope this helps some more,
Hein van den Heuvel (at gmail dot com)
HvdH Performance Consulting

Overall:
Size           Read         Hits    Read IO      Writes  % reads
Total   411,167,290  405,510,091  5,657,199  17,023,438
1       135,516,967  134,427,979  1,088,988   3,242,916      19%
127      24,062,592   23,175,571    887,021   1,289,772      16%
2       119,641,504  118,808,372    833,132     893,298      15%
64        4,764,132    4,116,111    648,021   3,626,524      11%
32        1,567,160    1,385,615    181,545     875,731       3%

One-hour:
Size         Read       Hits  Read IO  Writes  % reads
Total   1,846,576  1,794,231   52,345  70,074
1         649,147    628,664   20,483  13,185      39%
2         582,528    562,910   19,618   6,728      37%
64         16,986     14,448    2,538  14,599       5%
127        77,837     76,366    1,471   3,020       3%
47         19,162     17,744    1,418   3,621       3%
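
The spreadsheet recipe at the top of this reply (READ-IO = READ - HIT, sort descending, express as a percentage of all read IOs) can also be sketched in a few lines of Python. This is only an illustration: it uses the overall top-5 rows and the Total row from the table, and the function name `top_read_io` is invented for the example. Parsing real SHOW MEM/CACHE output is not attempted here.

```python
# size in blocks -> (reads, hits), taken from the overall table
overall = {
    1: (135_516_967, 134_427_979),
    127: (24_062_592, 23_175_571),
    2: (119_641_504, 118_808_372),
    64: (4_764_132, 4_116_111),
    32: (1_567_160, 1_385_615),
}
total_read_io = 411_167_290 - 405_510_091  # Total row: reads minus hits


def top_read_io(rows, total, n=5):
    """Return (size, read_io, percent_of_total) tuples, largest read_io first."""
    read_io = {size: reads - hits for size, (reads, hits) in rows.items()}
    ranked = sorted(read_io.items(), key=lambda item: item[1], reverse=True)[:n]
    return [(size, io, round(100 * io / total)) for size, io in ranked]


result = top_read_io(overall, total_read_io)
for size, io, pct in result:
    print(f"{size:>4} {io:>10,} {pct:>3}%")

covered = round(100 * sum(io for _, io, _ in result) / total_read_io)
print(f"top 5 cover {covered}% of all read IOs")
```

The percentages come out as 19, 16, 15, 11 and 3, covering 64% of all read IOs, which matches the reduced table.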

Robert Brooks_1
Honored Contributor

Re: Optimum cluster size for EVA ?


>> Looking overall there are those 127 block IOs to deal with. Probably not VMS Backup, as they seem to be cached. Possibly COPY or just application code using 32-bit RMS $READ with its maximum buffers size of 127.

>> That's an odd, prime value. No cluster size, other than a multiple of 127 will ever line up with that. That 127 will not line up with anything in the EVA, so the cluster size choice for that large component of all IO does not matter at all.

--

Host-based volume shadowing used to do its copy and merge I/O's in 127-block increments.

I'm pretty sure the default was changed once the EVA alignment issue was discovered, but I'm not positive.

-- Rob
Guinaudeau
Frequent Advisor

Re: Optimum cluster size for EVA ?

hi,

thanks a lot for your assistance in interpreting the XFC cache data

i thought just now (a bit late) that i could have used the SDA extension if it helps, but it sounds like the DCL command is sufficient. or ?

especially thanks to hein and rob for the last two very detailed replies based on attached XFC output.

i am discussing your replies with my colleagues and will probably close this thread soon, because my conclusion is : we will not gain anything through a change of cluster size.

through the upgrade (the application should be upgraded too, not only the HW) we (possibly) had the chance to change that cluster size, and that was the reason for my thread.

we have (possibly) another customer with SCSI-3 drives in a shadow set where the same app upgrade should occur (no HW upgrade in that case). i will redo the same steps as you did to check what happens when we change the cluster size there.

rob, i confirm : the application is an old $QIO based with the limit of "64K" IOs, where it looks like the actual limit is 127 blocks. i dont understand something here, 127 and not 128. should i read the code carefully to understand some subtleties, or can someone immediately say something about this point ? most IOs are either large, up to that limit of "64K", or small, 1 or 2 blocks

anyway, thanks thanks and thanks again.

louis
Hein van den Heuvel
Honored Contributor
Solution

Re: Optimum cluster size for EVA ?

>> the application is an old $QIO based with the limit of "64K" IOs,

There is no 64K limit for QIOs in OpenVMS in general. There is for RMS $READ/$WRITE, and there was a time when the SCSI driver would split up IOs in 127-block chunks... but my memory is sketchy on that. Would have to research exact versions and stuff. will not.

>> i dont understand something here, 127 and not 128.

That's because there is 1 bit too few.

The limitation (in some places, like RMS, but not in QIO) is not 64K as such but the 16 bits used to specify the size. Since 0 is in use, you can only count to 65535, which is 1 byte short of a full 64K = 65536.
0xffff versus 0x10000 --> 16 bits versus 17

Compare with 1 digit, which can have 10 values (0 .. 9), but you can only count to 9!

Considering disk IO must have even byte counts, and really only makes sense on 512 byte boundaries, 'they' should have 'given' you a free byte, making -1 = 64 k huh!?

They could have made QIOs to a block device take 512 byte blocks as arguments, but they kept the interface consistent. Sizes are in bytes. Everywhere.
Offsets are in blocks though (VBN)!
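
The 127-not-128 arithmetic can be spelled out in a short sketch. It only assumes a byte count stored in an unsigned field of the given width and 512-byte blocks; the function name `max_whole_blocks` is made up for the illustration.

```python
BLOCK_BYTES = 512


def max_whole_blocks(count_bits: int, block_bytes: int = BLOCK_BYTES) -> int:
    """Largest number of whole blocks whose byte count fits in `count_bits` bits."""
    max_bytes = (1 << count_bits) - 1   # 0 is a valid value, so the top is 2**n - 1
    return max_bytes // block_bytes


print((1 << 16) - 1)          # 65535, i.e. 0xffff
print(max_whole_blocks(16))   # 127 (128 blocks would need 65536 = 0x10000)
print(max_whole_blocks(17))   # 255
```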


Hein.


Robert Brooks_1
Honored Contributor

Re: Optimum cluster size for EVA ?

>> There is no 64K limit for QIOs in OpenVMS in general. There is for RMS $READ/$WRITE, and there was a time when the SCSI driver would split up IOs in 127-block chunks... but my memory is sketchy on that. Would have to research exact versions and stuff.

--

$ write sys$output f$getdvi( "$1$dka0", "device_max_io_size" )
131072

That item code returns the value in ucb$l_maxbcnt. A decimal value of 131072 is equal to %X20000 and is the upper limit of the SCSI disk driver. RMS and XFC may impose a lower maximum.
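
For comparison with the 127-block figure, the value above converts as follows. This is a trivial Python sketch that only redoes the conversion on the number printed by F$GETDVI in this post; the variable names are chosen for the example.

```python
BLOCK_BYTES = 512
device_max_io_size = 131072   # bytes, as printed by the F$GETDVI call above

max_io_hex = hex(device_max_io_size)
max_io_blocks = device_max_io_size // BLOCK_BYTES

print(max_io_hex)      # 0x20000
print(max_io_blocks)   # 256
```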

I think that item code was added for V8.2 and perhaps backported to V7.3-2

-- Rob
Guinaudeau
Frequent Advisor

Re: Optimum cluster size for EVA ?

hi,

colleague confirms :

the application restriction is 64K for old reasons in the development, and i assume you quoted the correct one : 16 bits to specify the size, so "64K-1" bytes.

hein,

i am surprised when you wrote that this restriction does not apply to QIOs, since we are using QIO, not RMS => are you sure that it does not apply to old QIO (before fast IO) ???

we will not change this old legacy code now, we cannot afford it (testing such a change would not be worth it). and anyway, a cluster size change does not look like it would bring anything.

louis
Guinaudeau
Frequent Advisor

Re: Optimum cluster size for EVA ?

many thanks for your help

louis
Hoff
Honored Contributor

Re: Optimum cluster size for EVA ?

The internal disk track and sector geometries are now typically synthetic, between the drive's own controller and the particular bus controller.

The cluster size recommendation was received directly from Brian Allison during various of our conversations, and Brian is one of the senior technical folks in OpenVMS I/O Engineering, and charged with getting bits onto and off of disks. Quickly.

Minimally 16, or 32, or better, for the cluster size.

He had indicated it was derived from several issues and considerations, not the least of which were "knees" in the controller performance curves when transfer sizes and transfer rates were compared.

Not all controllers had these "knees", but it was likely enough that your disks would end up connected to one at some point in its existence.

I had gathered that aligned transfers allowed transfer optimization(s) in some of the newer controllers, but did not confirm that. Alpha and particularly Itanium processors are very sensitive to data alignment, so it didn't surprise me that the controllers are (also?) sprouting similar behaviors.


Jon Pinkley
Honored Contributor

Re: Optimum cluster size for EVA ?

I will agree that larger transfers are generally better than small ones, but there is no direct relationship between clustersize and transfer size. There can be secondary effects, for example, clustersize can affect RMS indexed file bucket sizes, and bucket size will affect the QIO transfer size requested by RMS. Also a disk with a small clustersize is likely to get fragmented more quickly, and therefore lead to more split I/O transfers, but the number of blocks transferred by a VMS QIO is not affected by the cluster size; i.e. a QIO to a contiguous file on a disk with a clustersize of 1 will result in the same transfer size as the same QIO to a disk with a cluster size of 16380.

I agree with what Hein wrote in his note from Mar 26, 2008 12:56:49 GMT "Clustersize defines first-block-in-file LBN alignment. No more, no less."

Unless given more than an appeal to authority, I am not convinced that clustersize alone makes much difference at all, at least on a freshly initialized disk. There are good reasons for large cluster sizes, but there are also good reasons for small ones. The blanket statement that the clustersize should be a minimum of 16, or even better 32, should be qualified. As with most things, "it depends" applies.

Jon
it depends
Guinaudeau
Frequent Advisor

Re: Optimum cluster size for EVA ?

hi,

i am surprised by that "system" :

- the ITRC forum itself

- the VMS community

i already knew some of this, and appreciate it in general, and have always been positively surprised.

but today a bit more surprised :

one can still add replies after the thread has been closed. they can be worthwhile. i have only skimmed these today, but hoff's especially, since he was the speaker two years ago who brought me to look at the issue.

=> thanks for those add-ons. i acknowledge here the great strength of such a community. yes it is.