Operating System - OpenVMS

What determines Avg read i/o size

Mike R Smith
Frequent Advisor

What determines Avg read i/o size

I am working through a backup performance issue. We have SAN storage snapshot devices on production giving slow backup response times. We set up a similar copy on the development box to test. The T4 data shows an avg read i/o size of 62 on the dev box versus 125 on the production server. The backup test command is the same. I am trying to determine why the avg read i/o size differs. Thanks for any help.
8 REPLIES
Hoff
Honored Contributor

Re: What determines Avg read i/o size

This is a how-high-is-up discussion, unfortunately.

VMS BACKUP? Or one of the various SAN tools?

What's the BACKUP command?

BACKUP itself has sensitivities to quotas, as well as means to throttle its activities. (There are days I'd like to throttle BACKUP myself, but that's another discussion.) Check for differences here. (I have the current BACKUP quota recommendations posted at the HoffmanLabs web site.)

With VMS BACKUP, the disk cluster size, the application and system default extent sizes, and the results of multiple incrementally-extended log files and such all mix together to produce disk fragmentation and I/O bottlenecks.

Have a look at the file size distribution on the disks you're comparing, and at the relative degrees of disk (and file) fragmentation. The Freeware DFU tool can get you some of that, as can the DFO tool's analyzer features, as can various other tools.

VMS boxes used for application development can contain frequently-revised files that tend to be small, incrementally-extended log files, and parallel users all churning away on shared storage, with the ensuing fragmentation.

Writing as somebody that does development as well as operations and tuning work, developers aren't the folks you want to share a server with, and are most definitely not part of a configuration you want to repurpose as a comparative benchmark.
Hein van den Heuvel
Honored Contributor

Re: What determines Avg read i/o size

Same OpenVMS versions?
Does SHOW RMS show the same settings?

Is this backup as a copy tool, or backup to a save set? Is that average IO size from the input device, or to the output device?
When writing to a saveset, BACKUP uses RMS $WRITE, and I think it listens to SET RMS.

Is the highwater marking setting the same on the output devices? (It shouldn't matter for the sequential writes, but just in case...)

Fragmentation similar?

hth,
Hein

Mike R Smith
Frequent Advisor

Re: What determines Avg read i/o size

Hoff, thanks for the reply. I am using the VMS backup command on a static disk and on a snapshot of that disk, both of which are backed up to the null device. It is about as vanilla as you can get.

$ back/image device: /ignore=inter/nocrc -
/block=65535/io_load=8 nla0:nulltest. -
/save/noassist

I will have to look at the other items you indicated, but the information I have from HP is that the read io size is derived from the /block setting. Using the same block size setting, I see a different read io size.
I am trying to understand why this is happening.
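
For scale, the numbers in this thread can be put side by side. This is only a sketch: it assumes T4 reports the average read I/O size in 512-byte disk blocks, which nobody in the thread has confirmed, and the helper names are made up for illustration.

```python
BLOCK_BYTES = 512  # size of one OpenVMS disk block in bytes


def whole_blocks(byte_count):
    """Number of complete 512-byte blocks that fit in byte_count bytes."""
    return byte_count // BLOCK_BYTES


def blocks_to_bytes(blocks):
    """Convert a block count to bytes."""
    return blocks * BLOCK_BYTES


backup_block_value = 65535  # the /block= value from the command above
t4_prod_size = 125          # T4 avg read i/o size, production (unit ASSUMED: blocks)
t4_dev_size = 62            # T4 avg read i/o size, dev box (unit ASSUMED: blocks)

print(whole_blocks(backup_block_value))   # 127
print(blocks_to_bytes(t4_prod_size))      # 64000
print(blocks_to_bytes(t4_dev_size))       # 31744
print(round(t4_dev_size / t4_prod_size, 2))
```

If the unit assumption holds, the production figure sits just under the buffer size and the dev figure is about half of it. The arithmetic does not say why.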


Mike R Smith
Frequent Advisor

Re: What determines Avg read i/o size

Hein, thanks for replying. VMS version is 8.3 in both places. This is a purely read speed test so no writing taking place.

Show RMS settings matched on both sides.

Devices initialized with /nohigh qualifier
Hoff
Honored Contributor

Re: What determines Avg read i/o size

One of us appears to need to adjust perspective on what a "vanilla" BACKUP command might look like. :-)

The /BLOCK size sets the saveset and the transfer buffer size. BACKUP still has to fill that buffer.

Check your disk and your file fragmentation. (And if I were called in for this, I'd also have a look at fragmentation within the RMS file structures, as those can get clogged, too. That's not a contributor here, though.)
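
To make the fragmentation point concrete, here is a deliberately simplified model, not a description of how BACKUP actually schedules its reads. It assumes a single read covers only contiguous blocks and never more than a fixed maximum (127 blocks is used as an illustrative cap), and the extent lists are hypothetical.

```python
def split_reads(extent_blocks, max_io_blocks=127):
    """Split each contiguous extent into reads of at most max_io_blocks blocks.

    Simplified model: a read never crosses an extent boundary.
    """
    reads = []
    for length in extent_blocks:
        while length > 0:
            chunk = min(length, max_io_blocks)
            reads.append(chunk)
            length -= chunk
    return reads


def average(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Hypothetical layouts holding the same 1270 blocks of data
contiguous = [1270]               # one large extent
fragmented = [60] * 20 + [70]     # many small extents

print(average(split_reads(contiguous)))
print(average(split_reads(fragmented)))
```

In this model the same amount of data yields a much smaller average read size when it is spread over short extents, regardless of how large the transfer buffer is.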

For grins, I might well create a BACKUP /IMAGE copy of the input disks (which inherently defragments them) and then compare the original and the copy in another round of NLA0: output tests.

You do recognize and intend that the BACKUP saveset produced by the cited BACKUP command is permitted to contain silent data corruptions, which is a reasonable choice given that this is for testing and targets the NLA0: null device but ill-suited to production, of course.
Hein van den Heuvel
Honored Contributor

Re: What determines Avg read i/o size

Hmmm,

What is the problem that you are really trying to solve? Do you back up to the null device in production, or do you believe that backing up to the null device is a faithful reproduction of the behaviour observed in production?
Please be aware that the NL: device is relatively SLOW for many usages because RMS will decide to use RECORD mode, and issue an IO per record instead of block mode.
This is very visible and measurable with a simple COPY of a large file.
NL: output is slower than most disks.

Anyway...

>>>This is a purely read speed test so no writing taking place.

And the T4 observations were based on that?

>>> Show RMS settings matched on both sides.

Thanks for checking because they DO matter, notably on output.
I just tested with the LD TRACE option:


$ ld create/size=1000000 dka0:[temp]tmp.dsk
$ ld create lda1
$ ld connect dka0:[temp]tmp.dsk lda1
$ ld trace lda1
$ init lda1 /clus=1024/ind=beg/nohigh/head=1000/max=1000 temp
$ mount lda1 temp temp
%MOUNT-I-MOUNTED, TEMP mounted on _BUNDY$LDA1:
$ set rms /block=50 /sys
$ back sys$help: temp:[000000]help.bck/save
$ ld show/trace lda1
:
End Time Elaps Pid Lbn Bytes Iosb Function
---------------------------------------------------------------------

13:20:11.234475 00.000136 0000009A 53842 25600 NORMAL WRITEPBLK|EXFUNC
13:20:11.236546 00.000136 0000009A 53892 25600 NORMAL WRITEPBLK|EXFUNC
13:20:11.236766 00.000120 0000009A 53942 25600 NORMAL WRITEPBLK|EXFUNC
13:20:11.239003 00.000111 0000009A 53992 25600 NORMAL WRITEPBLK|EXFUNC
13:20:11.241107 00.000131 0000009A 54042 25600 NORMAL WRITEPBLK|EXFUNC

$ set rms/blo=127/sys
$ back sys$help: temp:[000000]help.bck/save
$ ld show/trace lda1
:
13:20:39.744468 00.001021 0000009A 158050 65024 NORMAL WRITEPBLK|EXFUNC
13:20:39.749063 00.000396 0000009A 158177 65024 NORMAL WRITEPBLK|EXFUNC
13:20:39.753607 00.000152 0000009A 158304 65024 NORMAL WRITEPBLK|EXFUNC
13:20:39.758493 00.000388 0000009A 158431 65024 NORMAL WRITEPBLK|EXFUNC
13:20:39.758555 00.000247 0000009A 158558 :
$ write sys$output 127*512
65024


If you really want to figure this out, then I recommend using LD TRACE on your real input disk, not a container as in my example.
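
Once a trace like the one above has been captured to a text file, the average transfer size can be computed offline. The following is a minimal sketch that assumes the seven-column layout shown in the trace output above; the function names are invented for illustration, and the optional function filter is an assumption about how read entries would be labelled.

```python
SAMPLE = """\
End Time Elaps Pid Lbn Bytes Iosb Function
---------------------------------------------------------------------
13:20:11.234475 00.000136 0000009A 53842 25600 NORMAL WRITEPBLK|EXFUNC
13:20:11.236546 00.000136 0000009A 53892 25600 NORMAL WRITEPBLK|EXFUNC
13:20:11.236766 00.000120 0000009A 53942 25600 NORMAL WRITEPBLK|EXFUNC
"""


def parse_trace(text, function_prefix=None):
    """Return (lbn, byte_count, function) tuples from LD trace output.

    Lines that do not have seven fields with a numeric LBN and byte count
    (headers, separators, truncated lines) are skipped.
    """
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7:
            continue
        if not (fields[3].isdigit() and fields[4].isdigit()):
            continue
        if function_prefix and not fields[6].startswith(function_prefix):
            continue
        records.append((int(fields[3]), int(fields[4]), fields[6]))
    return records


def average_io_blocks(records, block_bytes=512):
    """Average transfer size in disk blocks, or 0.0 if there are no records."""
    if not records:
        return 0.0
    total_bytes = sum(byte_count for _, byte_count, _ in records)
    return total_bytes / len(records) / block_bytes


print(average_io_blocks(parse_trace(SAMPLE)))  # 50.0
```

The result for the sample lines matches the SET RMS /BLOCK=50 run above. Running the same calculation on a trace of the real input disk would show whether the reads themselves are arriving at the driver in smaller pieces.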

>> Devices initialized with /nohigh qualifier

Thanks for checking. It is mostly important for output, but can, perhaps surprisingly, affect reads also, when reading beyond the current HWM for a file. But that would happen pretty much just once.

Guess the next thing to also check is Fragmentation.

Good luck,
Hein.
Mike R Smith
Frequent Advisor

Re: What determines Avg read i/o size

The real issue I am trying to solve is that when we migrated our storage from one SAN to the other, backup times increased anywhere from 40 to 100%. T4 data clearly showed high i/o read times on the new storage. We went from about 3ms to about 10ms. In working with the storage vendor we have been trying to perform testing in our non-production environment using one of the disks that we saw double in backup time.

One other piece of information that I have just discovered: I was able to review the read i/o size for a device on our old storage that is currently on the same port as this test device. The read i/o size using the same backup command is 125 for the device on the old storage vs 62 for the device on the new. I am also trying to determine if something is set differently on the storage port side.
Jur van der Burg
Respected Contributor

Re: What determines Avg read i/o size

One word of caution with LD TRACE: make sure you run at least LD V9.5, since earlier versions may corrupt floating point registers when trace is in use. If you don't use LD TRACE then there's no issue.

Jur (lddriver author)