Operating System - OpenVMS
Disk Initialization Parameters for good performance

 
sartur
Advisor

Disk Initialization Parameters for good performance

We have an OpenVMS cluster (V8.3) with disk storage in an EVA8000.
An RMS data disk sometimes gives the message SYSTEM-F-HEADERFULL.
It's a 344GB disk (vraid 1 in the EVA) holding 451 directories and 736,000 files.
We proceeded to create a new disk and perform a disk-to-disk backup.
The new disk was initialized with the following characteristics:

initialize /cluster_size=18/directories=1000/header=1500000/nohighwater/maximum_files=1500000/index=end $1$dga75: PAN_RMS2

Are these initialization parameters adequate for good performance?
18 REPLIES
Hein van den Heuvel
Honored Contributor

Re: Disk Initialization Parameters for good performance


What do you typically do with those files?
Is there a DB product involved? Simple sequential or indexed?

Are they all in the 1/2 MB (1000 block) range, or are some large and many small?

I like to make my clustersizes a power of 2 or a multiple of 16.
That improves the odds that RMS (indexed file) buckets are aligned with XFC cache lines, and some flavors of the EVA firmware liked it. But if I recall correctly, that was mostly for RAID-5 full-stripe writes, which are not in play here. That, and it makes for easy math! :-)
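That rule of thumb is easy to express in code. A minimal sketch (Python purely for illustration; this is my paraphrase of the rule, not anything VMS-specific):

```python
def good_cluster_size(n):
    # Hein's rule of thumb: prefer a power of 2, or a multiple of 16.
    power_of_two = n > 0 and (n & (n - 1)) == 0
    return power_of_two or (n > 0 and n % 16 == 0)

print([cs for cs in (8, 16, 18, 32, 48, 64) if good_cluster_size(cs)])
# 18, the size used in the INIT command above, fails both tests
```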

/DIRECTORIES is irrelevant
/HEADER as shown wastes 1/2 GB
/INDEX=END 'suggests' maximum seeks for maximum time. The EVA will outsmart you by actually allocating chunks early on the disks, because INDEXF.SYS is actually initialized and touched earlier. Still, I prefer /INDEX=BEG or the default of middle for historical reasons.
/NOHIGH, if the environment/security can stand it, is best for speed during certain accesses/creates.
/MAXIMUM should be big enough

You may want to add

/EXTEN=1024 for nice pre-allocation during file create/extend

/LIMIT for expansion

And you may want to start out with /STRUCT=5, but that, unlike /LIMIT, has direct performance implications

hth,
Hein



Jon Pinkley
Honored Contributor

Re: Disk Initialization Parameters for good performance

sartur,

You have a 344 GB disk and you are pre-allocating all the headers you will ever allow? My advice is to let maximum_files be as high as possible; it doesn't cost much (1/4096th of a block for each possible file, since it controls a bitmap that represents file headers). Not having it large enough means you will have to reinitialize the disk to extend the bitmap.
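Jon's 1/4096-block figure is easy to sanity-check. A rough sketch (Python for illustration; each possible file costs one bit in the header bitmap, and the ~16 million figure is the ODS-2 ceiling Jon mentions in his recommendations):

```python
BITS_PER_BLOCK = 512 * 8  # 4096 bits in one 512-byte block -> 1/4096 block per possible file

def bitmap_blocks(maximum_files):
    # Bitmap size in blocks, rounded up to whole blocks.
    return -(-maximum_files // BITS_PER_BLOCK)

print(bitmap_blocks(1_500_000))   # the 1.5 million allowed in the INIT command: 367 blocks (~184 KB)
print(bitmap_blocks(16_777_216))  # the ~16 million ODS-2 ceiling: 4096 blocks (2 MB)
```

In other words, even the maximum bitmap costs about 2 MB on a 344 GB volume, which is why capping MAXIMUM_FILES low buys you essentially nothing.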

I would recommend a cluster size of 16 instead of 18 (but this is because I like cluster sizes that are a power of 2). VMS folklore says it will be more efficient with 16-block clusters, but I have never seen any convincing evidence. The EVA cache likes 8KB (16 block) transfers, but the cluster size has only indirect effects on I/O transfer size. Still, a cluster size of 16 won't be worse than 18, unless most of your files are a multiple of 18 blocks in length.

I would recommend

/limit ! allow expansion and maximize /maximum_files (this is an EVA, and it is easy to expand a vdisk if you need to)
/maximum_files ! I would use /limit and leave this qualifier off. Then it will be 16 million.
/cluster_size=16 ! this will be the default if you use /limit
/directories=16000 ! this just pre-allocates some space (1000 blocks) to [000000]000000.DIR; if you want less, use something smaller (n/16 will be the # of blocks). Nearly irrelevant.

Other things can be controlled at mount time (like /window size and /extension). Be aware that if you have poorly behaved programs that reopen files for every write, having a large /extension can have a detrimental performance effect, as the file gets extended and possibly truncated when closed.

Jon
it depends
John Gillings
Honored Contributor

Re: Disk Initialization Parameters for good performance

sartur,

I'd go with what Hein suggested for qualifiers; especially, avoid /INDEX=END. Without knowing the average size of files, it's difficult to say what the cluster size should be. If the disk were nearly full, the files would average around 500KB, so the cluster size should probably be more than 18, maybe 128. That may be too extreme if there are many smaller files, but, in general, on a disk that size, I'd go for something much bigger than 18.

I'm a bit more interested in your HEADERFULL - I can't see how you can get an INDEXF.SYS HEADERFULL on the disk you describe. If you still have the faulty disk, could you post the output of:

$ SHOW DEVICE/FULL disk
$ DUMP/HEADER/BLOCK=COUNT:0 disk:[000000]INDEXF.SYS

It might also help if you describe how the files are created - maybe there are a large number of files being created in parallel, with small cluster and extend sizes, then gradually deleted in random order, leaving a highly fragmented disk?
A crucible of informative mistakes
sartur
Advisor

Re: Disk Initialization Parameters for good performance

John

Currently the disk is not giving HEADERFULL, that was months ago, which is why we proceeded to initialize it this way.

My question now is whether the initialization in this manner could be negatively affecting the performance, because currently we are having some slowness.

SHOW DEVICE/FULL $1$dga6

Disk $1$DGA6: (PANA00), device type HSV210, is online, mounted, file-oriented
device, shareable, available to cluster, device has multiple I/O paths,
error logging is enabled.

Error count 0 Operations completed 7472577
Owner process "" Owner UIC [SYSTEM]
Owner process ID 00000000 Dev Prot S:RWPL,O:RWPL,G:R,W
Reference count 323 Default buffer size 512
Current preferred CPU Id 3 Fastpath 1
WWID 01000010:6005-08B4-0009-2478-0000-A000-0017-0000
Total blocks 671088640 Sectors per track 128
Total cylinders 40960 Tracks per cylinder 128
Logical Volume Size 671088640 Expansion Size Limit 671440896
Allocation class 1

Volume label "PAN_RMS2" Relative volume number 0
Cluster size 18 Transaction count 320
Free blocks 275076432 Maximum files allowed 1500005
Extend quantity 5 Mount count 2
Mount status System Cache name "_$1$DGA50:XQPCACHE"
Extent cache size 64 Maximum blocks in extent cache 27507643
File ID cache size 64 Blocks in extent cache 1343574
Quota cache size 0 Maximum buffers in FCP cache 8346
Volume owner UIC [SYSTEM] Vol Prot S:RWCD,O:RWCD,G:RWCD,W:RWCD

Volume Status: ODS-2, subject to mount verification, write-back caching
enabled.
Volume is also mounted on PANA06.

I/O paths to device 4

Path PGA0.5000-1FE1-5004-238C (PANA00), primary
Error count 0 Operations completed 31
Last switched to time: Never Count 0
Last switched from time: 23-MAY-2010 09:00:41.04

Path PGA0.5000-1FE1-5004-2388 (PANA00)
Error count 0 Operations completed 31
Last switched to time: 23-MAY-2010 09:00:41.04 Count 1
Last switched from time: 23-MAY-2010 09:00:53.17

Path PGB0.5000-1FE1-5004-2389 (PANA00)
Error count 0 Operations completed 31
Last switched to time: Never Count 0
Last switched from time: Never

Path PGB0.5000-1FE1-5004-238D (PANA00), current
Error count 0 Operations completed 7472484
Last switched to time: 23-MAY-2010 09:00:53.17 Count 1
Last switched from time: Never

PANA00=>

DUMP/HEADER/BLOCK=COUNT:0 $1$dga6:[000000]INDEXF.SYS

Dump of file $1$DGA6:[000000]INDEXF.SYS;1 on 24-MAY-2010 16:40:09.38
File ID (1,1,0) End of file block 788275 / Allocated 1500444

File Header

Header area
Identification area offset: 40
Map area offset: 100
Access control area offset: 255
Reserved area offset: 255
Extension segment number: 0
Structure level and version: 2, 1
File identification: (1,1,0)
Extension file identification: (0,0,0)
VAX-11 RMS attributes
Record type: Fixed
File organization: Sequential
Record attributes:
Record size: 512
Highest block: 1500444
End of file block: 788276
End of file byte: 0
Bucket size: 0
Fixed control area size: 0
Maximum record size: 512
Default extension size: 0
Global buffer count: 0
Directory version limit: 0
File characteristics: Contiguous best try
Caching attribute: Writethrough
Map area words in use: 11
Access mode: 0
File owner UIC: [SYSTEM]
File protection: S:RWE, O:RWE, G:RWE, W:RWE
Back link file identification: (4,4,0)
Journal control flags:
Active recovery units: None
File entry linkcount: 0
Highest block written: 1500444
Client attributes: None

Identification area
File name: INDEXF.SYS;1
Revision number: 4580
Creation date: 19-AUG-2004 19:44:08.00
Revision date: 24-MAY-2010 10:14:56.98
Expiration date:
Backup date: 15-MAY-2010 04:18:50.51

Map area
Retrieval pointers
Count: 36 LBN: 0
Count: 18 LBN: 1026
Count: 18 LBN: 669579138
Count: 1500372 LBN: 669588264

Checksum: 5697
PANA00=>
Jon Pinkley
Honored Contributor

Re: Disk Initialization Parameters for good performance

sartur,

In short, I don't see anything in your init qualifiers themselves that would be causing "poor" performance.

What is the indication of poor performance? Is it getting worse slowly or did something change abruptly?

Do you have DFU installed on your system?

If so, can you do

$ def/job dfu$nosmg 1
$ dfu report $1$dga75: /graph

Also, if you have T4 loaded and collecting info, you may be able to see trends.

What does "$ monitor fcp,io" show for split I/O and window turn rate? Non-zero values indicate fragmentation that is affecting performance.

What about file open rate?

Disk initialization parameters are pretty low on the list of performance affecting factors (in my opinion).

Also, I don't think that /index=end is going to make a noticeable difference compared to /index=beg or the default /index=middle.
This is because you are using an EVA, which is going to spread the I/O over all the disks in a disk group, and you really have no control over what portion of the disk a particular LBN will land on. Disk allocation on the EVA is similar to virtual memory: the disk allocation is split into PSEGs which correspond to "pages", and you don't have control of which PSEGs are allocated to your vdisk.

Jon
it depends
Jon Pinkley
Honored Contributor

Re: Disk Initialization Parameters for good performance

I see your last comment was using $1$DGA6 instead of $1$dga75.

So for DFU report use whatever disk you are interested in.

DFU report will give some fragmentation statistics, but they are not as extensive as what DFG (DFO defrag) will display. You can install and use the reporting features of DFG without having a license PAK installed. It is worth installing DFG if you like the graphs it can provide.

Jon
it depends
Jon Pinkley
Honored Contributor

Re: Disk Initialization Parameters for good performance

sartur,

Is the "slowness" only on the $1$DGA6 device or is everything slow?

The slowness could be related to other activity on the EVA that is using the same disk group.

Also, since this is a Cluster, what does monitor cluster show? If there is high locking activity, you may want to use monitor DLOCK, as there may be a file that is being modified on multiple nodes and causing remote locking (indicated by incoming and outgoing in the monitor dlock output).

In short, you need to determine what the real cause of the slowness is before you can determine how to fix it.

Jon
it depends
John Gillings
Honored Contributor

Re: Disk Initialization Parameters for good performance

sartur,
Your output is $1$DGA6, is that the disk you (apparently) got HEADERFULL from?

There's definitely nothing wrong with INDEXF.SYS on that disk:

Map area words in use: 11
...
Map area
Retrieval pointers
Count: 36 LBN: 0
Count: 18 LBN: 1026
Count: 18 LBN: 669579138
Count: 1500372 LBN: 669588264

This is as contiguous as it can ever be. It has not, and I'd guess never will, hit HEADERFULL.

"because currently we are having some slowness."

Relative to what? Can you describe the operation you believe to be "slow" and what you're comparing it with?

Your cluster size is a bit small, and in theory, /INDEX=END is not a great idea, but I'd have thought you'd have to be hammering the disk VERY hard to notice any difference.
A crucible of informative mistakes
John Gillings
Honored Contributor

Re: Disk Initialization Parameters for good performance

Oh, and just a general point about the assumption implied in your title:

"Disk Initialization Parameters for good performance"

If there were a single set of qualifiers (parameters) which always gave good performance, they would be hard coded, not variable.

The best settings for your disk depend on how the disk is used. For example, a disk which contains a single large file would need very different settings from one containing large numbers of small files. You also need to consider if files are created once and stay forever, or if there's a high turnover, if all the files are about the same size, or if there's a mix of very large and very small files, are they written one at a time, or many in parallel, extended over a long period of time, if they're single spindles vs RAID sets, direct attach vs SAN etc... Lots of variables. You've given us some information, but not much.

Rather than present us with a possible cause and ask if it might be responsible for a nebulous symptom, please give us more hard detail of the symptom you're interested in and ask what might be possible causes.
A crucible of informative mistakes
sartur
Advisor

Re: Disk Initialization Parameters for good performance

Pinkley

This disk was giving HEADERFULL a couple of months ago and was highly fragmented, above 90% (per DFG Defrag), so I proceeded to make a disk-to-disk backup, creating a new disk ($1$DGA75) with the parameters I referred to. Then I presented this new disk as the original and removed the fragmented one. We currently have no HEADERFULL symptom, but again had to defragment using the same method (because of its large size and the time it takes to defragment). This disk, along with others that have also been defragmented, is processed and updated in batch mode, and the batch execution times have risen excessively. My question comes from wanting to know whether any of the initialization parameters used may be impacting these processing times.
Shriniketan Bhagwat
Trusted Contributor

Re: Disk Initialization Parameters for good performance

Sartur,

You need to initialize the disk based on your requirement.
Below are few information about initializing the disk.

The file system allocates/deallocates disk space in multiples of the cluster size. If a disk has a cluster size of 18 (as in your case), no file is smaller than 18 blocks. A large cluster size helps reduce file fragmentation, but may be wasteful of disk space.

The /MAXIMUM_FILES qualifier defines the maximum number of files that the volume may hold. It is actually used to create a bitmap of the given size internally in INDEXF.SYS. Each bit in this bitmap maps to a file header in INDEXF.SYS. If the bit is 0 (zero), the file header is available and may be used to create a new file. If the bit is 1 (one), the header is in use. This bitmap decreases the time needed to allocate an unused file header; by checking the state of individual bits in the bitmap, a sequential search of INDEXF.SYS for a free header is avoided. The number of bits in this bitmap determines the maximum number of file headers which the disk could ever have. It does not determine the number of headers the disk has. The actual number of headers is determined by the /HEADERS qualifier and by the expansion of INDEXF.SYS.

The default value for MAXIMUM_FILES is usually sufficient. It is determined from the size of the disk using the following algorithm:

MAXIMUM_FILES = (((DISK_SIZE_IN_BLOCKS + 4095)/4096) + 254)/255

If the disk is going to be used for many small files, this qualifier can be used to increase the value. Specifying a large value for MAXIMUM_FILES has very little impact on disk space.

When INDEXF.SYS is created on the freshly initialized volume, only some file headers are actually created. The default number created is 16. The value specified for the /HEADERS qualifier changes that number. The default value of 16 file headers is enough for about 10 files to be created, which is generally too small. Once these 16 file headers are in use, the next file create or expand must also expand INDEXF.SYS to make room for more file headers. INDEXF.SYS is limited to about 50 extents (the actual value is between 28 and 77). Once these 50 expansions are done, any further attempt to expand INDEXF.SYS (i.e. another file create) fails with the "SYSTEM-F-HEADERFULL, file header is full" error. The number of headers specified should be based on an estimate of the number of files and directories that are to be on the disk. Each header requires one block of disk space (512 bytes). Specifying too many headers wastes disk space. Specifying too few headers causes INDEXF.SYS expansion and fragmentation. This can impact performance and may lead to the HEADERFULL error.
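To put rough numbers on the /HEADERS side for the disk in this thread (a sketch in Python; it assumes one 512-byte block per pre-created header, per the paragraph above, and the file count from the original post):

```python
BLOCK_BYTES = 512

def headers_mb(count):
    # Each pre-created file header occupies one 512-byte block in INDEXF.SYS.
    return count * BLOCK_BYTES / 2**20

total_mb = headers_mb(1_500_000)             # /HEADER=1500000 from the INIT command
unused_mb = headers_mb(1_500_000 - 736_000)  # headers beyond the ~736,000 files in use
print(round(total_mb), round(unused_mb))     # roughly 732 MB total, 373 MB currently idle
```

That idle ~373 MB is in the same ballpark as Hein's earlier remark that /HEADER as specified wastes about half a gigabyte.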

Regards,
Ketan
Shriniketan Bhagwat
Trusted Contributor

Re: Disk Initialization Parameters for good performance

Sartur,

HEADERFULL is a typical case when the disk is heavily fragmented and files have many extents. If INDEXF.SYS can no longer be extended, as a temporary workaround you can delete some unwanted or temporary files from the volume. This will free the headers associated with the deleted files for reuse, eliminating the immediate need to extend the index file. You can also use BACKUP to save/restore the volume, or defragment the volume, to get rid of the HEADERFULL.

Regards,
Ketan
Hein van den Heuvel
Honored Contributor

Re: Disk Initialization Parameters for good performance

How much space are you willing to 'waste' in cluster size round-up? You have a lot of files, so it could get costly.
There is about 40% free space right now, right?
Let's say you are willing to waste 5% in round-up.
That would be 671088640 / (20 * 736000) = 45 blocks per file.
To me that would suggest a cluster size of at least 32, maybe 64.
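Spelling out Hein's arithmetic (a sketch; the volume size comes from the SHOW DEVICE output earlier in the thread, and the half-cluster average-waste assumption is mine, not his):

```python
total_blocks = 671_088_640   # volume size from SHOW DEVICE/FULL
files = 736_000
waste_fraction = 0.05        # willing to waste 5% of the volume in round-up

budget = total_blocks * waste_fraction / files  # ~45.6 blocks of waste affordable per file
print(int(budget))

# Average round-up loss per file is about half the cluster size, so any
# cluster size up to roughly 2 * budget stays inside the 5% target.
for cs in (18, 32, 64, 128):
    print(cs, cs / 2 <= budget)
```

Under that model, 64 still fits the 5% budget while 128 does not, which matches the "at least 32, maybe 64" suggestion.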

>> My question now is whether the initialization in this manner could be negatively affecting the performance, because currently we are having some slowness.

That's a big leap. I suspect you know a lot more to make you think in this direction. Please help us help you and explain some of your inputs/thinking.
High window turn rate?
Ugly MONI FCP, FILE pictures?

>> and the batch execution times have risen excessively.

Let us hear some more... Just execution time increase with similar CPU time and Direct IO counts? That could be slower IOs... or less efficient XFC cache. But maybe just something completely different changed over time. Like more XFC memory pressure and less files cached, or 'duplicate key chains' building in RMS indexed files, or...

>> highly fragmented above 90%

I like DFU to study that.
Commands like $ DFU REPORT DISK and $ DFU SEARCH/FRAG=MINI=500

What are you doing to minimize fragmentation at the source, rather than try to fight it post fact? File pre-allocation? SET RMS/EXTEN=XXX?

You may want to split the volume such as to have one part for smaller growing and shrinking files, where those can fight it out amongst themselves, and a second part to let the large (RMS indexed) files grow in peace with a large cluster size and good contiguous-best-try allocation and extends. The second drive would never need to be defragged. The first would be hopeless and can just be left, but if you want, a defragger will have an easy time with many smaller, un-opened files.

The volume is labeled 'RMS'. Does it hold large, growing files?

I like manually pre-extending them when they are almost out of the allocated space. [ I have some tools to quickly show space in use and free per area, and another tool to extend a specific area by a specific number of blocks (no 65K limit... I've pre-extended files in production from 40GB to 60GB) ]

Good luck!
Regards,
Hein van den Heuvel
HvdH Performance Consulting
Steve Reece_3
Trusted Contributor

Re: Disk Initialization Parameters for good performance

Hi Sartur,

You don't say how your performance is bad now, only that you had the HEADERFULL error.

Is the performance bad for reads or just writes? If it's just writes, are you sure that the cache batteries in your SAN array are still functional and holding charge?

Steve
JBF
Occasional Visitor

Re: Disk Initialization Parameters for good performance

Greetings, I would like to differ with the information provided here.

I've done a LOT of testing with various arrays. I use the DISK_BLOCK freeware tool to place differing loads on arrays. What I tend to find is that with most current SAN storage arrays (MSA arrays, EVA Storage Arrays, and XP disk arrays) it is better to use a cluster size that is a power of 2. So, 2, 4, 8, 16, 32, 64, 128 and so on.

Essentially by using a power of two as the cluster size, the OpenVMS disk cluster tends to be better aligned with the cache segments within the SAN storage array.

Additionally, if the volume will have a large I/O request rate, it is often better to use a large cluster size. Though this may "waste" space, as the size of physical drives increases and the cost per MB decreases, we can start to focus more on the performance gain over the cost. We need to continue to balance them, of course. But the cost has less of an impact in the equation.

Let's use two examples. If we have a user volume with a lot of small files and reports, then a lower cluster size (16) might make more sense, since the chance of wasting space is greater. But if we have a data volume with a lot of RMS files, we might want to consider a much larger cluster size (64 or 128) to help decrease the I/O request rate against this volume.

So, how does a larger cluster size help improve performance? Well, most of the current arrays are designed to handle a moderate number of large I/O requests. By moving the cluster size to a larger value we tend to influence things such as the size of extents that are allocated, the default bucket sizes of RMS files, and so on.

And of course, we hit the point of diminishing returns quite quickly. But I would note that 64 is a much better number than 18. OpenVMS will often do one I/O request instead of 3.5 (plus change) I/O requests to move the same amount of data.
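That 1-versus-3.5 figure is just division (a model sketch; it assumes the transfer size tracks the cluster size, which, as clarified later in the thread, is only an indirect effect):

```python
import math

def requests(blocks_to_move, transfer_blocks):
    # Simple model: one I/O request per transfer-sized chunk.
    return math.ceil(blocks_to_move / transfer_blocks)

print(requests(64, 64))   # 1 request with 64-block transfers
print(round(64 / 18, 2))  # 3.56 -> the "3.5 (plus change)" with 18-block transfers
print(requests(64, 18))   # 4 actual requests once the fraction is rounded up
```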

How much does all this matter? Well, in my testing, I tend to see about a 5 to 10 percent improvement in throughput by increasing the cluster size.

And of course it depends on the workload. But a rule of thumb would be to use a power of two for the cluster size and try to balance the need to decrease I/O requests against the cost of the "wasted" space.

Also, the impact does depend on the technology of the SAN storage array. EVA storage arrays can sustain more I/O requests on a single host port than the XP disk array. But the XP disk array can handle FAR larger I/O requests (based on the size of the data transferred). So, the answer is very much array dependent.

So, while I agree with the answers in general, since OpenVMS no longer manages the storage and needs to depend on the SAN storage controller, an awareness of what works best with that environment is needed to provide the correct size of the cluster for a specific volume.

Hope that helps.
JBF
Occasional Visitor

Re: Disk Initialization Parameters for good performance

Folks, I would like to add one point that I did not make very clear (thank you Hein for making certain I did include this) ...

Using a cluster size that is a power of two does not directly impact the I/O transfer size. It only indirectly impacts the I/O transfer size for each I/O request.

What are some of the ways it impacts the I/O transfer size? It will directly impact the default RMS bucketsize. That will help RMS files, especially indexed files. But it will not impact much else. It will indirectly impact the fragmentation. Larger extents can help reduce fragmentation, though on a volume with MANY small files it will tend to waste space.

But Hein was correct to point out that it only indirectly impacts the I/O transfer size. If your application does single-block I/O requests, that is exactly what OpenVMS will do.

As in all things OpenVMS, your mileage will vary. I recommend testing this in your own environment. And of course, that can be easier said than done.
Hoff
Honored Contributor

Re: Disk Initialization Parameters for good performance

If this is disk access, I'd typically suggest an SSD as those will completely blow away any existing HDD storage; if you can get one of those connected, you'll find massively higher performance. The existing SCSI and FC HDD speeds are positively glacial in comparison.

Experience with previous similar questions leads me to be extremely skeptical that the disk volume structures have anything to do with the underlying performance problems.

Until statistics are collected and benchmarks are established, these sorts of "fix that" or "add an SSD" discussions are typically futile. The actual performance-limiting factors can lurk in most any spot within the typical complex application or operating system environment. (This is also why RT-ish OSs are starting to appear again in widespread use, too.)

With OpenVMS, you should have some T4 data.

And if you can't spin a few extra disks on your server here, go address that. A typical laptop can have an SSD in this capacity range, or can have HDD capacities well beyond 344 GB.

Why do I mention evaluating the configuration? Disk hardware vibration in a storage array can be a massive performance factor; even moderate vibration can really trash access and transfer speeds across a disk shelf. I've seen a failure trash a full shelf, and retries on more subtle vibration can still slam throughput.

Measure.

Don't guess.

(Or guess, but then measure. And then compare.)

Go get that T4 data.
sartur
Advisor

Re: Disk Initialization Parameters for good performance

close thread