Operating System - OpenVMS

Relationship of File Headers to Maximum Files Allowed

SOLVED

What is the relationship of file headers to the maximum files allowed?

I recently migrated from one disk to another. I did not do an image backup; I only moved the data I needed. Being cognizant of the number of small files I was moving, I initialized the new disk with a small cluster size. What I see is a very unfragmented volume. The maximum number of files is much larger than the number of free headers, and I don't understand why. I know I can use the /MAXIMUM_FILES qualifier when initializing the volume, but that doesn't appear to help me increase the number of file headers. What am I missing?

I used the following command to initialize the disk.

INIT/NOHIGH/STRUCT=5/LIMIT $1$DGA91: ECP_DISK3

***** Volume info for ODS5 volume $1$DGA91: (from HOME block) *****
Volume name : ECP_DISK3
Volume owner :
Volume set name :
Highwater mark. / Erase on del. : No / No
Cluster size : 3
Maximum # files : 16711679
Header count : 1118010
First header VBN : 4093
Free headers : 11221

***** File Statistics (from INDEXF.SYS) *****
INDEXF.SYS fragments/ map_in_use : 4 /11 words ( 7% used)
Total files (ODS2 / ODS5) : 10 / 1106639
Empty files : 11195
Files with allocation : 1095454
Files with extension headers : 37
Files marked for delete : 0
Directory files : 185102
Contiguous files : 1085424
Total used/ allocated size : 113084349 /115295802 blocks
Total headers/ fragments : 1106696 /1129806
Average fragments per file : 1.031
File fragmentation index : 0.145 (excellent)
Average size per fragment : 102 blocks
Most fragmented file :
$1$DGA91:[ECP.LIVE.277]HP_9E049_NONSTDU.RSP;136 ( 25969/25971 blocks; 278 fragments)

***** Free space statistics (from BITMAP.SYS) *****
Total blocks on disk : 419430400
Total free blocks : 304134600
Percentage free (rounded) : 72
Total free extents : 217
Largest free extent (blocks) : 207548418 at LBN: 211881981
Average extent size (blocks) : 1401541
Free space fragmentation index : 0.000 (excellent)
20 REPLIES
Jess Goodman
Esteemed Contributor

Re: Relationship of File Headers to Maximum Files Allowed

The /MAXIMUM_FILES qualifier sets the size of the index file bitmap. This is an area that comes before the first file header in INDEXF.SYS, and each bit represents whether a file header is in use or not. This area cannot be expanded.

The /HEADERS qualifier specifies how many file headers are initially allocated in INDEXF.SYS. If /HEADERS is less than /MAXIMUM_FILES, as it usually is, then more file headers will be allocated later if required by file growth, assuming INDEXF.SYS can be extended.

(INDEXF.SYS can have only one file header itself, so there is a limit to how many times it can be extended. Using a large initial /HEADERS count reduces the chance of hitting it.)
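A back-of-envelope sketch of the bitmap sizing described above (Python; assuming the usual ODS figures of 512-byte blocks and 1 bit per possible file header, which later replies in this thread also use):

```python
# Rough sizing of the index file bitmap that /MAXIMUM_FILES fixes at INIT time.
# Assumption: 512-byte blocks, 1 bit per possible file header.

BITS_PER_BLOCK = 512 * 8  # 4096 possible headers tracked per bitmap block

def bitmap_blocks(maximum_files):
    """Blocks consumed by the index file bitmap for a given /MAXIMUM_FILES."""
    return -(-maximum_files // BITS_PER_BLOCK)  # ceiling division

# The default maximum that appears later in this thread:
print(bitmap_blocks(16_711_679))  # -> 4080 blocks (~2 MB)
```

Since this bitmap sits in front of the first header and cannot grow, a generous /MAXIMUM_FILES at INIT time is cheap insurance.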
I have one, but it's personal.
John Gillings
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

Charlie,

The header count is the number of headers currently existing. Maximum files allowed is the maximum it can ever grow to.

Look up in the internals how INDEXF.SYS works. In a nutshell, it contains a header bit map, (sized by /MAXIMUM_FILES), an extent bitmap (worked out from the size of the disk and the cluster size), and the headers themselves.

The header bitmap is fixed forever when the disk is initialized. The extent bitmap was fixed, but it can now (>= V7.3-2) be expanded with the dynamic volume enhancement. The number of headers initially allocated is controlled by /HEADERS. It can expand as required. In V5.5 days this was a problem because the initial allocation was pitiful (16!) and the expansion algorithm was poor, often leading to the header of INDEXF.SYS filling up. These days expansion is handled quite well so you have to try very hard to find a pathological allocation pattern to induce a HEADERFULL condition on the disk.

If you really want to preallocate more headers, use INIT/HEADERS (see help for details). Most of the time you can trust the defaults.

A crucible of informative mistakes
Robert Gezelter
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

Charlie,

First, allow me to extend a welcome to the ITRC OpenVMS Forum!

It would appear that the volume was initialized allowing a maximum of 16,711,679 files, with the index file containing 1,118,010 headers, of which 11,221 are presently not in use.

As has been noted, the MAXIMUM_FILES parameter governs the size of the bitmap, the headers are actual blocks in the index file.

The bitmaps used for volume management are fixed at initialization (as opposed to the overall size of the index file, which can grow within the limits of its single file header). The dynamic volume expansion support (in 7.3-2 et seq.) uses a pre-allocated large bitmap, in concert with the "fence" at the end of the volume, to allow the volume to be "extended" on the fly.

The dynamic extension capability, when used with host-based volume shadowing, has some interesting implications, as I pointed out in "Migrating OpenVMS Storage Environments without Interruption or Disruption" at the 2007 HP Technology Symposium in Las Vegas (slides available at: http://www.rlgsc.com/hptechnologyforum/2007/1512.html ).

- Bob Gezelter, http://www.rlgsc.com

Re: Relationship of File Headers to Maximum Files Allowed

Thanks for the information guys. I'll pull some of your information together here for one final question.


If /HEADERS is less than /MAXIMUM, as it usually is, then more file headers will be allocated later if required by file growth, assuming INDEXF.SYS can be extended.

The header bitmap is fixed forever when the disk is initialized. The extent bitmap was fixed, but it can now (>= V7.3-2) be expanded with the dynamic volume enhancement.

The dynamic volume expansion support (in 7.3-2 et seq.) uses a pre-allocated large bitmap, in concert with the "fence" at the end of the volume, to allow the volume to be "extended" on the fly.

These days expansion is handled quite well so you have to try very hard to find a pathological allocation pattern to induce a HEADERFULL condition on the disk.


OK, so it should expand on its own now (I'm running 8.3) and create more available headers as needed?

BTW, I'm watching this closely because I have, on two occasions, run into a HEADERFULL condition on the disk I migrated this data from, and I don't want to run into that again.

Which leads me to my final question. Given my bad experience with the prior disk's HEADERFULL issue, I'm not confident that the volume will expand as necessary. That, coupled with the fact that the prior volume had the same max_files yet four times the number of headers, makes me wonder: do you agree that I should reinit this disk with the /HEADERS parameter rather than counting on dynamic expansion? BTW, I was down to fewer than 300 free headers yesterday before I deleted files to get it up to 11k free. How close to zero should it get before expanding?
Robert Gezelter
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

Charlie,

The volume will not extend on its own. The command to reset the limit to reflect a larger device size must be issued. Of course, this is done after the SAN container or underlying physical volumes are expanded (and the data is moved to them using the appropriate shadowing/mirroring mechanisms).

The change at 7.3-2 was allowing the end of volume number to be changed without dismounting the volume. Thus, as my presentation demonstrated, it is possible to migrate and expand storage volumes while users continue to use the volume without interruption.

- Bob Gezelter, http://www.rlgsc.com
Volker Halle
Honored Contributor
Solution

Re: Relationship of File Headers to Maximum Files Allowed

Charlie,

you get HEADERFULL if the INDEXF.SYS file header has used all of its space for retrieval pointers and the INDEXF.SYS file cannot be extended anymore, so you can't create any more files on that volume.

In your DFU output, you'll see that as:

INDEXF.SYS fragments/ map_in_use : 4 /11 words ( 7% used)

If this goes to 100%, you can expect to see HEADERFULL soon.

The INDEXF.SYS expansion will only happen if there are 0 free headers and another header is needed.

If you have an idea of roughly how many files will end up on that volume, INIT/HEADERS=x (with x at least the number of files you expect) is a very good idea. This will prevent the need for frequent extensions of INDEXF.SYS and reduce the risk of HEADERFULL.

This has nothing to do with DVE (Dynamic Volume Expansion). That feature allows you to increase the number of disk blocks usable on the disk while the disk is mounted. It won't help with HEADERFULL problems.

Volker.
Andy Bustamante
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

You may also want to consider modifying /CLUSTER_SIZE. This thread http://forums12.itrc.hp.com/service/forums/questionanswer.do?admit=109447627+1216312968426+28353475&threadId=1150519 discusses points to consider.

The /DIRECTORIES qualifier allows you to reserve a larger master file directory (000000.DIR) from the start, if your files are arranged in many directories.

Andy Bustamante

If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
John Gillings
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

Charlie,

Some history of HEADERFULL...

Today I can walk into any newsagent or supermarket and buy a 1 or 2GB USB thumb drive from a "clearance sale" box for less than $10. Back in the early 1990s an RZ26 SCSI disk (1GB) cost $2000. When VMS was first designed, a 50MB disk was the size of a washing machine and cost more than $20000.

Up to VMS V5.5 the default for /HEADERS was 16 and the algorithm for expanding INDEXF.SYS was very simple "1000 blocks". This was appropriate when each byte of disk space had a significant cost.

Since a file header has up to about 70 extent slots, that meant, by default, INDEXF.SYS would reach a maximum size of about 70000 blocks, giving enough headers for maybe 60000 files (depending on disk and cluster size). This was fine for sub 1GB disks, but as disk sizes grew it became a serious limit.

Although it was possible to initialise disks with higher initial allocations of headers, the expansion algorithm was inadequate, and reinitialising a disk and rebuilding it was non-trivial and required downtime (backup and restore took a long time, and since disks were expensive, it would usually be out to tape and back again to the same disk, and therefore a risk).

As of VMS V6, the expansion algorithm was changed to consider the amount of free space, the number of files already on the disk, and their average size. This meant the system had 50 or 60 attempts to predict the final requirements for the disk. Most times it would expand to the correct size in 2 or 3 steps. If you have a very odd usage pattern you might confuse it, or fragment the disk too quickly, but most disks that ever reach HEADERFULL were initialised pre V6.0.

That said, if you have a good estimate for the number of files you will store on a disk, it's in your interest to give that information to the operating system when you initialise the disk, rather than expect it to be discovered.
A crucible of informative mistakes
Debasis Bhar
Occasional Advisor

Re: Relationship of File Headers to Maximum Files Allowed

Just curious. If I know that I will have nearly XXXXX files on my disk, should I use /HEADERS=XXXXX or /MAXIMUM_FILES=XXXXX?

Regards,
Deb
Jon Pinkley
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

RE:"Just curious. If i know that i will be having nearly XXXXX files in my disk shall i use /header=XXXXX or /Maximum_files=XXXXX ?"

/header

You always want /MAXIMUM_FILES to be larger than needed. The "cost" in used disk space is 1 block for every 4096 files, so it isn't expensive (it is a bitmap representing possible file headers in INDEXF.SYS). If you initialized without /MAXIMUM_FILES big enough, the cost is reinitializing the disk.

/HEADERS preallocates the headers in as few extents as possible, leaving the possibility of easily extending later if all headers are used up.

DFU (at least V3.2) can extend and defragment the INDEXF.SYS file (see DFU> HELP INDEXF), but it can't (currently) modify the maximum files, so also make maximum files much larger than you expect will be needed.
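The relative costs of the two qualifiers can be sketched like this (Python; the 4096-files-per-block bitmap figure is from this thread, and the one-block-per-header figure is approximate):

```python
# Approximate disk cost of over-specifying each qualifier at INIT time.
# /MAXIMUM_FILES costs 1 bitmap bit per possible file: 1 block per 4096 files.
# /HEADERS preallocates roughly 1 block per file header in INDEXF.SYS.

def max_files_cost_blocks(maximum_files):
    return -(-maximum_files // 4096)  # ceiling division

def headers_cost_blocks(headers):
    return headers  # ~1 block per preallocated header (approximate)

print(max_files_cost_blocks(20_000_000))  # 4883 blocks, about 2.4 MB
print(headers_cost_blocks(4_000_000))     # 4,000,000 blocks, about 2 GB
```

This is why a too-small /MAXIMUM_FILES is the expensive mistake: the bitmap costs almost nothing to oversize, but fixing it means reinitializing, whereas /HEADERS can still grow later.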
it depends
Robert Gezelter
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

Charlie,

As noted, /HEADERS is the starting point, but the index file can grow. /MAXIMUM_FILES is a far more expensive miscalculation.

As I noted in my HP Technology Forum presentation, done correctly, it is possible to run for decades without having to take a disk offline. The low cost of extra blocks of index header bitmap (1 bit per header, so 4,096 files per block) makes an increase in maximum files an inexpensive safety valve.

Even pre-extending the index file (/HEADERS) is modest in cost, with today's disk sizes.

- Bob Gezelter, http://www.rlgsc.com
Andy Bustamante
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

To Debasis Bhar,

You want headers to be greater than XXXXX. This allows first for some growth, but also, in the case of file fragmentation, one file may use multiple headers. Depending on the application, set headers to 2 or 3 times XXXXX.

Maximum files must be greater than headers.

Andy
If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net

Re: Relationship of File Headers to Maximum Files Allowed

OK, so I've played around with this a bit. I assigned a new LUN, presented it to the system, and initialized it with the following command. Why doesn't the DFU report show the values for MAX_FILES and HEADERS that I used to initialize the disk? Will they just magically appear when the disk needs more or files are created, or have I stepped outside of some boundary?

$init/nohigh/struc=5/limit/headers=4000000/maximum_files=20000000 -
/directories=16000/cluster=3 $1$dga191: ecp_disk3


$ mount $1$dga191: ecp_disk3
%MOUNT-I-MOUNTED, ECP_DISK3 mounted on _$1$DGA191: (LYRA)

$ dfu report $1$dga191:/out=temp.log


***** Volume info for ODS5 volume $1$DGA191: (from HOME block) *****
Volume name : ECP_DISK3
Volume owner :
Volume set name :
Highwater mark. / Erase on del. : No / No
Cluster size : 3
Maximum # files : 16711679
Header count : 10
First header VBN : 4093
Free headers : 0

***** File Statistics (from INDEXF.SYS) *****
INDEXF.SYS fragments/ map_in_use : 4 /11 words ( 7% used)
Total files (ODS2 / ODS5) : 10 / 0
Empty files : 5
Files with allocation : 5
Files with extension headers : 0
Files marked for delete : 0
Directory files : 1
Contiguous files : 5
Total used/ allocated size : 38242 /4070643 blocks
Total headers/ fragments : 10 /5
Average fragments per file : 1.000
File fragmentation index : 0.000 (excellent)
Average size per fragment : 814128 blocks
Most fragmented file :
$1$DGA191:[000000]INDEXF.SYS;1 ( 4102/4004094 blocks; 4 fragments)

***** Free space statistics (from BITMAP.SYS) *****
Total blocks on disk : 419430400
Total free blocks : 415359759
Percentage free (rounded) : 99
Total free extents : 3
Largest free extent (blocks) : 209714163 at LBN: 1035
Average extent size (blocks) : 138453253
Free space fragmentation index : 0.000 (excellent)
$
Hein van den Heuvel
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

The minimum size for a file is a 1-block header + 1 extent = a cluster's worth of blocks.

The cluster size is limited by the storage bitmap: BITMAP.SYS can be at most 65535 blocks, and each 512-byte block holds 4096 bits, so at most about 268M clusters can be mapped.

See HELP INIT/CLUS. Here, that gives 400M/268M = 1.x, which rounds up to 2, and 3 was selected.

That would suggest a maximum of 400M/(1+3) = 100M files.

However... each file must have a unique FID (file ID), whose file number is stored as a 16-bit primary + 8-bit extension = 24 bits.
So that creates a maximum of 2**24 = 16777216 = 16M files.

$HELP INIT/MAX is not explicit about this, but does somewhat 'suggest' it with "If /LIMIT is specified and no value is set for /MAXIMUM_FILES, the default is 16711679 files".
So the default is the absolute MAX.

In the DFU report you see this as:

>> Maximum # files : 16711679

The absolute max, minus a 16 block (= 65536 files) fudge factor.

>> First header VBN : 4093
That's the first block after the IBMAP (Index Bit Map), with 4096 headers/block.

>> Total used/ allocated size : 38242 /4070643 blocks

There is the /headers=4000000

>> Most fragmented file :
$1$DGA191:[000000]INDEXF.SYS;1 ( 4102/4004094 blocks; 4 fragments)

And there is the /headers=4000000 again... by chance.

hth,
Hein.





Jon Pinkley
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

Charlie.

I see that Hein has already provided the answer, but here is what I already had written.

Hein's explanation of the fudge factor (16 blocks) makes sense.

The attachment gives some additional info using dump/header and diskblock.

--------------------------------------------------------------------------------

You have specified MAXIMUM_FILES .gt. the design limit, and initialize has used the largest value currently supported.

I created a 200GB virtual disk $1$dga9400 and reproduced your findings.

The INDEXF.SYS file has the space allocated, but the file headers won't be initialized until they are needed.

See the attachment for more info.

Jon

I would also consider a larger cluster size, unless most of your files are small. Even then, I would rather use 4 than 3. Just for comparison, NTFS uses 4K clusters by default, which is 8 blocks.
it depends

Re: Relationship of File Headers to Maximum Files Allowed

Thanks everyone for the help. Those last two replies were spot on for what I was looking for.

Thanks again,
Charlie

Re: Relationship of File Headers to Maximum Files Allowed

All questions answered and issue resolved.
Jon Pinkley
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

I realize this is closed, but just to add a bit about the cluster size.

With a 200GB disk (419430400 blocks), even if you pre-allocate all the headers for the maximum possible number of files, there will still be over 400,000,000 blocks free. That means that if your average (mean) file size is less than 24 blocks, you will run out of file headers before you run out of free space.
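The break-even figure above checks out under the numbers quoted in this thread (a quick Python check; the free-block count is the rough 400M figure from the paragraph above):

```python
# Break-even mean file size: below this, headers run out before free space.
free_blocks = 400_000_000   # rough free space after preallocating all headers
max_files = 16_711_679      # maximum possible files on the volume

print(free_blocks / max_files)  # ~23.9 blocks mean file size
```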

The point is that if you are using a small cluster size because you expect to be creating mostly small files, then it is better to use smaller disks. For example, four 50GB disks instead of one 200GB disk. If it has to appear to be one volume, you can use bound volume sets, which allow for up to 16711679 files on each member. Bound volume sets are supported, although in my opinion, they shouldn't be used unless there is no other option. Backup and restore becomes more complex, for instance. The original purpose of bound volume sets was to allow for files that were larger than a single drive could hold; now the primary reason they are used is to allow for more than 16M files on a volume.

From a disk fragmentation point of view, it is best to keep small files on different disks than large files, and to use a larger cluster size on the disks that you expect to hold large files.

The name of your volume suggests that it is being used with performance data collection.
I would expect those files to get much larger than 3 blocks, and if the same disk is being used for multiple concurrent collections, I would expect that the disk will get fragmented rather quickly. I haven't used ECP, so I don't know if it pre-extends its collection files, or if you can control that.

Bottom line: On a 200GB drive, I wouldn't consider a clustersize < 16 and would be more likely to use 32.
it depends

Re: Relationship of File Headers to Maximum Files Allowed

Thanks for the info on cluster size. ECP is also the name of our proprietary software running on VMS. This disk contains files produced by our system, and is not used for performance data collection. We use PSDC on a different disk to monitor performance.

This disk contains a directory for each Group. Each Group's directory contains a subdirectory for each line of business. Each LOB directory contains a subdirectory for each Julian date. The majority of files are less than 5 blocks, with many of those even at 1-3 blocks, so the disk contains many small files in many directories.

I have several years of capacity on this drive so I don't think it's necessary to bind volumes and I'm not sure what that would do to disk performance when using the EVA. Thanks again for the info.
Hein van den Heuvel
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

>> The majority of files are less than 5 blocks with many of those even at 1-3 blocks,


If that's the case, and now that we've learned that the max files is actually about 16 million, I would suggest recreating the volume with a size of 100GB or so.

Cheers,
Hein.