Operating System - OpenVMS

Re: Relationship of File Headers to Maximum Files Allowed

Jon Pinkley
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

RE: "Just curious. If I know that I will be having nearly XXXXX files on my disk, should I use /HEADER=XXXXX or /MAXIMUM_FILES=XXXXX?"

/header

You always want /maximum_files to be larger than needed. The "cost" in used disk space is 1 block for every 4096 files, so it isn't expensive (it is a bitmap representing possible file headers in INDEXF.SYS). If you initialized without /max big enough, the cost is reinitializing the disk.

/header preallocates the headers in as few extents as possible, leaving the option to extend easily if all preallocated headers are used up.

DFU (at least V3.2) can extend and defragment the INDEXF.SYS file (see DFU> HELP INDEXF), but it can't (currently) modify the maximum files value, so also make maximum files much larger than you expect will be needed.
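Jon's rule of thumb — 1 bit per possible file header, so 4096 headers tracked per 512-byte disk block — can be sketched as follows. This is just illustrative arithmetic, not a real VMS utility; the function name is made up:

```python
# Rough space cost of the INDEXF.SYS index bitmap for a given /MAXIMUM_FILES,
# per the rule of thumb above: 1 bit per possible header,
# 512 bytes * 8 bits = 4096 headers per block.
HEADERS_PER_BITMAP_BLOCK = 512 * 8  # 4096

def index_bitmap_blocks(maximum_files: int) -> int:
    """Blocks of index bitmap needed to track `maximum_files` headers."""
    return -(-maximum_files // HEADERS_PER_BITMAP_BLOCK)  # ceiling division

# Even a very generous /MAXIMUM_FILES is cheap in bitmap space:
print(index_bitmap_blocks(16_711_679))   # 4080 blocks (~2 MB) for the default max
print(index_bitmap_blocks(4_000_000))    # 977 blocks
```

So over-specifying /MAXIMUM_FILES costs a couple of megabytes at most, while under-specifying it costs a reinitialize.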
it depends
Robert Gezelter
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

Charlie,

As noted, /HEADER is the starting point, but the index file can grow. /MAXIMUM_FILES is a far more expensive mis-calculation.

As I noted in my HP Technical Forum presentation, done correctly, it is possible to run for decades without having to take a disk offline. The low cost of extra blocks of index header bitmap (1 bit per header, so 4,096 files per block) makes an increase in maximum files an inexpensive safety valve.

Even pre-extending the index file (/HEADERS) is modest in cost, with today's disk sizes.

- Bob Gezelter, http://www.rlgsc.com
Andy Bustamante
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

To Debasis Bhar,

You want headers to be greater than xxxxxx. This allows first for some growth; also, in the case of file fragmentation, one file may use multiple headers. Depending on the application, set headers to 2 or 3 times xxxxx.

Maximum files must be greater than headers.

Andy
If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
CharlieCalhoun
Advisor

Re: Relationship of File Headers to Maximum Files Allowed

OK, so I've played around with this a bit. I assigned a new LUN, presented it to the system, and initialized it with the following command. Why doesn't the DFU report show the MAX_FILES and HEADERS values that I used to initialize the disk? Will they just magically appear when the disk needs more or files are created, or have I stepped outside of some boundary?

$init/nohigh/struc=5/limit/headers=4000000/maximum_files=20000000 -
/directories=16000/cluster=3 $1$dga191: ecp_disk3


$ mount $1$dga191: ecp_disk3
%MOUNT-I-MOUNTED, ECP_DISK3 mounted on _$1$DGA191: (LYRA)

$ dfu report $1$dga191:/out=temp.log


***** Volume info for ODS5 volume $1$DGA191: (from HOME block) *****
Volume name : ECP_DISK3
Volume owner :
Volume set name :
Highwater mark. / Erase on del. : No / No
Cluster size : 3
Maximum # files : 16711679
Header count : 10
First header VBN : 4093
Free headers : 0

***** File Statistics (from INDEXF.SYS) *****
INDEXF.SYS fragments/ map_in_use : 4 /11 words ( 7% used)
Total files (ODS2 / ODS5) : 10 / 0
Empty files : 5
Files with allocation : 5
Files with extension headers : 0
Files marked for delete : 0
Directory files : 1
Contiguous files : 5
Total used/ allocated size : 38242 /4070643 blocks
Total headers/ fragments : 10 /5
Average fragments per file : 1.000
File fragmentation index : 0.000 (excellent)
Average size per fragment : 814128 blocks
Most fragmented file :
$1$DGA191:[000000]INDEXF.SYS;1 ( 4102/4004094 blocks; 4 fragments)

***** Free space statistics (from BITMAP.SYS) *****
Total blocks on disk : 419430400
Total free blocks : 415359759
Percentage free (rounded) : 99
Total free extents : 3
Largest free extent (blocks) : 209714163 at LBN: 1035
Average extent size (blocks) : 138453253
Free space fragmentation index : 0.000 (excellent)
$
Hein van den Heuvel
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

The minimum size for a file is a 1-block header + 1 extent = a cluster full of blocks.

The cluster count is limited by the storage bitmap's maximum of 65535 blocks; at 512 bytes = 4096 bits per block, that allows at most 268M clusters.

See HELP INIT/CLUS. Here, that gives 400M/268M = 1.x, which rounds up to 2, and 3 was selected.

That would suggest a maximum of 400M/(1+3) = 100M files.

However... each file must have a unique FID (file ID), which is stored as a 16-bit primary number + 8-bit extension = 24 bits.
So that creates a maximum of 2**24 = 256*256*256 = 4096*4096 = 16777216 = 16M files.

$HELP INIT/MAX is not explicit about this, but does somewhat 'suggest' it with "If /LIMIT is specified and no value is set for /MAXIMUM_FILES, the default is 16711679 files".
So the default is the absolute MAX.
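Hein's two limits can be sketched in a few lines. The numbers come from the thread (the 200GB disk from Charlie's DFU report); the exact composition of the "fudge factor" below the 2**24 FID ceiling is an assumption chosen to match the reported default:

```python
import math

total_blocks = 419_430_400            # the 200GB disk from the DFU report
MAX_BITMAP_BLOCKS = 65_535            # storage bitmap (BITMAP.SYS) size limit
CLUSTERS_PER_BITMAP_BLOCK = 512 * 8   # 4096 bits per 512-byte block

# Limit 1: the storage bitmap caps the cluster count at ~268M,
# which forces a minimum cluster size for a ~400M-block disk.
max_clusters = MAX_BITMAP_BLOCKS * CLUSTERS_PER_BITMAP_BLOCK   # 268,431,360
min_cluster = math.ceil(total_blocks / max_clusters)           # 2 (3 was chosen)

# Limit 2: a 24-bit FID (16-bit primary + 8-bit extension)
# caps the volume at 2**24 file IDs.
fid_limit = 2 ** 24                   # 16,777,216

# The 16-block (65,536-file) fudge factor, minus one more,
# reproduces the default reported by HELP INIT and DFU.
default_max = fid_limit - 16 * 4096 - 1   # 16,711,679

print(min_cluster, fid_limit, default_max)
```

Both ceilings agree with what the DFU report shows below.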

In the DFU report you see this as:

>> Maximum # files : 16711679

The absolute max, minus a 16-block (= 65,536 files) fudge factor.

>> First header VBN : 4093
That's the first block after the IBMAP (Index Bit Map), with 4096 headers/block.

>> Total used/ allocated size : 38242 /4070643 blocks

There is the /headers=4000000

>> Most fragmented file :
$1$DGA191:[000000]INDEXF.SYS;1 ( 4102/4004094 blocks; 4 fragments)

And there is the /headers=4000000 again... by chance.

hth,
Hein.





Jon Pinkley
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

Charlie,

I see that Hein has already provided the answer, but here is what I already had written.

Hein's explanation of the fudge factor (16 blocks) makes sense.

The attachment gives some additional info using dump/header and diskblock.

--------------------------------------------------------------------------------

You have specified MAXIMUM_FILES .gt. the design limit, and initialize has used the largest value currently supported.

I created a 200GB virtual disk $1$dga9400 and reproduced your findings.

The INDEXF.SYS file has the space allocated, but the file headers won't be initialized until they are needed.

See the attachment for more info.

Jon

I would also consider a larger cluster size, unless most of your files are small. Even then, I would rather use 4 than 3. Just for comparison, NTFS uses 4K clusters by default, which is 8 blocks.
it depends
CharlieCalhoun
Advisor

Re: Relationship of File Headers to Maximum Files Allowed

Thanks everyone for the help. Those last two replies were spot on for what I was looking for.

Thanks again,
Charlie
CharlieCalhoun
Advisor

Re: Relationship of File Headers to Maximum Files Allowed

All questions answered and issue resolved.
Jon Pinkley
Honored Contributor

Re: Relationship of File Headers to Maximum Files Allowed

I realize this is closed, but just to add a bit about the cluster size.

With a 200GB disk (419430400 blocks), even if you pre-allocate all the headers for the maximum possible number of files, there will still be over 400,000,000 blocks free. That means that if your average (mean) file size is less than 24 blocks, you will run out of file headers before you run out of free space.
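Jon's break-even arithmetic is worth making explicit (numbers from his paragraph, rounding mine):

```python
# With ~400M blocks free and at most 16,711,679 file headers on the volume,
# the average file must allocate at least this many blocks, or the disk
# runs out of headers before it runs out of space.
free_blocks = 400_000_000
max_files = 16_711_679

break_even = free_blocks / max_files
print(f"{break_even:.1f} blocks per file")   # 23.9 blocks per file
```

Hence Jon's "less than 24 blocks" threshold: below that average size, headers are the scarce resource, not space.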

The point is that if you are using a small cluster size because you expect to be creating mostly small files, then it is better to use smaller disks, for example four 50GB disks instead of one 200GB disk. If it has to appear to be one volume, you can use bound volume sets, which allow for up to 16711679 files on each member. Bound volume sets are supported, although in my opinion they shouldn't be used unless there is no other option; backup and restore become more complex, for instance. The original purpose of bound volume sets was to allow for files larger than a single drive could hold; now the primary reason they are used is to allow for more than 16M files on a volume.

From a disk fragmentation point of view, it is best to keep small files on different disks than large files, and to use a larger cluster size on the disks that you expect to hold large files.

The name of your volume suggests that it is being used with performance data collection.
I would expect those files to get much larger than 3 blocks, and if the same disk is being used for multiple concurrent collections, I would expect that the disk will get fragmented rather quickly. I haven't used ECP, so I don't know if it pre-extends its collection files, or if you can control that.

Bottom line: On a 200GB drive, I wouldn't consider a clustersize < 16 and would be more likely to use 32.
it depends
CharlieCalhoun
Advisor

Re: Relationship of File Headers to Maximum Files Allowed

Thanks for the info on cluster size. ECP is also the name of our proprietary software running on VMS. This disk contains files produced by our system, and is not used for performance data collection. We use PSDC on a different disk to monitor performance.

This disk contains a directory for each Group. Each Group's directory contains a subdirectory for each line of business. Each LOB directory contains a subdirectory for each Julian date. The majority of files are less than 5 blocks, with many at only 1-3 blocks, so the disk contains many small files in many directories.

I have several years of capacity on this drive so I don't think it's necessary to bind volumes and I'm not sure what that would do to disk performance when using the EVA. Thanks again for the info.