Operating System - OpenVMS


 
Aaron Lewis_1
Frequent Advisor

Block / Cluster calculations

I am trying to optimize a volume to store about 300,000 small RMS files. Currently they are on a 9GB drive (DKC100) with the default cluster size of 18. I found an 18GB drive (DKE100) and did an init/limit with the default cluster size of 8.

I have 5 test files, with sizes of 1, 2, 2, 3 and 5 blocks.

Pure: 13 blocks (1+2+2+3+5)
DKC100: 90 blocks -- 5 * 18 (1 cluster)
DKE100: 100 blocks -- 5 * 20 (2.5 clusters?)

Why do these files take up more space on a disk with a smaller cluster size? Also, I thought a file had to occupy a full cluster.
17 REPLIES
Hein van den Heuvel
Honored Contributor

Re: Block / Cluster calculations

You do not indicate a VMS version.
If space is that critical, maybe you should upgrade to a VMS version that allows more than 1,000,000 clusters on a disk?
Oh wait... you have that already. Never mind.

It's just a matter of math.
8 can only be divided evenly by 2 and 4.
18 can be divided evenly by 2, 3, 6 and 9.

I'm afraid I did not fully understand the description of the sizes you gave, but one has to assume that the 18 block cluster allows a few more files to fit 'closer'.

I would suggest you try a cluster size of 6 next.


Hint: for hands-on experiments with this, consider virtual devices like the LD (logical disk) driver and notably the MD (memory disk) driver.

hth,
Hein.
Aaron Lewis_1
Frequent Advisor

Re: Block / Cluster calculations

Hein, I am using 7.3-2. My understanding of RMS files is that each file allocates at least 1 cluster, regardless of the actual size of the file.

This appears to be true on DKC100, since the small files, with real sizes of 1, 2, 2, 3 and 5 blocks, each allocate a full cluster of 18 blocks, resulting in 90 blocks of storage being used.

However, when I copy these to DKE100, with a cluster size of 8, each file allocates 2.5 clusters, or 20 blocks, using up a total of 100 blocks of storage. Since each file is smaller than the defined cluster size, I would expect each to allocate only 1 cluster, for a total storage of 40 blocks. Also, I did not think that a file could allocate anything less than a full cluster.
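
The expectation above can be checked with a few lines of arithmetic. This is a minimal sketch in Python, assuming only the rule stated in the post (a non-empty file occupies a whole number of clusters); the function name `allocated_blocks` is made up for illustration and is not a VMS facility.

```python
def allocated_blocks(used_blocks: int, cluster_size: int) -> int:
    """Round used_blocks up to a whole number of clusters, in blocks."""
    clusters = (used_blocks + cluster_size - 1) // cluster_size  # ceiling division
    return clusters * cluster_size


# The five test files from the post, sizes in blocks.
file_sizes = [1, 2, 2, 3, 5]

for cluster_size in (18, 8):
    total = sum(allocated_blocks(size, cluster_size) for size in file_sizes)
    print(f"cluster size {cluster_size}: {total} blocks allocated")
```

Under that rule alone the totals come to 90 blocks for 18-block clusters and 40 blocks for 8-block clusters, so the 100 blocks observed on DKE100 must come from something other than the used file sizes.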
Volker Halle
Honored Contributor

Re: Block / Cluster calculations

Aaron,

the allocated file size (DIR/SIZE=ALLOC) is always a multiple of the cluster size; the used file size (DIR/SIZE=USED, which is the default for DIR/SIZE) can be anywhere between 0 and the allocated size.

Volker.
Volker Halle
Honored Contributor

Re: Block / Cluster calculations

Aaron,

is the cluster size really 8?

$ WRITE SYS$OUTPUT F$GETDVI("DKE100","CLUSTER")

Volker.
Hein van den Heuvel
Honored Contributor
Solution

Re: Block / Cluster calculations

> each file allocates at least 1 cluster, regardless of the actual size of the file.

Right. (although 0 clusters is also allowed :-)

>> However, when I copy these to DKE100, with a cluster size of 8,

Ah... copy will round up the file allocation, as it cannot be sure that the EOF is meaningful. It depends on high-water marking and such. Applications are allowed to put data beyond the 'official' EOF. It would be silly, but it is possible. Copy has to play it safe and copy everything. So each file which needed one 18-block cluster will now need three 8-block clusters.
That's why I suggested 6-block clusters, as they divide nicely into 18.
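
The arithmetic behind that suggestion, as a small Python sketch. It assumes, as described above, that the whole 18-block source allocation is carried over to the target disk; `clusters_needed` is a hypothetical helper, not a VMS routine.

```python
def clusters_needed(blocks: int, cluster_size: int) -> int:
    """Number of whole clusters required to hold `blocks` blocks."""
    return -(-blocks // cluster_size)  # ceiling division on integers


source_allocation = 18  # one full cluster on the 18-block-cluster disk

for target_cluster in (8, 6):
    n = clusters_needed(source_allocation, target_cluster)
    print(f"{target_cluster}-block clusters: {n} clusters = {n * target_cluster} blocks")
```

With 8-block clusters the carried-over allocation needs 3 clusters (24 blocks); with 6-block clusters it also needs 3 clusters, but that is exactly 18 blocks, so nothing is lost to rounding.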

>> each file allocates 2.5 clusters

As you suspect, this is wrong. Only whole clusters can be allocated. To understand how and where, I would recommend using DUMP/HEAD/BLOC=COUN=0 on each of the 5 files and looking at the MAPPING POINTERS for the explanation.

There are more details on round up / truncate / copy vs. backup that I have written about before. I can dig that up if critically needed.

It comes back to the old 'What problem are you really trying to solve'?
If that problem includes copying from a disk with 18-block clusters to one with 8-block clusters, then we have a problem, which you can work around with a cluster size of 6.
If the real application will simply create the new files on the new disk, then there is no problem. You'll see 8 blocks per file for every file of 8 blocks or less.


Hein.


Jeroen Hartgers_3
Frequent Advisor

Re: Block / Cluster calculations

With DIR/SIZE=ALL you see used/allocated. VMS 7.3-2 gives you the opportunity to reduce the cluster size when you initialize your disk. Don't forget to make a backup first: INIT is the same as format in MS-DOS.

If you make your cluster size smaller, you need a larger index file, and you must calculate the maximum number of files for the device by hand.

The VMS help tells you everything about INIT ($ HELP INIT). The easiest way to learn and test this is to use a spare disk and look at what happens.
Jeroen Hartgers_3
Frequent Advisor

Re: Block / Cluster calculations

With smaller clusters, fragmentation becomes possible and you need a bigger INDEXF.SYS.
Joseph Huber_1
Honored Contributor

Re: Block / Cluster calculations

I don't think it is VMS version dependent, but anyhow:
on my 7.3-1 system I do not see such behaviour. If I copy from a disk with a big cluster size to one with a smaller cluster size, it allocates only the target's cluster size.

BUT: copying with BACKUP takes the source allocation (and rounds it up to a multiple of the cluster size), which is probably exactly what you see on your disk.
http://www.mpp.mpg.de/~huber
Robert Gezelter
Honored Contributor

Re: Block / Cluster calculations

Aaron,

Often space allocation will have a "sweet spot" (or an egregiously bad spot) for a population of files.

Consider the following file sizes:

             Space used in blocks, by cluster size
File size      3      6      9     12     15
     1         3      6      9     12     15
    10        12     12     18     12     15
    20        21     24     27     24     30
    40        42     42     45     48     45

This is referred to as "breakage", since, as you noted, it is not possible to allocate "half a cluster".
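
A short Python sketch that recomputes such a breakage table from the rounding rule, using the file sizes and cluster sizes listed above. The function and variable names are illustrative only.

```python
def space_used(file_size: int, cluster_size: int) -> int:
    """Blocks occupied by a file once rounded up to whole clusters."""
    return -(-file_size // cluster_size) * cluster_size


file_sizes = [1, 10, 20, 40]
cluster_sizes = [3, 6, 9, 12, 15]

# One row per file size, one column per cluster size.
table = {
    size: [space_used(size, cluster) for cluster in cluster_sizes]
    for size in file_sizes
}

print("size  " + "".join(f"{c:>5}" for c in cluster_sizes))
for size, row in table.items():
    print(f"{size:>4}  " + "".join(f"{blocks:>5}" for blocks in row))
```

By this rule a 10-block file on 12-block clusters fits in a single cluster and so occupies 12 blocks. The difference between each entry and the file size is the breakage for that combination.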

Out of curiosity, what are these files? Sequential, Indexed, etc.?

- Bob Gezelter, http://www.rlgsc.com
Jan van den Ende
Honored Contributor

Re: Block / Cluster calculations

Aaron,

First, I ASSUME those are sequential files; for indexed or RA files the following does not apply (the reason will be obvious after you read the rest).

Anytime you move a file to a disk with a different cluster size, the _allocated_ size is rounded _UP_ to the lowest multiple of the cluster size of the receiving disk, UNLESS you tell the system this is a special case.

If you use BACKUP for copying, directly or via tape, then the /TRUNCATE qualifier will allocate just enough to fit the USED part of sequential files.

You did already do the copy (or was that just a test?). In that case,
$ SET FILE /TRUNC will have the same result.

hth

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Garry Fruth
Trusted Contributor

Re: Block / Cluster calculations

Jan has this one. In my travels, I have seen a disk where the files were tremendously over allocated (allocated = 10+ * used). My theory is that data was migrated from one disk, to another larger, to yet another larger disk, .... With each migration, the cluster size of the destination was different, and the files got just a little bit larger each time. A "set file/truncate [*...]*.*;*" with enough privileges recovered a lot of space.
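
That theory can be illustrated with a small Python sketch. The chain of cluster sizes below is purely hypothetical, and the model assumes every migration preserves the previous allocation and rounds it up to the new disk's cluster size.

```python
def round_up_to_cluster(blocks: int, cluster_size: int) -> int:
    """Round a block count up to the next multiple of cluster_size."""
    return -(-blocks // cluster_size) * cluster_size


def migrate(used_blocks: int, cluster_sizes: list[int]) -> list[int]:
    """Allocation after each migration, carrying the old allocation forward."""
    allocation = used_blocks
    history = []
    for cluster_size in cluster_sizes:
        allocation = round_up_to_cluster(allocation, cluster_size)
        history.append(allocation)
    return history


used = 5                # blocks actually used by the file
chain = [3, 4, 18, 8]   # hypothetical cluster sizes of successive disks

history = migrate(used, chain)
print("allocation after each move:", history)
print("after truncating on the last disk:", round_up_to_cluster(used, chain[-1]))
```

In this made-up chain the 5-block file ends up with 24 blocks allocated, while an allocation based on the used size would need only 8 blocks on the final disk, which is the space a truncate can recover.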
Andy Bustamante
Honored Contributor

Re: Block / Cluster calculations

Jan is correct; however, $ set file /truncate will start your new disk on the way to being fragmented. The best bet is to init the new disk and use backup with the /truncate switch to move the files to the new disk.
If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
Aaron Lewis_1
Frequent Advisor

Re: Block / Cluster calculations

Thanks to everybody that responded. It was a backup issue; doing a copy gave me the results I was looking for. I will use the /truncate on some of my older volumes, and should gain back a lot of space.
comarow
Trusted Contributor

Re: Block / Cluster calculations

Remember, small cluster sizes mean more fragmentation. Also, your defragmenters must work harder. And, I/O is slower.
Is the space saved worth it?

Bob
Willem Grooters
Honored Contributor

Re: Block / Cluster calculations

It can be worthwhile if you have a lot of files that are (much) smaller than one disk cluster or a multiple of it, AND your disk space is limited. The alternative is to buy new disks, and probably another cabinet to hold them, which means yet another set of hardware to be purchased, maintained and managed. Not all sites have that ability, and space efficiency can be crucial in those cases. I don't think the average user will notice the difference in accessing the data from disk.
Willem Grooters
OpenVMS Developer & System Manager
Willem Grooters
Honored Contributor

Re: Block / Cluster calculations

BTW: fragmentation can be prevented: create files with a sufficient pre-allocated size and a fair number of blocks for each extent, which can be changed later on. Tuning beforehand is advisable, and monitoring file status concerning updates and extents is practically mandatory, and not just on small disks. I have seen disks with a high cluster size (24) hold BADLY fragmented, BIG files, due to bad sizing parameters.
A small cluster size may indeed mean a bigger indexf.sys (including the bitmap), but that will cost less than the space you could win. Nor do you have to calculate the number of files: use the default unless you _know_ there will be more. OK, you may run out of slots in indexf.sys and have to rebuild the disk. This is a risk you have to calculate as well.
The bottom line is "planning": know what's going to be on that disk.
Willem Grooters
OpenVMS Developer & System Manager
Aaron Lewis_1
Frequent Advisor

Re: Block / Cluster calculations

In this case fragmentation should not be an issue. This is just an archive volume, once the files are re-located here, they will only be referenced occasionally, and then for read-only purposes.