Operating System - OpenVMS

Maximum Directory Entries

 
Robert Atkinson
Respected Contributor

Maximum Directory Entries

I know there's a maximum number of files you can put on a disk (SHOW DEV x /FULL), but does anyone know if there is a limit on the maximum number of entries in a directory, and how this is calculated?

Cheers, Rob.
Mobeen_1
Esteemed Contributor
Solution

Re: Maximum Directory Entries

Rob,
I don't think there is any such limit on a directory... I have never come across one in my stint with VMS.

The only limit is on files per disk, and that is the "Maximum files allowed" figure you see when you do a SH DEV/FULL.

regards
Mobeen
labadie_1
Honored Contributor

Re: Maximum Directory Entries

The max number of files on a disk is related to the /MAXIMUM_FILES and /HEADERS qualifiers of the $ INITIALIZE command.

I think that a directory, IF VMS has enough room to extend the directory file while keeping it contiguous, does not have a low limit.

See HELP/MESSAGE DIRALLOC.

Have you had this error message recently?
Ian Miller.
Honored Contributor

Re: Maximum Directory Entries

The max number of files in a directory is usually limited by finding enough contiguous space for the directory file. Large directories are inefficient to access.

Is there a problem you are trying to solve?
____________________
Purely Personal Opinion
Robert Atkinson
Respected Contributor

Re: Maximum Directory Entries

Nope.

But I do have an application, WebReports, that writes all its PDFs to a single directory.

We've only just started rolling it out and there are already 10,000 files in there!

Obviously, for performance reasons, I'm going to have to start breaking this up, but I didn't want to be caught out before getting round to it.

Rob.
Mobeen_1
Esteemed Contributor

Re: Maximum Directory Entries

Rob,
I am sure you will have heard of these too:

Version limits on files

/MAXIMUM_FILES on INITIALIZE

Since I don't see any such option when creating a directory (CREATE/DIRECTORY), I am assuming it may not exist.

I would love to see what other people on these forums think.

regards
Mobeen
Mobeen_1
Esteemed Contributor

Re: Maximum Directory Entries

Wim,
Thanks for that link :-)

It looks like the answers to this question so far are in and around what's discussed in the link you posted.

regards
Mobeen
Ian Miller.
Honored Contributor

Re: Maximum Directory Entries

Sounds uncomfortable :-)
You may find DFU (V3.0) DIRECTORY commands useful
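For example, DFU's DIRECTORY command can compress a directory file in place. A hypothetical invocation - check the exact image location and qualifiers against your DFU version's built-in HELP:

$ DFU :== $SYS$SYSTEM:DFU.EXE        ! foreign command; install path assumed
$ DFU DIRECTORY /COMPRESS DISK2:[REPORTS]WEBREPORTS.DIR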
____________________
Purely Personal Opinion
Wim Van den Wyngaert
Honored Contributor

Re: Maximum Directory Entries

I started a test.

But I noticed that the create fails when someone has the directory file open in EVE. An old sickness of VMS ... that's another thread ...

I'll keep you informed

Wim
Ian Miller.
Honored Contributor

Re: Maximum Directory Entries

"But I noticed that create fails when someone does a EVE of the directory. Old sickness of VMS ... other threat ..."
probable read lock on the directory file preventing update of the directory to add new entry for created file.
____________________
Purely Personal Opinion
Hein van den Heuvel
Honored Contributor

Re: Maximum Directory Entries

There is no hard limit.
There TENDS to be a performance degradation.
In older VMS versions there was a performance knee at a directory size of 128 blocks when doing wildcard lookups.
If filenames are generated 'in order' with ever-increasing names, there is very little overhead indeed.
If the names are random, then new files will frequently cause the need to 'shuffle' up a good chunk of the directory to make room for the new name.
Keep names short if you can!
bad: [report_directory]adobe_report_for_july_05_2004_00546.pdf
good: [report_directory]2004070500546.pdf

Divide and conquer:
better:[200407_reports]0500546.pdf
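A minimal DCL sketch of that divide-and-conquer idea (the disk, directory, and logical names here are all hypothetical): create the month's directory on first use and point a logical name at it:

$ ! Build a yyyymm_REPORTS directory name from the current date
$ YEAR  = F$CVTIME(,,"YEAR")                  ! e.g. "2004"
$ MONTH = F$CVTIME(,,"MONTH")                 ! e.g. "07"
$ DIRNAME = YEAR + MONTH + "_REPORTS"
$ IF F$SEARCH("DISK2:[000000]''DIRNAME'.DIR") .EQS. "" THEN -
      CREATE/DIRECTORY DISK2:['DIRNAME']
$ DEFINE REPORT_DIR DISK2:['DIRNAME']         ! application writes to REPORT_DIR: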

hth,
Hein.
Robert Atkinson
Respected Contributor

Re: Maximum Directory Entries

That's odd... I got this from Google earlier, which implies random names are better:

With normal random file naming behavior, directory shuffles are infrequent. However, non-random behavior can cause problems. A classic case is a DELETE *.*;* on a big directory. The file wildcarding of course returns the files front to back, and so files are deleted from the front of the directory - precisely the worst order. So the time to delete all the files in a big directory goes with the square of the number of files. If the directory is really huge, it's well worth building a command procedure to delete the files back to front.
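A rough DCL sketch of such a back-to-front delete (the directory name is illustrative). Since directory entries are stored alphabetically, deleting in descending name order works from the back of the directory file:

$ ! List the files, sort the specs in descending order, delete in that order
$ DIRECTORY/NOHEADING/NOTRAILING/COLUMNS=1/OUTPUT=FILES.TMP [REPORTS]*.PDF;*
$ ! Make the key SIZE at least as long as the longest file specification
$ SORT/KEY=(POSITION:1,SIZE:100,DESCENDING) FILES.TMP FILES.REV
$ OPEN/READ LIST FILES.REV
$ DEL_LOOP:
$   READ/END_OF_FILE=DEL_DONE LIST FILE
$   DELETE 'FILE'
$   GOTO DEL_LOOP
$ DEL_DONE:
$ CLOSE LIST
$ DELETE FILES.TMP;*,FILES.REV;*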
Willem Grooters
Honored Contributor

Re: Maximum Directory Entries

Rob,

Ian is right:


max number of files in a directory is usually limited by finding enough contiguous space for the directory file.


You can run into "funny" behaviour if there is too little contiguous space to hold an expanded directory file, especially if free space is scattered. Browse down the forum for some examples...

To keep your file collections usable, I guess you wouldn't want tens of thousands of files in one directory - let alone the performance penalty it _may_ introduce.

Willem
Willem Grooters
OpenVMS Developer & System Manager
Kris Clippeleyr
Honored Contributor

Re: Maximum Directory Entries

Robert,


So the time to delete all the files in a big directory goes with the square of the number of files. If the directory is really huge, it's well worth building a command procedure to delete the files back to front.


For deleting "large" directories, I find that DFU comes in handy.
OTOH, to prevent an application that writes its files to one directory from "over-filling" it, you can set up a search list of logical names pointing to different physical directories, and rotate the definition at certain intervals. Writing will happen to the first directory in the list; reading will be tried on all.
E.g.:

$ DEFINE LOG DISK2:[LOG1],DISK2:[LOG2],DISK2:[LOG3]
$ CREATE LOG:T.T
ctrl/Z
$ DIREC LOG

will show T.T in DISK2:[LOG1]

hereafter

$ DEFINE LOG DISK2:[LOG2],DISK2:[LOG3],DISK2:[LOG1]
$ CREATE LOG:X.X
ctrl/Z
$ DIREC LOG

will show T.T in DISK2:[LOG1], and X.X in DISK2:[LOG2]

We used this trick once on an application that created a huge amount of uniquely named logfiles.

Greetz,

Kris

I'm gonna hit the highway like a battering ram on a silver-black phantom bike...
Wim Van den Wyngaert
Honored Contributor

Re: Maximum Directory Entries

First test results on a 2 node 4100 cluster using 7.3.

A DCL loop creates files with names starting with 0-9 plus a fixed name. I started 2 jobs on each node to fill the directory.
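The loop looked roughly like this (a reconstruction; the logical name, file name, and count are illustrative). In a command procedure, a $ CREATE with no data lines after it creates an empty file:

$ I = 0
$ LOOP:
$   PREFIX = F$EXTRACT(I - (I/10)*10, 1, "0123456789")   ! rotating 0-9 prefix
$   CREATE TEST_DIR:'PREFIX'FIXED_NAME_'I'.TMP
$   I = I + 1
$   IF I .LT. 100000 THEN GOTO LOOP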

After almost 5 hours:
First hour: 4,320 files created
Second hour: 3,490 files created
Third hour: 2,732 files created
So: big directories are slow for file creation.

The directory file is now 7,000 blocks and contains 22,000 files.

The test continues on Wednesday.
Wim
Hein van den Heuvel
Honored Contributor

Re: Maximum Directory Entries


In reply to my reply Robert wrote:

" That's odd...got this from Google earlier, which implies Random names are better :-

With normal random file naming behavior, directory shuffles are infrequent."

Define 'normal'!?

I was assuming (I may well be wrong) that the application in question just keeps on adding files. Deleting was not mentioned (yet). If you just keep on adding, then adding at the end is best. This will cause the directory to grow, at least a cluster at a time and maybe more.

The 'normal' behaviour referred to is probably files coming and going over time, with maybe a few more coming than going. Then linear naming is horrible when doing the deletes. The adds will be fine, but the deletes will always be from the beginning (the first directory block), and every directory block emptied will cause a shuffle down.

A random delete is unlikely to empty a block, so it will not cause a shuffle. Hopefully it will create enough room for a future random add of a different file targeted at the same block.


Btw... for totally optimal directory packing, I forgot to mention dropping 'obvious' file types, like '.pdf' in a report directory. Just give the exception files an extension. And that extension can be ".F" for Fortran source and ".O" for objects... if performance is more important than clarity and ease of use (unlikely!).

Ramblings...

For computer-named/used files, where humans are not reading any meaning into the file name (more or less like VMSmail's external files), the optimal packing is of course a base-36 (or worse!) character set: 0-9A-Z. Using that, 2 characters can identify 1,000+ files, 3 are good for almost 50,000, and 4 can address more than a million, but you have to add 14 bytes of directory data per entry.
10,000 files would then take just 350 blocks of .DIR file. Crazy but possible.
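A sketch of such a base-36 encoding in DCL (purely illustrative):

$ DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
$ SEQ = 50000                                  ! sequence number to encode
$ NAME = ""
$ ENCODE:
$   NAME = F$EXTRACT(SEQ - (SEQ/36)*36, 1, DIGITS) + NAME
$   SEQ = SEQ / 36
$   IF SEQ .GT. 0 THEN GOTO ENCODE
$ WRITE SYS$OUTPUT "Encoded name: ", NAME      ! 50000 -> "12KW"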

Hein.
Andy Bustamante
Honored Contributor

Re: Maximum Directory Entries

As someone already mentioned, there used to be a performance hit at a directory size of 128 blocks; it was lifted in VMS V7.2.

You haven't mentioned deletion or long-term storage. Are these files out of date after being displayed, or do you keep them long term? For short-term use, I use a six-directory search-list logical for our web server's reports and a batch job that redefines it at 10-minute intervals. Reports are available for at least 50 minutes, with the sixth directory having its contents deleted once an hour. For very busy sites, distribute the directories among multiple disks.
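A sketch of such a rotating definition (the directory and procedure names are hypothetical; the job redefines the search list and resubmits itself every 10 minutes):

$ ! Pick the list head from the current 10-minute slot (0..5 -> 1..6)
$ HEAD = (F$INTEGER(F$CVTIME(,,"MINUTE")) / 10) + 1
$ LIST = ""
$ I = 0
$ BUILD:
$   D = 1 + ((HEAD - 1 + I) - ((HEAD - 1 + I)/6)*6)     ! wraps around 1..6
$   LIST = LIST + ",DISK2:[REPORTS''D']"
$   I = I + 1
$   IF I .LT. 6 THEN GOTO BUILD
$ LIST = LIST - ","                  ! strip the leading comma
$ DEFINE/SYSTEM REPORTS 'LIST'
$ ! ... delete the contents of the directory now at the tail, then:
$ SUBMIT/AFTER="+0:10:00" ROTATE_REPORTS.COM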

If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
Robert Atkinson
Respected Contributor

Re: Maximum Directory Entries

The reports will be kept for at least 6 months, longer if I can find the disk space.

Rob.
Wim Van den Wyngaert
Honored Contributor

Re: Maximum Directory Entries

Test results continued, but with only 2 jobs instead of 4.

There are now about 35,000 files in my directory, which itself is now 11,000 blocks.

No problems yet, but:

1) doing DIR/SIZ TEST.DIR takes between 2 and 10 seconds (why?)

2) doing DIR/TOT/SIN=15:30 takes about 10 minutes (acceptable on an old machine)

Wim
Wim Van den Wyngaert
Honored Contributor

Re: Maximum Directory Entries

Test results.

60,000 files in a directory file of 19,000 blocks.

File creation with names starting with 'abc' takes 0.1 seconds, but every 5 files it takes 10 seconds. And the directory file is locked during that interval.

Jobs doing the DCL create in a loop page-fault at a rate of 75 per second (initial working set of 5,800 pages, about 1,700 used).

Files starting with 'z' give the same results.

A nice feature for a real-time application ...


Wim
Willem Grooters
Honored Contributor

Re: Maximum Directory Entries

Wim,

I think your observations can easily be explained:


1) doing DIR/SIZ TEST.DIR takes between 2 and 10 seconds

2) doing DIR/TOT/SIN=15:30 takes about 10 minutes (acceptable on an old machine)


I guess this has to do with caching.
The directory file may need to be reread several times. Don't forget that access to INDEXF.SYS is involved as well - for getting the size.


File creation with names starting with 'abc' takes 0.1 seconds, but every 5 files it takes 10 seconds. And the directory file is locked during that interval.

Jobs doing the DCL create in a loop page-fault at a rate of 75 per second (initial working set of 5,800 pages, about 1,700 used).


Quite possibly every 5th file needs an extension of the directory file - and new (contiguous!) space has to be searched for and allocated for the directory. And again, this requires quite some extra I/O.

It makes sense to do some redesign if timing gets critical.
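If a redesign is not (yet) an option, one mitigation is to preallocate the directory file up front so it does not have to be extended on the fly (the name and size here are illustrative; /ALLOCATION takes a size in blocks):

$ CREATE/DIRECTORY/ALLOCATION=5000 DISK2:[REPORTS]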

Willem
Willem Grooters
OpenVMS Developer & System Manager
Wim Van den Wyngaert
Honored Contributor

Re: Maximum Directory Entries

Willem,

Now I have 1 job active doing the creates, and the directory file is about 20,000 blocks. The file names start with Z and are about 60 characters long.

DIR/SIZ=ALL takes 15 seconds when the directory is being expanded (by 108 blocks). Those 108 blocks don't explain the 15-second delay; there must be some reshuffling that takes time.

File creation time is 0.5 - 2 seconds (why?), thus longer than during the previous trial. The 15-second delays now only happen every few hundred files!

The page faults have increased to 700 per second !!!

Wim
Wim Van den Wyngaert
Honored Contributor

Re: Maximum Directory Entries

Size: 24,000 blocks, 80,000 files.
Blocking time: 20 seconds when allocating 108 blocks.
Wim
Wim Van den Wyngaert
Honored Contributor

Re: Maximum Directory Entries

Size: 33,000 blocks, 113,000 files.

Page faults are down to 200/sec.

Started a create with names starting with 'mmm':
20 seconds of blockage every 5 files.

Started a create with names starting with 'a123':
22 seconds of blockage every 5 files. Sometimes a file creation takes 0.1 seconds, sometimes 3 seconds.
Wim