<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Overheads of large .DIR files? in Operating System - OpenVMS</title>
    <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234490#M39907</link>
    <description>Hi John,&lt;BR /&gt;&lt;BR /&gt;From Mark's comment in the above link, note the following:&lt;BR /&gt;&lt;BR /&gt;&amp;gt;&amp;gt; Note that the create/dir command allows you&lt;BR /&gt;&amp;gt;&amp;gt; to allocate the space up front so you don't&lt;BR /&gt;&amp;gt;&amp;gt; have to endure the frequent extends for&lt;BR /&gt;&amp;gt;&amp;gt; directories expected to be very large.&lt;BR /&gt;&lt;BR /&gt;Apart from the Expand and Compress shuffle operations on the directory file, this talks about moving the directory file to some other location on the disk.&lt;BR /&gt;When adding entries to a directory file, if there is no contiguous space (i.e. no contiguous space after the directory file's current location on the disk), then the directory is moved to some other location on the disk where contiguous space is available. By pre-allocating the directory file on the disk with the create command, this can be avoided. You may want to try this out as well.&lt;BR /&gt;&lt;BR /&gt;&amp;gt;&amp;gt; Are you sure about the lock ONLY being on the .DIR file during creates&lt;BR /&gt;&amp;gt;&amp;gt; and deletes? What about the changes to INDEXF.SYS and BITMAP.SYS?&lt;BR /&gt;&amp;gt;&amp;gt; Aren't there two locks, one for the directory and one for the volume?&lt;BR /&gt;&lt;BR /&gt;You are correct. When creating a file, many operations are involved besides creating a directory entry for that file; space needs to be grabbed from the disk, and so on. For these, different sets of locks are used, and some of them would block activity on the entire volume.&lt;BR /&gt;&lt;BR /&gt;I was referring in particular to the operation of adding/removing entries in the directory file, for which only the serialization lock on the directory file is taken. This is where there is a lot of scope for optimization and for reducing the time taken.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&amp;gt;&amp;gt; The other point about daily directories is that each is independent&lt;BR /&gt;&amp;gt;&amp;gt; and that a very large number of files created on one day won't continue&lt;BR /&gt;&amp;gt;&amp;gt; being a performance problem for the other 6 days.&lt;BR /&gt;Yes. Also, as I said before, with smaller directories the XQP would not have to do a lot of Expand/Compress shuffle activity on the directory.&lt;BR /&gt;This would be a performance benefit.&lt;BR /&gt;&lt;BR /&gt;Regards,&lt;BR /&gt;Murali</description>
    <pubDate>Mon, 12 Apr 2010 02:39:30 GMT</pubDate>
    <dc:creator>P Muralidhar Kini</dc:creator>
    <dc:date>2010-04-12T02:39:30Z</dc:date>
    <item>
      <title>Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234483#M39900</link>
      <description>We have a situation where we create maybe 18,000 log files per day (the actual number varies depending on what's run) and these log files are held for 7 days.  The file names contain 28 characters, made up of an unchanging 11-character standard prefix, a 4-digit number (1 to 1999) that might cycle several times in 24 hours, an underscore character, the PID and the file type '.LOG'.  (Each filename is unique because these are log files for detached "slave" processes and yes, we have lots of those.) Any node in the cluster (2 to 4 machines) might create these log files. &lt;BR /&gt;&lt;BR /&gt;To date all these files have gone into one directory, making for some .DIR files that exceed 5000 blocks.  File creation is spread across the day and deletion, after 7 days, takes place around 4:00am when the system is quiet.&lt;BR /&gt;&lt;BR /&gt;I think we should move to a new log file directory each day (via a logical that rolls over at midnight, driven within the image by a timer AST) and that we should use shorter file names (just "S_&lt;PID&gt;.LOG"). I believe that the overheads associated with file creation and deletion will be reduced if we go this way.&lt;BR /&gt;&lt;BR /&gt;My knowledge is based on the old performance "knee" at 127 blocks, after which point performance went downhill.  I understand performance was improved in VMS V7.3 but I can't find a good description of exactly what altered and what the implications are for inserting and removing file names.  In particular I can't find information about when disk I/Os are required (cf. cache lookups), or about the splitting of blocks in .DIR files when inserting new file names.  My understanding is that a lock is taken out on the entire volume when the .DIR file is being modified, so I would like to minimise this lock time as well as the disk I/O time.&lt;BR /&gt;&lt;BR /&gt;In short ...&lt;BR /&gt;Q1 - what performance-related practices do you recommend for large numbers of files being created in a single directory, and why?&lt;BR /&gt;Q2 - exactly what changes were made in V7.3 and what, if any, performance "gotchas" still exist?&lt;BR /&gt;&lt;BR /&gt;(And if you say that no changes are necessary to our current schema then please explain why.)</description>
      <pubDate>Mon, 12 Apr 2010 01:05:23 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234483#M39900</guid>
      <dc:creator>John McL</dc:creator>
      <dc:date>2010-04-12T01:05:23Z</dc:date>
    </item>
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234484#M39901</link>
      <description>Q1: don't do what is being done here?&lt;BR /&gt;Q2: big directories are n^2 data structures.  And stuff can fall out of cache.&lt;BR /&gt;&lt;BR /&gt;As for a more detailed answer, read this first:&lt;BR /&gt;&lt;BR /&gt;&lt;A href="http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=625667" target="_blank"&gt;http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=625667&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Hein and Mark Hopkins in particular describe this stuff in some detail over there.&lt;BR /&gt;&lt;BR /&gt;You could choose to fix this design flaw, or you could choose to reorganize the allocations (incremental additions are cheapest at the end of the directory, so use log file names that sort alphabetically ascending), or you could investigate what can be done to avoid creating the log files at all; addressing the environment holistically.&lt;BR /&gt;&lt;BR /&gt;As for removing the files, reverse delete (DCL or DFU or a program) will help with that aspect of performance.&lt;BR /&gt;&lt;BR /&gt;If this is the lowest of the low-hanging fruit for local performance, go for it.&lt;BR /&gt;&lt;BR /&gt;Updating the application design might help, too; I've seen cases where revisiting or rethinking that can be beneficial; where the design is the "dead elephant in the room."&lt;BR /&gt;</description>
      <pubDate>Mon, 12 Apr 2010 01:42:00 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234484#M39901</guid>
      <dc:creator>Hoff</dc:creator>
      <dc:date>2010-04-12T01:42:00Z</dc:date>
    </item>
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234485#M39902</link>
      <description>Hi John,&lt;BR /&gt;&lt;BR /&gt;A directory file is a logically contiguous file. When a lot of files are added to a directory, the directory file size increases, but for this to happen the system must have enough contiguous space.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&amp;gt;&amp;gt; My understanding is that a lock is taken out on the entire volume when&lt;BR /&gt;&amp;gt;&amp;gt; the .DIR file is being modified, so I would like to minimize this lock&lt;BR /&gt;&amp;gt;&amp;gt; time as well as the disk I/O time.&lt;BR /&gt;&lt;BR /&gt;A directory is a file whose contents are filenames along with some attributes such as Version Limit, Version, FID, SYMLINK information and so on. When creating/deleting files in a directory, the directory file needs to be modified to reflect the operation. For this, a serialization lock is taken out on the directory. This blocks only those XQP threads that want to access the same directory; other activity on the volume can proceed, i.e. you can create/delete files in some other directory on the same disk at the same time.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&amp;gt;&amp;gt; I think we should move to a new log file directory each day&lt;BR /&gt;Yes, this sounds like a good idea. With this setup, the directory for each day would be smaller.&lt;BR /&gt;&lt;BR /&gt;With a single large directory: for the first day the directory gets filled up (let's say filenames a.txt to z.txt). For the subsequent days, when files are created (say c.txt), then based on whether the block where c.txt has to be inserted is full or not, the XQP would have to do an Expand Shuffle (move d.txt to z.txt one block below) to insert new entries in the directory file.&lt;BR /&gt;&lt;BR /&gt;When files need to be deleted (say d.txt), then based on whether the block having d.txt is full or not, the XQP would have to do a Compress Shuffle (move e.txt to z.txt one block above) to delete entries in the directory file.&lt;BR /&gt;&lt;BR /&gt;If we have a day-wise directory, then the number of Expand/Compress Shuffles that the XQP does can be minimized; i.e. as every day has its own directory, the day-wise creates/deletes act only on the corresponding directory with a small number of entries (as compared to a single directory with a large number of entries).&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&amp;gt;&amp;gt; Any node in the cluster (2 to 4 machines) might create these log files.&lt;BR /&gt;Is the distributed lock manager involved here?&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Will get back to you on the performance-related practices.&lt;BR /&gt;&lt;BR /&gt;Regards,&lt;BR /&gt;Murali</description>
      <pubDate>Mon, 12 Apr 2010 01:45:59 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234485#M39902</guid>
      <dc:creator>P Muralidhar Kini</dc:creator>
      <dc:date>2010-04-12T01:45:59Z</dc:date>
    </item>
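Murali's Expand/Compress Shuffle description can be made concrete with a toy model. The sketch below is a hypothetical simplification in Python (the thread's own DCL/C snippets aren't easily runnable here), not the real XQP algorithm: entries are kept sorted across fixed-capacity blocks, and splitting a full block shifts every later block down. The per-block capacity of 8 names and the cost accounting are made up; the point is only to show why seven small daily directories shuffle far fewer blocks than one big weekly one.

```python
# Toy model (NOT the real XQP algorithm): a directory keeps entries sorted
# across fixed-capacity blocks, and splitting a full block means shuffling
# every later block one slot down.  Capacities and costs are invented.
import bisect
import random

def shuffle_cost(n_files, per_block=8, seed=1):
    """Insert n_files random names; return total count of later blocks moved."""
    rng = random.Random(seed)
    blocks = [[]]                 # each block holds up to per_block sorted names
    moved = 0
    for _ in range(n_files):
        name = "%08x.log" % rng.getrandbits(32)
        # find the block whose name range should hold this entry
        i = len(blocks) - 1
        while i > 0 and blocks[i][0] > name:
            i -= 1
        bisect.insort(blocks[i], name)
        if len(blocks[i]) > per_block:        # block full: expand shuffle
            half = blocks[i][per_block // 2:]
            del blocks[i][per_block // 2:]
            blocks.insert(i + 1, half)
            moved += len(blocks) - (i + 1)    # all later blocks move down
    return moved

one_big = shuffle_cost(7 * 1000)
per_day = sum(shuffle_cost(1000, seed=d) for d in range(7))
print(one_big, per_day)   # the single big directory shuffles far more blocks
```

In this model the total shuffle work grows roughly quadratically with directory size, so day-wise directories win by about the number of days, matching the advice in the post above.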
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234486#M39903</link>
      <description>Having a directory per day will be more efficient; I doubt you will find anyone who will say otherwise.&lt;BR /&gt;&lt;BR /&gt;Shortening the file names will allow more file names to be stored in each directory block, so the 18,000 files will require fewer blocks in the directory file.&lt;BR /&gt;&lt;BR /&gt;Just curious, will having the PID in the log file name be very useful?  How will a user know which PID was theirs?  If the files have a constant name, the directory will be much more compact, since each version need only store the version number and file ID of the specific file (when it is in the same directory block).  For example, the following command file will create 1000 versions of "THIS_IS_A_LONG_FILE_NAME_THAT_WILL.HAVE_MANY_VERSIONS" in a 10-block directory.&lt;BR /&gt;&lt;BR /&gt;$ cre/dir/all=10 [.itrctest]&lt;BR /&gt;$ cnt=1&lt;BR /&gt;$top:&lt;BR /&gt;$ cre [.itrctest]this_is_a_long_file_name_that_will.have_many_versions;&lt;BR /&gt;$ cnt = cnt + 1&lt;BR /&gt;$ if cnt .le. 1000 then goto top&lt;BR /&gt;$end:&lt;BR /&gt;$ exit&lt;BR /&gt;&lt;BR /&gt;Another advantage of one directory per day is that you can easily delete all the files in the directory at the end of the 7-day waiting period.  One of the most efficient ways to do that would be with DFU.&lt;BR /&gt;&lt;BR /&gt;For example, to delete device:[20100404...]*.*;*&lt;BR /&gt;&lt;BR /&gt;$ dfu delete/directory/tree/nolog device:[000000]20100404.dir&lt;BR /&gt;&lt;BR /&gt;Note that if you use DFU this way, it deletes the directory file itself as well; I am not aware of a way to delete the files but keep the directory using DFU.  So I would recommend using&lt;BR /&gt;&lt;BR /&gt;$ create/directory/allocation=5000&lt;BR /&gt;&lt;BR /&gt;when creating the empty directories, to avoid constant directory expansions (which will probably involve recopying the current contents to a new location on disk each time, since VMS directory files must be contiguous).&lt;BR /&gt;&lt;BR /&gt;You may want to create a search list logical name that includes all 7 days' worth of directories, so it will be possible to find a log file from a previous day using a simple directory command.&lt;BR /&gt;&lt;BR /&gt;For example&lt;BR /&gt;&lt;BR /&gt;$ define[/system] applog device:[20100411],device:[20100410],device:[20100409],device:[20100408],device:[20100407],device:[20100406],device:[20100405]&lt;BR /&gt;&lt;BR /&gt;$ directory applog:mylog.log&lt;BR /&gt;&lt;BR /&gt;Jon&lt;BR /&gt;</description>
      <pubDate>Mon, 12 Apr 2010 01:54:55 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234486#M39903</guid>
      <dc:creator>Jon Pinkley</dc:creator>
      <dc:date>2010-04-12T01:54:55Z</dc:date>
    </item>
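Jon's point that shorter names pack more entries into each directory block can be roughly quantified. The sketch below assumes the classic Files-11 ODS-2 directory record layout (2-byte record length, 2-byte version limit, 1-byte flags, 1-byte name count, the name padded to an even length, then 8 bytes per version stored under that name, with an end-of-block marker word); this layout is an assumption inferred from the thread, so treat the numbers as ballpark figures, not gospel.

```python
# Ballpark Files-11 ODS-2 directory packing (assumed record layout, see the
# note above; results are estimates, not authoritative).
def entry_bytes(name_len, versions=1):
    padded = name_len + (name_len % 2)    # names are padded to a word boundary
    return 2 + 2 + 1 + 1 + padded + 8 * versions

def entries_per_block(name_len, block=512):
    usable = block - 2                    # reserve the end-of-block marker word
    return usable // entry_bytes(name_len)

# 28-char unique names vs the proposed 14-char "S_" + pid + ".LOG" names:
print(entries_per_block(28), entries_per_block(14))
# many versions of one constant name share a single name field per block:
print(entry_bytes(28, versions=10))
```

Under these assumptions, halving the name length takes a block from about 12 entries to about 18, and a constant name with many versions is the most compact of all, which is why Jon's 1000-version example stays so small.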
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234487#M39904</link>
      <description>Hoff,&lt;BR /&gt;Mark's &amp;amp; Hein's comments were interesting but were largely focused on file deletes.  This isn't a big issue because we're using ZIP with the "remove" (or is it "move"?) option.  (For those not familiar, it's like BACKUP/DEL.)  Moreover, this runs in a batch job at about 4:00am, when performance isn't as important as from 8am to 9pm or thereabouts.&lt;BR /&gt;&lt;BR /&gt;We are working on reducing the number of log files because some are for "slave" processes that ultimately did nothing because of what other slaves did, but this situation seems unpredictable and depends on the job mix.</description>
      <pubDate>Mon, 12 Apr 2010 02:22:19 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234487#M39904</guid>
      <dc:creator>John McL</dc:creator>
      <dc:date>2010-04-12T02:22:19Z</dc:date>
    </item>
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234488#M39905</link>
      <description>Muralidhar,&lt;BR /&gt;I'd forgotten about the contiguous requirement, so thanks for that reminder.&lt;BR /&gt;&lt;BR /&gt;Are you sure about the lock ONLY being on the .DIR file during creates and deletes?  What about the changes to INDEXF.SYS and BITMAP.SYS?  Aren't there two locks, one for the directory and one for the volume?&lt;BR /&gt;&lt;BR /&gt;The other point about daily directories is that each is independent and that a very large number of files created on one day won't continue being a performance problem for the other 6 days.</description>
      <pubDate>Mon, 12 Apr 2010 02:27:30 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234488#M39905</guid>
      <dc:creator>John McL</dc:creator>
      <dc:date>2010-04-12T02:27:30Z</dc:date>
    </item>
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234489#M39906</link>
      <description>Jon,&lt;BR /&gt;&lt;BR /&gt;I agree with you that multiple versions would be better (smaller .DIR and fast access) and I've confirmed that using the cycling 4-digit number we currently use in the filename, however ...&lt;BR /&gt;&lt;BR /&gt;The slave processes are named according to their PID and the PID is displayed by certain management utilities, so short of some convoluted translation system and telling everyone how to use it, I'm pretty much stuck with using the PID.&lt;BR /&gt;&lt;BR /&gt;The saving grace is that the log files will always(?) be added to the end of the list for the machine on which the slave process is running (i.e. on a 2-node evenly balanced system new filenames will be entered at a point 50% down the file and at the end).&lt;BR /&gt;&lt;BR /&gt;Sure, filenames in the form &lt;HHMMSS&gt;_&lt;PID&gt;.LOG would guarantee end-of-file additions, but now we've taken a 14-character filename out to 21 characters (the original size was 28 chars) and the .DIR file size has grown. Is that a better tradeoff?  I'm not sure.</description>
      <pubDate>Mon, 12 Apr 2010 02:39:24 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234489#M39906</guid>
      <dc:creator>John McL</dc:creator>
      <dc:date>2010-04-12T02:39:24Z</dc:date>
    </item>
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234490#M39907</link>
      <description>Hi John,&lt;BR /&gt;&lt;BR /&gt;From Mark's comment in the above link, note the following:&lt;BR /&gt;&lt;BR /&gt;&amp;gt;&amp;gt; Note that the create/dir command allows you&lt;BR /&gt;&amp;gt;&amp;gt; to allocate the space up front so you don't&lt;BR /&gt;&amp;gt;&amp;gt; have to endure the frequent extends for&lt;BR /&gt;&amp;gt;&amp;gt; directories expected to be very large.&lt;BR /&gt;&lt;BR /&gt;Apart from the Expand and Compress shuffle operations on the directory file, this talks about moving the directory file to some other location on the disk.&lt;BR /&gt;When adding entries to a directory file, if there is no contiguous space (i.e. no contiguous space after the directory file's current location on the disk), then the directory is moved to some other location on the disk where contiguous space is available. By pre-allocating the directory file on the disk with the create command, this can be avoided. You may want to try this out as well.&lt;BR /&gt;&lt;BR /&gt;&amp;gt;&amp;gt; Are you sure about the lock ONLY being on the .DIR file during creates&lt;BR /&gt;&amp;gt;&amp;gt; and deletes? What about the changes to INDEXF.SYS and BITMAP.SYS?&lt;BR /&gt;&amp;gt;&amp;gt; Aren't there two locks, one for the directory and one for the volume?&lt;BR /&gt;&lt;BR /&gt;You are correct. When creating a file, many operations are involved besides creating a directory entry for that file; space needs to be grabbed from the disk, and so on. For these, different sets of locks are used, and some of them would block activity on the entire volume.&lt;BR /&gt;&lt;BR /&gt;I was referring in particular to the operation of adding/removing entries in the directory file, for which only the serialization lock on the directory file is taken. This is where there is a lot of scope for optimization and for reducing the time taken.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&amp;gt;&amp;gt; The other point about daily directories is that each is independent&lt;BR /&gt;&amp;gt;&amp;gt; and that a very large number of files created on one day won't continue&lt;BR /&gt;&amp;gt;&amp;gt; being a performance problem for the other 6 days.&lt;BR /&gt;Yes. Also, as I said before, with smaller directories the XQP would not have to do a lot of Expand/Compress shuffle activity on the directory.&lt;BR /&gt;This would be a performance benefit.&lt;BR /&gt;&lt;BR /&gt;Regards,&lt;BR /&gt;Murali</description>
      <pubDate>Mon, 12 Apr 2010 02:39:30 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234490#M39907</guid>
      <dc:creator>P Muralidhar Kini</dc:creator>
      <dc:date>2010-04-12T02:39:30Z</dc:date>
    </item>
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234491#M39908</link>
      <description>&amp;gt; [...] largely focused on file deletes. This&lt;BR /&gt;&amp;gt; isn't a big issue because we're using ZIP&lt;BR /&gt;&amp;gt; with the "remove" (or is it "move"?)&lt;BR /&gt;&amp;gt; option. [...]&lt;BR /&gt;&lt;BR /&gt;Zip has no special code to make deleting&lt;BR /&gt;files any faster than anything else.&lt;BR /&gt;Depending on how it's used, I'd expect it to&lt;BR /&gt;use some sub-optimal (perhaps anti-optimal)&lt;BR /&gt;order when deleting the files in a directory.</description>
      <pubDate>Mon, 12 Apr 2010 02:50:07 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234491#M39908</guid>
      <dc:creator>Steven Schweda</dc:creator>
      <dc:date>2010-04-12T02:50:07Z</dc:date>
    </item>
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234492#M39909</link>
      <description>Steven,&lt;BR /&gt;&lt;BR /&gt;I'm in no position to be able to modify the ZIP technique that we use on a whole range of files.&lt;BR /&gt;&lt;BR /&gt;As I said, I'm not overly concerned with what happens at 4:00am unless it ultimately impacts the major processing that occurs between 8:00am and about 9pm.&lt;BR /&gt;&lt;BR /&gt;Using separate daily directories will also mean that deletions of the log files there won't have .DIR management overheads potentially impacting other files (although the volume lock would do so briefly).</description>
      <pubDate>Mon, 12 Apr 2010 03:06:57 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234492#M39909</guid>
      <dc:creator>John McL</dc:creator>
      <dc:date>2010-04-12T03:06:57Z</dc:date>
    </item>
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234493#M39910</link>
      <description>Whether deleting files is a problem or not&lt;BR /&gt;&amp;gt;&amp;gt;&amp;gt;&lt;BR /&gt;When files needs to be deleted (say d.txt), then based on whether the&lt;BR /&gt;block having d.txt is full or not, XQP would have to do compress shuffle&lt;BR /&gt;(move e.txt to z.txt one block above) to delete entries in a directory&lt;BR /&gt;file. &lt;BR /&gt;&amp;lt;&amp;lt;&amp;lt;&lt;BR /&gt;seems to have a typo: "whether the&lt;BR /&gt;block having d.txt is _empty_ or not".</description>
      <pubDate>Mon, 12 Apr 2010 06:09:51 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234493#M39910</guid>
      <dc:creator>H.Becker</dc:creator>
      <dc:date>2010-04-12T06:09:51Z</dc:date>
    </item>
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234494#M39911</link>
      <description>Hi Becker,&lt;BR /&gt;&lt;BR /&gt;Yes. That was a typo.&lt;BR /&gt;I meant "whether the block having d.txt is _empty_ or not".&lt;BR /&gt;&lt;BR /&gt;Thanks for pointing that out.&lt;BR /&gt;&lt;BR /&gt;Regards&lt;BR /&gt;Murali</description>
      <pubDate>Mon, 12 Apr 2010 06:24:36 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234494#M39911</guid>
      <dc:creator>P Muralidhar Kini</dc:creator>
      <dc:date>2010-04-12T06:24:36Z</dc:date>
    </item>
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234495#M39912</link>
      <description>Ahhrg; posting failure(??)&lt;BR /&gt;This is a retry, please excuse a possible duplicate.&lt;BR /&gt;&lt;BR /&gt;jpe&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;John,&lt;BR /&gt;&lt;BR /&gt;&amp;gt;&amp;gt;&amp;gt;&lt;BR /&gt;Sure filenames in the form &lt;HHMMSS&gt;_&lt;PID&gt;.LOG would guarantee end of file additions but now we've taken a 14 character filename out to 21 characters (orig size was 28 chars) and .DIR file size has grown. Is that a better tradeoff? I'm not sure.&lt;BR /&gt;&amp;lt;&amp;lt;&amp;lt;&lt;BR /&gt;That is easily explained:&lt;BR /&gt;&lt;BR /&gt;&amp;gt;&amp;gt;&amp;gt;&lt;BR /&gt;contain 28 characters, made up of an unchanging 11-character standard prefix,&lt;BR /&gt;&amp;lt;&amp;lt;&amp;lt;&lt;BR /&gt;Consecutive directory entries get compacted where possible:&lt;BR /&gt;Each entry begins with the COUNT of characters repeated from the previous entry, followed by only the differing chars.&lt;BR /&gt;In the original setup, you only had the 11-char prefix ONCE, thereafter only the (one-byte) count (btw, hence the 255-char length limit).&lt;BR /&gt;Your new approach only allows for condensing off 6 bytes in the same second (6 digits + 1 "_" - 1 byte), 4 bytes in the same 10-second period, and regularly less when that rolls over.&lt;BR /&gt;&lt;BR /&gt;Summary: long identical LEADING substrings are relatively cheap, but varying chars early in the string force the remainder to be stored verbatim.&lt;BR /&gt;&lt;BR /&gt;hth&lt;BR /&gt;&lt;BR /&gt;Proost.&lt;BR /&gt;&lt;BR /&gt;Have one on me.&lt;BR /&gt;&lt;BR /&gt;jpe&lt;BR /&gt;</description>
      <pubDate>Mon, 12 Apr 2010 07:06:36 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234495#M39912</guid>
      <dc:creator>Jan van den Ende</dc:creator>
      <dc:date>2010-04-12T07:06:36Z</dc:date>
    </item>
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234496#M39913</link>
      <description>John McL,&lt;BR /&gt;&lt;BR /&gt;I will have to keep this brief, as I need to get ready for an early meeting.&lt;BR /&gt;&lt;BR /&gt;Another consideration is whether these files need to be stored in a single unitary directory.&lt;BR /&gt;&lt;BR /&gt;Two relatively straightforward options present themselves:&lt;BR /&gt;&lt;BR /&gt;- use a different directory per day; for searching purposes, use a logical name which concatenates six days of directories into a search list (this approximates the present behavior).&lt;BR /&gt;&lt;BR /&gt;- rather than a flat directory with a 28-character filename, consider creating a second-level set of subdirectories based upon part of that name (e.g., PID), with individual files being entered in the subdirectories. The subdirectories would be entered in the search list as described above.&lt;BR /&gt;&lt;BR /&gt;Combining the above two alternatives would provide the scope of a seven-day retention, together with far smaller individual directories. As has been noted previously by other posters, the work involved in adding/removing files is loosely related to the size of the directory, thus segmenting the add/delete problem has a potentially larger-than-linear payoff.&lt;BR /&gt;&lt;BR /&gt;Searching is then handled using a logical name with a search list.&lt;BR /&gt;&lt;BR /&gt;- Bob Gezelter, &lt;A href="http://www.rlgsc.com" target="_blank"&gt;http://www.rlgsc.com&lt;/A&gt;</description>
      <pubDate>Mon, 12 Apr 2010 07:24:15 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234496#M39913</guid>
      <dc:creator>Robert Gezelter</dc:creator>
      <dc:date>2010-04-12T07:24:15Z</dc:date>
    </item>
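Bob Gezelter's "larger than linear payoff" follows from simple arithmetic: if the work to add or delete one entry grows roughly linearly with the directory's current size (the n^2 behaviour Hoff alluded to), then splitting the same entries across k directories divides the total work by about k. A quick sketch of that arithmetic (not a measurement of real XQP behaviour):

```python
# If per-operation cost is proportional to the directory's current size,
# total cost of filling one directory with n entries is about n^2/2; the
# same entries split over k directories cost about n^2/(2k) in total.
def total_work(n, k=1):
    per_dir = n // k
    # sum of sizes 0 .. per_dir-1, for each of the k directories
    return k * (per_dir * (per_dir - 1) // 2)

n = 7 * 18000                     # a week of log files in one directory
print(total_work(n), total_work(n, k=7))
```

Seven daily directories therefore cut this modelled insertion cost by roughly a factor of seven, and finer segmentation (e.g. Bob's PID subdirectories) by correspondingly more.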
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234497#M39914</link>
      <description>&amp;gt;&amp;gt;&amp;gt;&lt;BR /&gt;Consecutive directory entries get compacted where possible: &lt;BR /&gt;&amp;lt;&amp;lt;&amp;lt;&lt;BR /&gt;&lt;BR /&gt;The directory entries are always compacted within a disk block: all the free space is at the end. But entries are not compressed. What is described here, sounds like front end key compression for RMS index files.&lt;BR /&gt;</description>
      <pubDate>Mon, 12 Apr 2010 11:01:40 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234497#M39914</guid>
      <dc:creator>H.Becker</dc:creator>
      <dc:date>2010-04-12T11:01:40Z</dc:date>
    </item>
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234498#M39915</link>
      <description>&amp;gt;&amp;gt; we should use shorter file names (just "S_&lt;PID&gt;.LOG").&lt;BR /&gt;&lt;BR /&gt;Yeah, excellent. More dense is better, but try to get some movement into those leading characters. The XQP will use them as an index/accelerator. From what you described so far, &lt;NNN&gt;&lt;PID&gt;_S.log might be a good alternative. &lt;HHMM&gt;&lt;PID&gt; might do. The leading part of the PID is often invariant or slow-varying, so you might want to skip that.&lt;BR /&gt;Any concerns about re-boots and PIDs recycling?&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&amp;gt;&amp;gt; Q1 - what performance-related practices do you recommend for large numbers of files being created in a single directory, and why?&lt;BR /&gt;&lt;BR /&gt;a) Try to avoid that. Try to exploit subdirectories and search lists of directories. The latter may not need any application change.&lt;BR /&gt;b) If you cannot use the search-list/subdirectory approach, then try NOT to add in ever-increasing order. Randomness, where periodic deletes do NOT empty out an entire directory block, is ideal. That way new files will typically re-use directory space from past files, and the directory will stay more or less constant in size, avoiding shuffles to squeeze out empty directory blocks and open up a fresh block.&lt;BR /&gt;&lt;BR /&gt;Hmmm... I never implemented anything like this, but with a somewhat predictable re-use pattern, you could seed the directory with constant entries to stop blocks from emptying out.&lt;BR /&gt;Write a tiny program to read the directory as a sequential file. Every time the RFA_VBN changes, take the first N characters from the name, and call sys$enter to do something similar to: SET FILE/ENT=nnnnn.X $_place_holder.X. Just leave those around.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&amp;gt;&amp;gt; Q2 - exactly what changes were made in v7.3 and what, if any, performance "gotchas" still exist?&lt;BR /&gt;&lt;BR /&gt;a) RMS was taught to use a directory buffer greater than 127 blocks if needed.&lt;BR /&gt;b) The XQP was taught not to use just a single-block buffer during directory shuffles, but to use the SYSGEN parameter ACP_MAXREAD as its buffer size. That's typically set to 32, reducing large directory operations by that factor.&lt;BR /&gt;&lt;BR /&gt;Jan... you confused RMS indexed-file key compression with directories. Nice thought, but no, directory entries are NOT compressed.&lt;BR /&gt;&lt;BR /&gt;Cheers,&lt;BR /&gt;Hein&lt;BR /&gt;</description>
      <pubDate>Mon, 12 Apr 2010 11:56:06 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234498#M39915</guid>
      <dc:creator>Hein van den Heuvel</dc:creator>
      <dc:date>2010-04-12T11:56:06Z</dc:date>
    </item>
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234499#M39916</link>
      <description>Q1: Why are you even looking at this stuff?  &lt;BR /&gt;&lt;BR /&gt;Q1a: aesthetics?  &lt;BR /&gt;&lt;BR /&gt;Q2: If deletes aren't an issue, then what is an issue?&lt;BR /&gt;&lt;BR /&gt;Q3: what are your critical tasks and where are those bottlenecked for resources?  &lt;BR /&gt;&lt;BR /&gt;Q3a: Is that with big directories?&lt;BR /&gt;</description>
      <pubDate>Mon, 12 Apr 2010 12:26:51 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234499#M39916</guid>
      <dc:creator>Hoff</dc:creator>
      <dc:date>2010-04-12T12:26:51Z</dc:date>
    </item>
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234500#M39917</link>
      <description>Hmmm,&lt;BR /&gt;&lt;BR /&gt;I think seeding a certain directory for a customer of mine might be helpful.&lt;BR /&gt;So I wrote the helper program I suggested earlier to report the first record in each directory block, the number of records in that block, and a final line to indicate how many leading characters would be needed to create unique names. Source below.&lt;BR /&gt;&lt;BR /&gt;It already provided useful insights for me (using my twisted definition of useful :-).&lt;BR /&gt;&lt;BR /&gt;I could envision someone using something like this, with some snapshots, to get insight into how directory entries move around. A quick DCL hack could use the output as a driver file to actually create seed directory-name entries.&lt;BR /&gt;&lt;BR /&gt;Enjoy,&lt;BR /&gt;Hein.&lt;BR /&gt;&lt;BR /&gt;$ type CHECK_DIRECTORY.C&lt;BR /&gt;#include &lt;string.h&gt;&lt;BR /&gt;#include &lt;stdlib.h&gt;&lt;BR /&gt;#include &lt;stdio.h&gt;&lt;BR /&gt;#include &lt;rms.h&gt;&lt;BR /&gt;&lt;BR /&gt;int sys$open(), sys$connect(), sys$get();&lt;BR /&gt;main(int argc, char *argv[]) {&lt;BR /&gt;&lt;BR /&gt;    struct FAB  fab;&lt;BR /&gt;    struct RAB  rab;&lt;BR /&gt;    struct { short verlimit; unsigned char flags, namecount; char name[508]; } directory_record;&lt;BR /&gt;    int s, i, records=0, old_records=0, blocks=0, minimum=0, old_vbn=0;&lt;BR /&gt;    char old_name[256];&lt;BR /&gt;&lt;BR /&gt;    if (argc != 2) return 16;&lt;BR /&gt;    fab = cc$rms_fab;&lt;BR /&gt;    fab.fab$l_fna = argv[1];&lt;BR /&gt;    fab.fab$b_fns = strlen(argv[1]);&lt;BR /&gt;    fab.fab$b_fac = FAB$M_GET;&lt;BR /&gt;    fab.fab$b_shr = FAB$M_SHRUPD | FAB$M_SHRGET;&lt;BR /&gt;&lt;BR /&gt;    rab = cc$rms_rab;&lt;BR /&gt;    rab.rab$l_fab = &amp;amp;fab;&lt;BR /&gt;    rab.rab$l_ubf = (char *) &amp;amp;directory_record;&lt;BR /&gt;    rab.rab$w_usz = sizeof (directory_record);&lt;BR /&gt;&lt;BR /&gt;    s = sys$open (&amp;amp;fab);&lt;BR /&gt;    if (s&amp;amp;1) s = sys$connect (&amp;amp;rab);&lt;BR /&gt;    while (s&amp;amp;1) {&lt;BR /&gt;        directory_record.namecount = 0; // EOF&lt;BR /&gt;        s = sys$get(&amp;amp;rab);&lt;BR /&gt;        if (!(s&amp;amp;1) &amp;amp;&amp;amp; s != RMS$_EOF) break;&lt;BR /&gt;        records++;&lt;BR /&gt;        if (old_vbn == rab.rab$l_rfa0 ) continue;  // Same block test.&lt;BR /&gt;        if ( blocks++ ) printf ("%06d %02d %s\n", // First block is special&lt;BR /&gt;            old_vbn, records - old_records, old_name);&lt;BR /&gt;        for (i=0; i &amp;lt; directory_record.namecount; i++) // Uniqueness test.&lt;BR /&gt;          if (old_name[i] != directory_record.name[i]) break;&lt;BR /&gt;        if (i &amp;gt; minimum) minimum = i;&lt;BR /&gt;        directory_record.name[directory_record.namecount] = 0;&lt;BR /&gt;        strcpy ( old_name, directory_record.name);&lt;BR /&gt;        old_vbn = rab.rab$l_rfa0;&lt;BR /&gt;        old_records = records;&lt;BR /&gt;    }&lt;BR /&gt;    printf ("\n%d records in %d blocks, minimum length %d.\n", records, blocks, minimum );&lt;BR /&gt;    if (s == RMS$_EOF) s = 1;&lt;BR /&gt;    return s;&lt;BR /&gt;}</description>
      <pubDate>Mon, 12 Apr 2010 13:17:03 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234500#M39917</guid>
      <dc:creator>Hein van den Heuvel</dc:creator>
      <dc:date>2010-04-12T13:17:03Z</dc:date>
    </item>
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234501#M39918</link>
      <description>As already proposed, I would create the directory with a suitable allocation size. That avoids looking for contiguous disk space and copying the whole directory as it grows.&lt;BR /&gt;&lt;BR /&gt;Pre-seeding the directory sounds interesting. Instead of pre-seeding, you can also re-use a directory file after deleting almost all of the old log files.&lt;BR /&gt;&lt;BR /&gt;Using Hein's idea, looking at the first entry of each disk block in the directory tells you which filenames you want to keep. "Pre-seeding" now means deleting all other files (and probably setting EOF to zero for the remaining ones). This shouldn't be slower than deleting ALL files: no disk block shuffling any more.&lt;BR /&gt;&lt;BR /&gt;Now, if the pattern of the created filenames is similar enough, the available directory blocks will have enough room for most of the new files. So there will not be much copying within the directory file.&lt;BR /&gt;&lt;BR /&gt;There will be a lot of files in an "empty" directory. This may be VERY confusing.&lt;BR /&gt;&lt;BR /&gt;All this may be possible to code in DCL. DUMP of the directory can be used to find the filenames to keep.&lt;BR /&gt;&lt;BR /&gt;However, given the confusion and the effort of writing/changing all the command procedures, it may not be worth it.</description>
      <pubDate>Mon, 12 Apr 2010 13:22:09 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234501#M39918</guid>
      <dc:creator>H.Becker</dc:creator>
      <dc:date>2010-04-12T13:22:09Z</dc:date>
    </item>
    <item>
      <title>Re: Overheads of large .DIR files?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234502#M39919</link>
      <description>John,&lt;BR /&gt;&lt;BR /&gt;Besides the comments on directory structure and using a logical, how large are your log files compared to the cluster size? You'll need to weigh disk capacity against space used, since a larger disk cluster size (or allocation size) consumes more space but can reduce I/O operations overall.&lt;BR /&gt;&lt;BR /&gt;Another option is to spread your logical over two disks, alternating by day. Schedule the delete pass on disk 0 while disk 1 is active for new logs. Have a batch job modify the logical each night.&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 12 Apr 2010 15:35:51 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/overheads-of-large-dir-files/m-p/5234502#M39919</guid>
      <dc:creator>Andy Bustamante</dc:creator>
      <dc:date>2010-04-12T15:35:51Z</dc:date>
    </item>
  </channel>
</rss>

