Operating System - OpenVMS

Re: Never ending ZIP process ...

 
Jan van den Ende
Honored Contributor

Art,

I met a situation not unlike yours, albeit a little bit smaller.

_DO_ consider Hein's SEARCHLIST solution!

And Wim wrote
>>>
This will of course not solve your problem now because you are already stuck with the big directory.
<<<

Well, that is true, but not completely.

You _CAN_ migrate to an easier structure, but getting there in a more or less system-friendly way DOES require some extra human effort.

- decide what is a "reasonable" quantity of files per directory. (one day, one week, one month?)
- decide which of those units is already completed, and has its files furthest down in the directory file.
- create a suitably named subdirectory for those, and add it in the logical name search list (of course, the active directory goes first).
- rename the applicable files to the subdir. It will not be fast, but A LOT faster than renaming the top ones!
- repeat until all files moved.
(you may well prefer scripting the above!)
- create a temporary directory for the active (new) files
- rename everything still in the old dir to the new one.
- delete the (now empty) _BIG_ directory file.
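
For anyone who does prefer scripting it, here is a rough Perl sketch of the planning part of the steps above. It is only a sketch under stated assumptions: `plan_batches`, the fixed batch size, and the `[.ARCHIVE_nnn]` subdirectory names are made up for illustration (in practice the unit would be a day, week, or month of completed files), and it assumes the directory is kept in sorted order so that the last names are the ones furthest down in the directory file. It only prints a plan and renames nothing.

```perl
use strict;
use warnings;

# Split a file list into batches, taking names from the END of the sorted
# list first. Returns a list of [subdir, [files...]] pairs.
# ASSUMPTION: batch size and ARCHIVE_nnn names are illustrative only.
sub plan_batches {
    my ($batch_size, @names) = @_;
    die "batch size must be positive\n" if $batch_size < 1;
    my @sorted = sort @names;
    my @plan;
    my $n = 0;
    while (@sorted) {
        my $take = @sorted < $batch_size ? scalar(@sorted) : $batch_size;
        # splice with a negative offset removes the last $take names
        my @batch = splice(@sorted, -$take);
        # within a batch, the furthest-down name is handled first
        push @plan, [ sprintf("[.ARCHIVE_%03d]", ++$n), [ reverse @batch ] ];
    }
    return @plan;
}

# Demo: print the plan; nothing is renamed here.
for my $step (plan_batches(2, qw(A.DAT;1 B.DAT;1 C.DAT;1 D.DAT;1 E.DAT;1))) {
    my ($subdir, $files) = @$step;
    print "$subdir <- @$files\n";
}
```

Each planned subdirectory would then be appended to the logical name search list after the active directory, as described in the steps.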

It is a pain, but worth the effort.

And now, educate the app developers (if they can still be traced) about their silliness.

success.

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Art Wiens
Respected Contributor

I've restarted the zip process using the /batch qualifier. It's executing in a directory with ~ 60,000 files (I renamed the oldest three months out of the way). The list I gave it covers ~ one third of the files. The names in the list are exact, i.e. no wildcards.

Jon had a question earlier what files the job had open ... as it grinds away right now, it has only the list file open specified by the /batch switch (see attached).

Previous experience with these large ZIP jobs is they don't open the temporary zip file (which gets renamed to the final zip file) until it's "processed" the selection of files. Whatever "processed" entails.

Art
Jon Pinkley
Honored Contributor

Does $1$DGA79 have the files that are being zipped? That device was busy at the time you took the snapshot.

Channel  Window    Status  Device/file accessed
-------  ------    ------  --------------------
  00D0   00000000  Busy    $1$DGA79:
SDA>

That may be activity verifying that the files exist before starting the actual zipping. I don't know enough about the internal layout of the zip container; perhaps it is similar to a CD, in that it may need to create, or at least preallocate space for, the directory before it adds anything to the zip file.

Steven S. will know.

If it really needs to preallocate table of contents space, then I wonder how it handles a file being deleted between the initial scan and the archival of the file.

Seems odd it would need to scan the whole list if it is explicitly provided.

Jon
it depends
Steven Schweda
Honored Contributor
Solution

> Previous experience with these large ZIP
> jobs is they don't open the temporary zip
> file (which gets renamed to the final zip
> file) until it's "processed" the selection
> of files. Whatever "processed" entails.

Sounds right to me.

> Steven S. will know.

Now _there's_ an optimist.

I believe (glancing at code I normally
ignore) that Zip runs through the whole file
list before it does any serious work, looking
for names to be filtered out by the include
and exclude or date-time options, duplicate
names (easy with /NOFULLPATH, -j), the name
of the archive itself, and so on, creating a
big to-do (linked) list as it goes, including
the actual file specs and the names to be
used in the archive (which may be different),
and some other data which will be stored in
the archive for each file.
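
To make that pre-scan concrete, here is a small Perl
illustration of the same idea. It is not Zip's code
or data layout: `build_todo` and its option names are
hypothetical, it uses a plain array where Zip is said
to use a linked list, and treating names as
case-insensitive is an assumption. It makes one pass
over the candidates, drops the archive itself, applies
an exclude filter, strips the directory part (like
/NOFULLPATH, -j), and skips duplicate archive names.

```perl
use strict;
use warnings;

# Build a to-do list from candidate file specs before any archiving starts.
# ASSUMPTION: hypothetical helper, not Zip's real code or data layout.
sub build_todo {
    my (%opt) = @_;
    my (@todo, %seen);
    for my $spec (@{ $opt{files} }) {
        # never add the archive to itself
        next if defined $opt{archive} && $spec eq $opt{archive};
        # exclude filter: a list of compiled patterns
        next if grep { $spec =~ $_ } @{ $opt{exclude} || [] };
        my $stored = $spec;
        # drop the [dir] part, roughly what -j does
        $stored =~ s/^.*\]// if $opt{junk_paths};
        if ($seen{lc $stored}++) {
            warn "duplicate name skipped: $spec\n";
            next;
        }
        # keep both the real file spec and the name used in the archive
        push @todo, { spec => $spec, stored => $stored };
    }
    return @todo;
}

my @todo = build_todo(
    files      => [ '[A]X.LOG;1', '[B]X.LOG;1', '[A]Y.TMP;1', '[A]OUT.ZIP;1' ],
    archive    => '[A]OUT.ZIP;1',
    exclude    => [ qr/\.TMP;/i ],
    junk_paths => 1,
);
print "$_->{spec} -> $_->{stored}\n" for @todo;
```

Nothing can be written until a pass like this has
finished, which fits the observation that the
temporary zip file is opened late.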

> [..] I wonder how it handles a file being
> deleted between the initial scan and the
> archival of the file.

I've never tried it. I assume that you get a
complaint.
Art Wiens
Respected Contributor

Things to remember for "next time".

- Don't let 250,000 files accumulate in a directory ;-)

- Use ZIP_CLI for more "familiar" DCL like qualifiers

- If you do end up with that many, move the files you want to ZIP to a temporary location, i.e. don't try to ZIP in the directory containing 250,000 files

- Create a list of files you want to ZIP. Use it by supplying the /BATCH=listfile.txt qualifier.

- Use the /MOVE qualifier to automatically delete the files from disk after they have been added to the archive

- Have patience! It takes a lot of time!

Thanks all, I will take the help provided and try to set things up a little smarter for the future.

Cheers,
Art
Steven Schweda
Honored Contributor

> - Create a list of files you want to ZIP.
> Use it by supplying the /BATCH=listfile.txt
> qualifier.

Is that really faster than an equivalent
wildcard, or were you just looking for
finer-grained control over the list?

> - Use the /MOVE qualifier to automatically
> delete the files from disk after they have
> been added to the archive

Zip does try to be careful, but you have more
confidence in this stuff than I. And, I'd
expect one of the more clever DELETE schemes
to be faster at removing the files than Zip.
Art Wiens
Respected Contributor

I don't think using the list was any faster. Performing the zip in a temporary directory was the biggest help in that respect. I renamed three months' worth into the temp dir and zipped up a month's worth at a time. There is nothing in the filename to indicate which month it's from, and making ZIP look up dates seemed to me like it would add unnecessary overhead, so using list files was my control over which files to ZIP.
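
One possible way to build such per-month list files is sketched below in Perl. This is not a description of how the lists in this thread were actually made. It assumes the file modification time is a good enough stand-in for "which month", and `group_by_month`, `write_lists`, and the `zip_YYYY-MM.lis` naming are hypothetical. Each list holds exact file names, one per line, ready to hand to /BATCH=listfile.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Group file names by the year-month of their modification time.
# Returns a hash ref: { 'YYYY-MM' => [ names... ] }.
sub group_by_month {
    my (@names) = @_;
    my %by_month;
    for my $name (@names) {
        my $mtime = (stat $name)[9];
        next unless defined $mtime;    # skip files that cannot be stat'ed
        push @{ $by_month{ strftime('%Y-%m', localtime $mtime) } }, $name;
    }
    return \%by_month;
}

# Write one list file per month, one exact file name per line.
# ASSUMPTION: the "zip_YYYY-MM.lis" naming is made up for this sketch.
sub write_lists {
    my ($by_month, $dir) = @_;
    my @written;
    for my $month (sort keys %$by_month) {
        my $list = "$dir/zip_$month.lis";
        open my $fh, '>', $list or die "cannot write $list: $!";
        print {$fh} "$_\n" for sort @{ $by_month->{$month} };
        close $fh or die "cannot close $list: $!";
        push @written, $list;
    }
    return @written;
}
```

The date lookup happens once, up front, so ZIP itself only ever sees exact names.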

WRT confidence ... ZIP/UNZIP have never let me down (on any platform)! First time for everything I guess.

Cheers,
Art
Hein van den Heuvel
Honored Contributor

fwiw, I seem to have posted an intermediate version of the double rename solution.
It was almost right, but it did not work from the end of the directory towards the front, which was the whole point! Oops.

Here is the corrected example.
It's just an example, with some debugging lines still in there to help understand it.
Adapt it to your own needs, and to Perl's quirks when it tries to help with files and filenames.
Or rewrite it as something similar in DCL.

See also c.o.v discussion: "Which delete statement is faster? Options "

http://groups.google.com/group/comp.os.vms/browse_thread/thread/cfaed0f141068b51?hl=en#

Hein.

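use strict;
#use warnings;

# Double rename: empty a big directory working from the END of its listing,
# via a helper directory, into the target directory.
my $HELPER = "[-.tmp_helper]";
my $TARGET = "[-.tmp_renamed]";
my $i = 0;
my @files;
$_ = shift or die "Please provide double quoted wildcard filespec";

print "wild: $_\n";
s/"//g;
my $wild = $_;
# Collect file names; only listing lines containing ";" (a version) are files.
foreach (qx(DIRECTORY/COLU=1 $wild)) {
    chomp;
    $files[$i++] = $_ if /;/;
}
die "Please provide double quoted wildcard filespec" if @files < 2;

# phase 1: walk the list backwards so the big directory shrinks from its end.
# The prefix 999999-$i grows as we go, so each file lands at the end of $HELPER.

$i = @files;

print "Moving $i files to $HELPER\n";

while ($i-- > 0) {
    my $name = $files[$i];
    my $new = sprintf("%s%06d%s",$HELPER,999999-$i,$name);
    print "$name --> $new\n";        # debugging output
    rename $name, $new or warn "rename of $name failed: $!\n";
}

system ("DIRECTORY $HELPER");        # debugging: show the helper directory

# phase 2: take files from the end of $HELPER (highest prefix first) and
# add them to $TARGET in their original name order.

print "Renaming from $HELPER to $TARGET...\n";

$i = -1;
while (++$i < @files) {              # $i runs from 0 to $#files
    my $name = $files[$i];
    rename sprintf("%s%06d%s",$HELPER,999999-$i,$name), $TARGET.$name
        or warn "rename of $name failed: $!\n";
}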
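
To see why the numeric prefix matters, here is a small side sketch that only computes the helper names and prints them in creation order. It renames nothing. `helper_name` is a hypothetical function wrapping the same `sprintf` rule as the script, and the file names are invented. It assumes, as this thread does, that directory entries are kept in sorted order.

```perl
use strict;
use warnings;

# Same naming rule as the script: prefix = 999999 - index.
sub helper_name {
    my ($i, $name) = @_;
    return sprintf("%06d%s", 999999 - $i, $name);
}

my @files = sort qw(A.DAT;1 B.DAT;1 C.DAT;1 D.DAT;1);

# Phase 1 order: the last original name is renamed first.
my @created = map { helper_name($_, $files[$_]) } reverse 0 .. $#files;

# Each name created later sorts after all earlier ones, so every
# rename would add its entry at the end of the helper directory.
print join("\n", @created), "\n";
```

Phase 2 then walks the same list forwards, which takes the highest prefix out of the helper directory first and adds the original names to the target in ascending order.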


Hope this helps better :-)

Hein.