ZIP performance
09-03-2009 05:36 AM
Accounting information:
Buffered I/O count:     1315155   Peak working set size:         82640
Direct I/O count:      49565437   Peak virtual size:            247648
Page faults:               6110   Mounted volumes:                   0
Charged CPU time: 0 11:34:52.42   Elapsed time:          1 05:31:52.67
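As a rough sanity check on those figures, here is a minimal Python sketch that converts the two delta times to seconds and derives average rates. The helper name `vms_delta_to_seconds` is made up for illustration, and the calculation assumes the counters cover the whole job. It suggests the CPU was charged for only about 39% of the elapsed time, so the job spent most of its roughly 29.5 hours waiting on something else.

```python
def vms_delta_to_seconds(delta: str) -> float:
    """Convert a VMS delta time 'd hh:mm:ss.cc' to seconds (hypothetical helper)."""
    days, clock = delta.split()
    hours, minutes, seconds = clock.split(":")
    return int(days) * 86400 + int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Figures copied from the accounting summary above.
cpu_s = vms_delta_to_seconds("0 11:34:52.42")
elapsed_s = vms_delta_to_seconds("1 05:31:52.67")
direct_io = 49565437
buffered_io = 1315155

cpu_fraction = cpu_s / elapsed_s          # share of wall-clock time charged as CPU
direct_io_per_s = direct_io / elapsed_s   # average direct I/O rate over the job
buffered_io_per_s = buffered_io / elapsed_s

print(f"elapsed hours : {elapsed_s / 3600:.1f}")
print(f"CPU fraction  : {cpu_fraction:.0%}")
print(f"direct I/O/s  : {direct_io_per_s:.0f}")
print(f"buffered I/O/s: {buffered_io_per_s:.1f}")
```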
The actual command used was:
$ zip_cli 2007_1.zip /batch=2007_TOZIP.LIS /move/keep/vms/nofull_path
Is there any performance advantage to using wildcards for the input, with /SINCE and /BEFORE to select the files, instead of using a /BATCH list file? Or is there any way in general to do this more efficiently / expeditiously?
Steven, I hope I have provided enough details to help your psychic powers along. ;-)
Cheers,
Art
09-03-2009 05:59 AM
Re: ZIP performance
I presume that the intent is to ZIP and remove the files in one operation.
Are you sure that the problem is ZIP and not the reorganization of the directories as the files are removed? In a similar vein, have you tried different orderings of the files in the "listoffiles"? (Hint: inverse ordering, per the discussions about delete optimization.)
What is the speed of the ZIP without the /MOVE?
- Bob Gezelter, http://www.rlgsc.com
09-03-2009 06:08 AM
Re: ZIP performance
The ZIP process seems to take a lot of time up front, about midway through the ordeal it actually creates a temporary ZIP file Zxxxxxxx and starts adding files to it.
Art
09-03-2009 06:17 AM
Re: ZIP performance
I was referring to the order of processing files, not which directory you (or the ZIP file) are in.
My thinking was that the problem may not be ZIP, and may be related to the traversals and re-structuring of the directory during the deletes.
For control purposes (although I admittedly am reluctant to suggest it), it would be useful to know the performance of a simple wildcard COPY *.* NL: and a DELETE *.*. If these take similar times, the problem is in the size of the directory, not ZIP.
- Bob Gezelter, http://www.rlgsc.com
09-03-2009 06:19 AM
Re: ZIP performance
To produce an inverse-ordered list of files (within a directory), produce the list using DIRECTORY/BRIEF/NOHEAD/NOTRAIL and use SORT to sort it in DESCENDING order (HELP SORT).
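For anyone who would rather post-process the listing off the box, the same reordering can be sketched in Python. This is only an illustration: the function name and the sample file names are made up, and a plain case-insensitive descending sort is assumed to be close enough to the reverse of the directory's own order.

```python
def reverse_sorted_list(names):
    """Return file names in descending order, ignoring case.

    Assumption: a descending case-insensitive string sort approximates the
    reverse of the on-disk directory order. Version-number ordering within a
    single file name is not modelled here.
    """
    return sorted(names, key=str.upper, reverse=True)


# Hypothetical sample names standing in for the real directory listing.
sample = ["20070102_0001.DAT;1", "20070101_0002.DAT;1", "20070101_0001.DAT;1"]

# One name per line, ready to be written out as a list file.
list_file_text = "\n".join(reverse_sorted_list(sample))
print(list_file_text)
```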
- Bob Gezelter, http://www.rlgsc.com
09-03-2009 06:27 AM
Re: ZIP performance
Duh! Sorry, I'm still worn out after 29 hours of ZIP'ing ;-)
I have more to do, I'll give that a whirl.
Cheers,
Art
09-03-2009 06:29 AM
Re: ZIP performance
So someone really does use that CLI stuff.
Normally, I'd suggest trying the current
released version, 3.0, but I doubt that it
would help here. (Might be fun, though.)
> [...] /nofull_path
Are all the files in one directory? With
/move telling it to delete the files, if
they're all in one place, then the delete
operation itself might be very slow (and out
of my hands). A test without /move might be
informative. I'd need to think/look, but if
it deletes the files in the same order as it
adds them to the archive, then a
reverse-sorted list might help (less
reshuffling).
> Is there any performance advantage instead
> of using a /BATCH list file to using
> wildcards for the input and using /SINCE
> and /BEFORE to select the input?
Knowing nothing, I wouldn't expect it to
matter. (Only one way to find out...)
Using the list does give you control over
the order, so if that matters, then the list
might be better.
> [...] enough details [...]
It's a good start, but without some
profiling, it's hard to guess where it's
spending its time.
Adding /COMPRESSION = STORE ("-0") would
eliminate any CPU time spent doing
compression, but I doubt that that's the big
problem.
Ah. I'm submitting too slowly.
> The ZIP process seems to take a lot of
> time up front, about midway through the
> ordeal it actually creates a temporary ZIP
> file Zxxxxxxx and starts adding files to
> it.
It does do some research on the files to be
archived before it starts to work. I thought
that it was mostly checking for existence,
but there might be more to it.
09-03-2009 06:56 AM
Re: ZIP performance
The processing of a directory holding 80,000 files implies a directory well in excess of the XQP caches.
I did not ask Art, but another thing that comes to mind is ensuring that the XFC is enabled and sized appropriately to the task at hand.
Just the directory searches on a directory of that size would likely be expensive.
- Bob Gezelter, http://www.rlgsc.com
09-03-2009 06:59 AM
Re: ZIP performance
System Memory Resources on 3-SEP-2009 10:57:55.77
Virtual I/O Cache
Total Size (Kbytes)           3200   Read IO Count            1003983095
Free Kbytes                      0   Read Hit Count            176445148
Kbytes in Use                 3200   Read Hit Rate                   17%
Write IO Bypassing Cache  23261386   Write IO Count             38318456
Files Retained                  99   Read IO Bypassing Cache   275267547
$ show sys/noproc
OpenVMS V7.2-2 on node xxxxxx 3-SEP-2009 10:58:17.64 Uptime 85 21:26:23
As I said, there's only 512MB memory in this box.
Art
09-03-2009 07:14 AM
Re: ZIP performance
> Just the directory searches on a directory
> of that size would likely be expensive.
The FAQ suggests that V7.2 is new enough to
get the boost to the directory cache stuff,
so a VMS upgrade might not help, either.
A quick look at the code suggests that it's
doing (at least) a $PARSE on every name.
Zip 3.0 may not relieve the misery, but it
does add a new message or two, like "Scanning
files ...", so that it doesn't look quite so
dead while it's scouting around.
09-03-2009 07:19 AM
Re: ZIP performance
We're _really_ going to have a discussion about system performance and application tuning from this application on this AlphaServer 800 box with all of a half-gig of memory and slow SCSI disks? Really?
There are vendors offering a gigabyte of memory (4 DIMMs) for this box for US$120 or so. You've got eight slots in this box. So going from 512 to 2 GB is US$240 or so, and quite possibly less. (I did only a very quick look for prices.)
Faster disks? I've been scrounging disks compatible with this vintage gear for small outlays. Sticking a decent RAID controller and a shelf of disks into the box will get you better I/O speed.
Or punt this box, and replace it with a faster Alpha or (likely cheaper) an Integrity. I've seen some sweet AlphaServer DS10 boxes and even some AlphaServer ES45 boxes for less than US$1000, and Integrity boxes are cheap, too. (Integrity has cheaper OpenVMS licenses.) Or switch from Alpha hardware to one of the available Alpha emulations.
Between the application design and the hardware and the OpenVMS configuration, tuning this box is likely wasting more money than upgrading or (better) replacing the box; even if you get double speed, you're still looking at 12 hours! And if the hardware upgrade or the replacement or migration to Integrity or a switch to emulation is deemed unacceptable or too costly (starting at US$250 or so, and possibly less), then your management has made your decision for you. Let it run for a day or three.
09-03-2009 07:29 AM
Re: ZIP performance
"Or punt this box, and replace it with a faster Alpha"
This process is underway, albeit at glacial speeds, to ES47s w/8GB on VMS 8.3. The application support folks are the resource bottleneck. I've had the new clusters built for about two years now!
And FYI, the disk is HSG80 based, not local SCSI so that's a bit "faster".
Cheers,
Art
09-03-2009 07:44 AM
Re: ZIP performance
>> Of the 29 hours involved in this example, I'd say ~26 hours were related to ZIP and about 3 hours to removing them. No, I haven't tried anything other than a
Judging by the high buffered IO, I suspect that it really is the directory slowing down ZIP.
If you can RENAME 10,000 file chunks away from this directory, notably in reverse order, and then zip those up, you should see much better performance.
You have to rename in reverse order to make the rename not take too long.
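A minimal sketch of how the chunks might be planned, in Python. The function name, the sample names, and the idea of planning the batches away from the VMS box are assumptions for illustration; the actual RENAME would still be done on the system itself.

```python
def plan_chunks(names, chunk_size=10_000):
    """Split file names into batches, highest names first.

    Taking names from the top of the sort order means each batch is removed
    from the tail end of the big directory rather than from its front.
    """
    ordered = sorted(names, key=str.upper, reverse=True)
    return [ordered[i:i + chunk_size] for i in range(0, len(ordered), chunk_size)]


# Hypothetical names; a small chunk size keeps the demonstration readable.
names = [f"F{i:05d}.DAT" for i in range(25)]
chunks = plan_chunks(names, chunk_size=10)

for number, chunk in enumerate(chunks, start=1):
    print(f"chunk {number}: {len(chunk)} files, {chunk[0]} down to {chunk[-1]}")
```

Each batch could then be moved to its own subdirectory and zipped separately, so that no single directory holds more than one chunk.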
Hein
09-03-2009 08:02 AM
Re: ZIP performance
And unless you can really offload this box of most everything on it, then XFC is going to be running somewhere between shut off and very cache-constrained with what it can gather of the 0.5 GB of memory. And I'm not sure there's anything that's going to help with the comparatively slow I/O path in V7.2-2 and particularly with gazillions of small files.
I don't think you're going to make this box all that much faster through anything you can do in (only) software. Spread the I/O load across spindles, etc. The usual.
09-03-2009 11:05 AM
Re: ZIP performance
Would you not then pay nearly the same price for insertion into the new directory that you save during the removal from the old directory?
09-03-2009 11:12 AM
Re: ZIP performance
I added a second com file to delete the files at the end.
Talk to you tomorrow ;-)
Cheers,
Art
09-03-2009 11:43 AM
Re: ZIP performance
> so far.
Yeah, well, all the "/MOVE" file deletion is
done at the end of the job, after the archive
creation is known to be successful, and I
wouldn't expect any non-delete activity to be
affected by the order. So I'm not amazed.
> The job only has the list file open so far
> [...]
It's "Scanning files ...". Zip 3.0 says so:
ALP $ zip3l fred.zip dka0:[utility.source.zip.zip31*...]*.*
Scanning files ................ ......
adding: [.utility.source.zip.zip31a.aosvs] (stored 0%)
[...]
I'd bet that the disk is busy, even with no
open files.
> Are all the files in one directory?
Did we ever get an answer to that?
09-03-2009 11:48 AM
Re: ZIP performance
The next attempt will be to split the files into 10,000 file chunk subdirectories as suggested by Hein.
Cheers,
Art
09-03-2009 12:10 PM
Re: ZIP performance
I might try to restore this mess onto an ES47 / EVA8000, and ZIP it there, but it will still be a several hour restore.
Art
09-03-2009 12:58 PM
Re: ZIP performance
And speaking of restoring, having Zip use a
reverse-sorted file list might help it delete
the files faster, but that will leave you
with a reverse-sorted archive, and if UnZip
were used to restore those files, it would
do it in the archive-member (reverse) order,
which would be the maximum-shuffle method at
that time. It's a conspiracy.
> The 800 5/500 is the "most powerful" box in
> this cluster.
What does it use for memory? I haven't
looked lately, but I once found a whole 2GB
kit for an XP1000 for about $20 on Ebay.
(It's ECC. If it doesn't actually catch
fire, how much damage could it do?)
I can eat up 512MB with only a couple of Web
browsers.
09-03-2009 01:24 PM
Re: ZIP performance
Jim> Would you not then pay nearly the same price for insertion into the new directory that you save during the removal from the old directory?
The directory shuffle problem multiplies out by its size. And the more you have, the more often you have to do it, making it an n-squared problem.
Adding a new low block to a 10,000 entry directory is 6 times faster than removing the bottom block from a 60,000 entry directory.
Add to that some better caching and there is a good net gain.
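To put rough numbers on the n-squared argument, here is a deliberately simplified Python model. It assumes that emptying the first block of a directory forces every later block to be moved down, while emptying the last block moves nothing. The block counts (600 for the big directory, 100 per chunk) are made up, and real directory behaviour and caching are more involved than this.

```python
def shuffle_cost(total_blocks, reverse):
    """Count blocks moved while emptying a directory one block at a time.

    Simplified model (an assumption, not measured behaviour): emptying the
    first block shifts all remaining blocks down by one, emptying the last
    block shifts nothing.
    """
    moved = 0
    remaining = total_blocks
    while remaining > 0:
        if not reverse:
            moved += remaining - 1  # every block after the emptied one moves
        remaining -= 1
    return moved


big_forward = shuffle_cost(600, reverse=False)          # one large directory
big_reverse = shuffle_cost(600, reverse=True)           # same, emptied from the end
six_small_forward = 6 * shuffle_cost(100, reverse=False)  # six smaller directories

print(big_forward, big_reverse, six_small_forward)
print(f"ratio big/six small: {big_forward / six_small_forward:.2f}")
```

In this model, splitting one directory into six pieces cuts the total forward-order shuffling by roughly a factor of six, and working from the end avoids it altogether.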
For a worse problem I once scripted a solution with temporary file names, renaming them twice, from high to low, into the final directory.
10,000 may already be too much for the system in question, maybe a 128 block directory should be the cutoff...
But that 128 was only really important in older versions. Around 1999 RMS dropped the 127 block limit when using its directory scans. It should allocate the whole directory in a single big buffer... for searches.
Maybe it is NOT using an RMS search, and just letting the XQP do the name-to-fid mapping.
The XQP still has this somewhat brain-dead 'lookup' table to map the first few file name characters to a vbn. But that does not help much (or any) as the directory gets big and if the files have mostly fixed leading start characters (like MAIL$, or 20090903 :-).
To make room for more pointers, it reduces the number of characters from 14 for directories of less than 240 blocks down to 2 for directories over 1200 blocks. Not very selective... when you need it most.
Hmmm... We have not been given any filename examples. Would they all happen to start with a more or less fixed sequence over hundreds of files? Average size? Total directory size?
Can you try making the names shorter, notably more unique in the leading characters?
Turn "MAIL$" into "" (nothing) and "2009" into "9" as exercise?
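The selectivity point can be illustrated with a small Python sketch that counts how many distinct leading-character prefixes a set of names has. The file names are invented (no real examples were posted), and the two-character width simply mirrors the figure quoted above for very large directories. This is not the actual lookup table, only a way to see the effect of a fixed prefix.

```python
import os


def distinct_prefixes(names, width):
    """Number of distinct leading `width`-character prefixes, ignoring case."""
    return len({name[:width].upper() for name in names})


def strip_common_prefix(names):
    """Drop whatever leading characters all of the names share."""
    prefix = os.path.commonprefix(names)
    return [name[len(prefix):] for name in names]


# Hypothetical date-stamped names: 2009MMDD_nnnn.DAT
names = [
    f"2009{month:02d}{day:02d}_{seq:04d}.DAT"
    for month in range(1, 13)
    for day in (1, 15)
    for seq in range(3)
]
stripped = strip_common_prefix(names)

print("distinct 2-char prefixes, original names:", distinct_prefixes(names, 2))
print("distinct 2-char prefixes, prefix removed:", distinct_prefixes(stripped, 2))
```

With the shared "2009" in front, every name has the same first two characters, so a two-character index cannot tell them apart. Once the shared part is gone, the month digits do the separating.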
And uh, what is SYSGEN param ACP_MAXREAD set to?
Art>> MONI MODE shows ~60% Kernel Mode, 10% Interrupt State and 30% Idle Time
So clearly ZIP is not ZIPPING, but preparing. Real zip work would show lots of USER mode.
fwiw,
Hein.
09-03-2009 02:30 PM
Re: ZIP performance
Thank you very much for this - it's not intuitive to me. I imagined that they were roughly equivalent.
09-03-2009 02:51 PM
Re: ZIP performance
Rename the directory and start a new application directory. Create your Zip archives against the new directory, 5,000 or so files at a time with the delete option. When you have successfully created your zip files, use DFU to wipe the directory.
Andy