Re: ZIP performance

 
Steven Schweda
Honored Contributor

Re: ZIP performance

> [...] VMS v7.2-2 [...]

> Just the directory searches on a directory
> of that size would likely be expensive.

The FAQ suggests that V7.2 is new enough to
get the boost to the directory cache stuff,
so a VMS upgrade might not help, either.

A quick look at the code suggests that it's
doing (at least) a $PARSE on every name.

Zip 3.0 may not relieve the misery, but it
does add a new message or two, like "Scanning
files ...", so that it doesn't look quite so
dead while it's scouting around.
Hoff
Honored Contributor

Re: ZIP performance

You're serious? You're not punking us?

We're _really_ going to have a discussion about system performance and application tuning from this application on this AlphaServer 800 box with all of a half-gig of memory and slow SCSI disks? Really?

There are vendors offering a gigabyte of memory (4 DIMMs) for this box for US$120 or so. You've got eight slots in this box. So going from 512 to 2 GB is US$240 or so, and quite possibly less. (I did only a very quick look for prices.)

Faster disks? I've been scrounging disks compatible with this vintage gear for small outlays. Sticking a decent RAID controller and a shelf of disks into the box will get you better I/O speed.

Or punt this box, and replace it with a faster Alpha or (likely cheaper) an Integrity. I've seen some sweet AlphaServer DS10 boxes and even some AlphaServer ES45 boxes for less than US$1000, and Integrity boxes are cheap, too. (Integrity has cheaper OpenVMS licenses.) Or switch from Alpha hardware to one of the available Alpha emulations.

Between the application design and the hardware and the OpenVMS configuration, tuning this box is likely wasting more money than upgrading or (better) replacing the box; even if you get double speed, you're still looking at 12 hours! And if the hardware upgrade or the replacement or migration to Integrity or a switch to emulation is deemed unacceptable or too costly (starting at US$250 or so, and possibly less), then your management has made your decision for you. Let it run for a day or three.
Art Wiens
Respected Contributor

Re: ZIP performance

I'm not asking for a tuning exercise on this vastly under-powered box, just whether there's a "better" way to use ZIP than the way I'm using it now. Believe me, I feel the pain of how "expensive" lookups are from having a bazillion files in a directory. Trying to sort the initial mess into year subdirectories has been an enormous job.

"Or punt this box, and replace it with a faster Alpha"

This process is underway, albeit at glacial speeds, to ES47s with 8 GB on VMS V8.3. The application support folks are the resource bottleneck. I've had the new clusters built for about two years now!

And FYI, the disk is HSG80 based, not local SCSI so that's a bit "faster".

Cheers,
Art
Hein van den Heuvel
Honored Contributor

Re: ZIP performance

Hello Art,

>> Of the 29 hours involved in this example, I'd say ~26 hours were related to ZIP and about 3 hours to removing them. No, I haven't tried anything other than a

Judging by the high buffered I/O, I suspect that it is really the directory slowing down ZIP.

If you can RENAME chunks of 10,000 files away from this directory, notably in reverse order, and then zip those up, you should see much better performance.

You have to rename in reverse order to make the rename not take too long.
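
The batching described above can be sketched as follows. This is only an illustration of the ordering logic, written in Python for brevity; it is not something from the thread and not necessarily something to run on the VMS system itself. The function name `reverse_chunks` and the sample file names are hypothetical, and plain string sorting is assumed to be a close enough stand-in for the directory's own ordering.

```python
from typing import Iterator, List


def reverse_chunks(names: List[str], chunk_size: int = 10_000) -> Iterator[List[str]]:
    """Yield batches of file names, taking names from the END of the sorted list first.

    Assumption: sorting the names as plain strings approximates the order in
    which the directory stores its entries.
    """
    ordered = sorted(names, reverse=True)  # last directory entry comes first
    for start in range(0, len(ordered), chunk_size):
        yield ordered[start:start + chunk_size]


if __name__ == "__main__":
    # Hypothetical file names, just to show the shape of the batches.
    demo = [f"FILE{n:05d}.DAT" for n in range(25)]
    for number, batch in enumerate(reverse_chunks(demo, chunk_size=10), start=1):
        print(f"chunk {number}: {len(batch)} files, {batch[0]} .. {batch[-1]}")
```

Each batch would then be renamed into its own subdirectory and zipped from there, so that ZIP only ever scans a directory of modest size.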

Hein
Hoff
Honored Contributor

Re: ZIP performance

A good RAID controller (which can be had for around US$50) can do very well against an HSG80 box; my guess is you'd see somewhere between 7 MBps and maybe as much as 25 or 30 MBps, depending on the speeds of the disks behind the HSG80. Various SCSI Ultra320 RAID controllers claim 320 MBps; you'll see a chunk of that, but not all of it.

And unless you can really offload this box of most everything on it, then XFC is going to be running somewhere between shut off and very cache-constrained with what it can gather of the 0.5 GB of memory. And I'm not sure there's anything that's going to help with the comparatively slow I/O path in V7.2-2 and particularly with gazillions of small files.

I don't think you're going to make this box all that much faster through anything you can do in (only) software. Spread the I/O load across spindles, etc. The usual.
Jim_McKinney
Honored Contributor

Re: ZIP performance

> You have to rename in reverse order to make the rename not take too long.


Would you not then pay nearly the same price for insertion into the new directory that you save during the removal from the old directory?
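
One way to reason about this question is a toy cost model. The sketch below is not from the thread and rests on stated assumptions: the directory is modelled as a sorted sequence of blocks, emptying a block forces every block behind it to shift down, and adding a block at the front of a directory forces every existing block to shift up. The function names and the block counts are hypothetical, and real file-system behaviour is more involved than this.

```python
def shuffle_cost_remove(total_blocks: int, blocks_removed: int, from_end: bool) -> int:
    """Count block moves (toy model) when entries filling `blocks_removed` blocks are removed.

    Assumption: an emptied block is squeezed out by shifting every block
    that follows it down by one.
    """
    if from_end:
        return 0  # nothing follows the emptied block, so nothing is shifted
    # Emptying the i-th block from the front shifts the (total_blocks - i) blocks behind it.
    return sum(total_blocks - i for i in range(1, blocks_removed + 1))


def shuffle_cost_insert_at_front(blocks_added: int) -> int:
    """Count block moves (toy model) when a new directory grows by blocks added at its front.

    Assumption: each new block at the front shifts all blocks already present.
    """
    return sum(range(blocks_added))


if __name__ == "__main__":
    big_directory = 1000   # hypothetical size of the old directory, in blocks
    chunk = 100            # hypothetical size of one renamed chunk, in blocks
    forward = shuffle_cost_remove(big_directory, chunk, from_end=False)
    reverse = shuffle_cost_remove(big_directory, chunk, from_end=True)
    insert = shuffle_cost_insert_at_front(chunk)
    print(f"forward removal: {forward} moves, reverse removal: {reverse} moves")
    print(f"reverse-order insertion into the new directory: {insert} moves")
```

Under these assumptions the insertion cost is real, but it scales with the size of the small new directory, while forward-order removal scales with the size of the large old one.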
Art Wiens
Respected Contributor

Re: ZIP performance

Just fired off another one with a reverse-sorted list (59,889 files) and no /MOVE switch ... it appears to be behaving similarly so far. MONI MODE shows ~60% Kernel Mode, 10% Interrupt State and 30% Idle Time (it's a batch job running at prio 3, gotta leave some cycles for users ;-). The job only has the list file open so far (and the command proc and batch log file).

I added a second command procedure to delete the files at the end.

Talk to you tomorrow ;-)

Cheers,
Art
Steven Schweda
Honored Contributor

Re: ZIP performance

> [...] it appears to be behaving similarly
> so far.

Yeah, well, all the "/MOVE" file deletion is
done at the end of the job, after the archive
creation is known to be successful, and I
wouldn't expect any non-delete activity to be
affected by the order. So I'm not amazed.

> The job only has the list file open so far
> [...]

It's "Scanning files ...". Zip 3.0 says so:

ALP $ zip3l fred.zip dka0:[utility.source.zip.zip31*...]*.*
Scanning files ................ ......
adding: [.utility.source.zip.zip31a.aosvs] (stored 0%)
[...]

I'd bet that the disk is busy, even with no
open files.

> Are all the files in one directory?

Did we ever get an answer to that?
Art Wiens
Respected Contributor

Re: ZIP performance

I didn't expect any difference up front without the /MOVE, but I was "hoping" for some relief with the reverse-sorted list ... none so far, but it's early in the game. Yes, all the files are in one directory.

The next attempt will be to split the files into subdirectories of 10,000 files each, as suggested by Hein.

Cheers,
Art
Hoff
Honored Contributor

Re: ZIP performance

With the files involved stored out on a (fossil) HSG80, find a (less-fossil) AlphaServer box with some spare cycles, MOUNT the disk, and run your zip from there. It'll still hammer on the disk, but...