Re: How to compress/"truncate" Prolog 3 files.

Lars Funck Jensen_1 · ‎07-08-2006

I have a large number (100,000+) of Prolog 3 RMS files that are no longer updated but will be accessed read-only.
The files are 500 KB - 5 MB in size each and have a rather simple key structure.

I would like to convert the files, remove all unneeded space and do a 'light' compress of the data/index. The converted files will have VMS security 'R'.
Most of the files are from VAXes and they are moved to an Integrity server, so CPU should not be a problem.

How can I best automate the generation of the needed FDL?

Jan van den Ende · ‎07-08-2006

Lars,

seeing the sheer number of files I understand the need to automate!

My plan would be a procedure that:

1. Get next file to process.

2. ANA/RMS/FDL/OUTP=xxx

3. ANA/RMS/STAT/OUTP=yyy

4. Scan yyy for the data fill percentage.

5. Edit xxx: set ALLOCATION to the current ALLOCATION times the fill percentage found in (4).

( If you need extra performance: calculate AREA_0 and AREA_1 to split up ALLOCATION. The ratio is the TOTAL length of ALL keys vs. the record length.
Put DATA_AREA as AREA_1 and INDEX_AREA as AREA_0. )

6. Convert the file using the constructed FDL.
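
Steps 4 and 5 above come down to simple arithmetic. A minimal sketch in Python, assuming the fill percentage and the ALLOCATION value have already been extracted from the ANALYZE output; the function name and the sample numbers are invented for illustration:

```python
import math

def scaled_allocation(allocation_blocks: int, fill_percent: float) -> int:
    """Estimate the blocks needed once the buckets are packed full.

    allocation_blocks: ALLOCATION value taken from the generated FDL.
    fill_percent: data fill percentage found in the statistics output.
    """
    if not 0 < fill_percent <= 100:
        raise ValueError("fill_percent must be in (0, 100]")
    # Round up so the estimate is never smaller than the scaled value.
    return math.ceil(allocation_blocks * fill_percent / 100)

# Hypothetical file: 10,000 blocks allocated, buckets 60% full.
print(scaled_allocation(10_000, 60))
```

The result is only a starting value for ALLOCATION; if it turns out too small, the file simply extends during the convert.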

-- Expect this to be a LENGTHY process!

If you need those files readable during the process, I would advise the READs to use a SEARCHLIST over (new,old) environment, using /SHARE on the convert to create the file with a temporary name in the new location, and renaming that to the real name before proceeding to the next file.

Hein: please add to or change any of the above if you see reason to!

I leave the details of writing (and testing) the procedure to you.

success, and patience!

Proost.

Have one on me.

jpe

Don't rust yours pelled jacker to fine doll missed aches.

Robert Gezelter · ‎07-08-2006

Lars,

Jan is correct, but I will add my US$ 0.02.

You will also need to use the F$SEARCH lexical function to iterate through the 100,000 files, once for every disk (F$SEARCH has no way to wildcard the device name, unless you use a search list in the logical name).

- Bob Gezelter, http://www

Lars Funck Jensen_1 · ‎07-08-2006

I was hoping for a little less editing work!

Or making a standard FDL and for each file just add a few lines to the FDL, based on some info from the existing file ( ana/rms, +?, +?).

I don't need run time performance and as the files are already (used as) readonly I can convert offline.
But I would like each file to be small. This makes it much faster to move all of the files for archiving, backup or restore etc.
100 GB of internal filler space would be nice to eliminate.
A little data compress may not hurt.

PS! For the DCL coding part, I will be fine.

Jan van den Ende · ‎07-09-2006

Lars,

>> I don't need run time performance and as the files are already (used as) readonly I can convert offline.
>> But I would like each file to be small

So, data compression and key compression are what should help.
Are the files NOW using compression? Then you might decrease the calculated allocation by another 10 %, and accept any extension if needed.
If you are now NOT using compression, change the FDLs to use compression, and start with allocating 25 %.
Perhaps afterward you might investigate for excessive fragmentation, and cope with that.

hth,

Proost.

Have one on me.

jpe

Don't rust yours pelled jacker to fine doll missed aches.

Hein van den Heuvel · ‎07-09-2006

>> I have a large number (100,000+) of Prolog 3 RMS files that are no longer updated but will be accessed read-only.

As others indicated, automation is in order.

>> The files are 500 KB - 5 MB in size each and have a rather simple key structure.

Do you need the key structure to be retained at all?
If not, just ZIP it up. The RMS indexed file compression options are too limited to be truly effective in general.

How about dropping down to the primary key only?

>I would like to convert the files, remove all unneeded space

- Be sure to edit the FDLs with FILL=100 everywhere.

- Be sure to edit the FDLs with BUCKETSIZE=63 for
the data area, and maybe just everywhere. This will:
--- reduce bucket overhead (duh!)
--- reduce wasted space at the end of each bucket, as records have to fit entirely.
--- avoid the need to reset key compression at the start of each bucket.
--- reduce the number of index entries.

- Automate the edit with TPU or PERL.
I have some Perl that does largely what you want, and may be able to find time to change it to do exactly what you need.
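
The point about records having to fit entirely in a bucket can be made concrete with a little arithmetic. The Python sketch below is an illustration only: the 15 bytes of per-bucket overhead and the 300-byte record size are assumed figures, not values from this thread, and per-record overhead and compression effects are ignored.

```python
BLOCK_BYTES = 512  # size of one disk block

def bucket_usage(bucket_blocks: int, record_bytes: int, overhead_bytes: int = 15):
    """Return (records_per_bucket, unused_bytes) for fixed-size records.

    overhead_bytes is an ASSUMED per-bucket overhead; adjust as needed.
    unused_bytes counts everything in the bucket that is not record data.
    """
    total = bucket_blocks * BLOCK_BYTES
    # Records cannot span buckets, so only whole records count.
    records = (total - overhead_bytes) // record_bytes
    unused = total - records * record_bytes
    return records, unused

# Compare a small and a large bucket for hypothetical 300-byte records.
for blocks in (2, 63):
    records, unused = bucket_usage(blocks, 300)
    share = 100 * unused / (blocks * BLOCK_BYTES)
    print(f"{blocks:2d} blocks: {records} records, {share:.1f}% not record data")
```

With these assumed numbers the fixed overhead and the unusable tail are spread over far more records in the 63-block bucket, which is the effect described above.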

> do a 'light' compress of the data/index.

RMS only has a light compress.

You specifically want DATA and KEY compression. Index compression is often less effective, costs more overhead (linear search) and there is simply much less to compress.

>> How can I best automate the generation of the needed FDL?

Perl!.. or DCL.

Jan wrote >>> ( if you need extra performance: calculate AREA_0 and AREA_1 to split up ALLOCATION. The ratio is the TOTAL length of ALL keys vs. record length.

For a static file there is really no performance benefit to be expected from multiple areas, and multiple areas cause multiple round-ups in allocation and extents.

Jan >> Put DATA_AREA as AREA_1 and INDEX_AREA as AREA_0)

I like that... but only when executed to the last detail. It requires disk cluster size control and reduced bucket sizes, to multiples of 16 or 8: not 63.

Jan>>. -- Expect this to be a LENGTHY process!

Nah.

Jan>> If you need those files readable during the process, I would advise the READs to use a SEARCHLIST over (new,old) environment, using /SHARE on the convert to create the file with a temporary name in the new location, and renaming that to the real name before proceeding to the next file.

If the stuff is live, yeah, you need something like that.
For low access, read-only, a BACKUP/IGNORE=INTERLOCK is probably acceptable, notably if you follow it with ANAL/RMS and/or CONVERT.

>> I was hoping for a little less editing work!

Automate it!

>> Or making a standard FDL and for each file just add a few lines to the FDL, based on some info from the existing file ( ana/rms, +?, +?).

I have some tools already published that may help:
VMS FREEWARE V5/V6 [RMSTOOLS]
- FDL$GENERATE to quickly generate basic FDL's
- BACKUP_INDEXED_FILE... it does the 63 block bucket, primary key only thing, and calls the convert for you!

>> I don't need run time performance and as the files are already (used as) readonly I can convert offline.

Too easy!

>> But I would like each file to be small. This makes it much faster to move all of the files for archiving, backup or restore etc.

>> 100 GB of internal filler space would be nice to eliminate.

Oh yeah! $50 saved :-).

>> A little data compress may not hurt.

RMS only removes strings of more than 6 repeating characters. No bit-patterns like ZIP / LV
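
To get a feel for how little that can save on a given record, one can count the bytes that sit in long runs of a repeated character. The Python sketch below is a rough estimator only, not RMS's actual encoding; reading "more than 6" as "7 or more" is an assumption, and the sample record is invented.

```python
from itertools import groupby

def bytes_in_long_runs(record: bytes, min_run: int = 7) -> int:
    """Count bytes that belong to runs of at least min_run identical bytes.

    min_run=7 is an ASSUMED reading of "more than 6 repeating characters".
    """
    run_lengths = (len(list(group)) for _, group in groupby(record))
    return sum(length for length in run_lengths if length >= min_run)

# Invented record: a name padded with spaces, then a zero-filled field.
record = b"JENSEN" + b" " * 24 + b"12345" + b"\x00" * 10
print(bytes_in_long_runs(record), "of", len(record), "bytes are in long runs")
```

Records made mostly of varied text or binary values will show few such runs, which is where a general-purpose archiver does better.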

Jan >> you might decrease the calculated allocation by another 10 %, and accept any extension if needed.

Yes.

jan> If you are now NOT using compression, change the FDLs to use compression, and start with allocating 25 %.

yes: + extent 5%

>> Perhaps afterward you might investigate for excessive fragmentation, and cope with that.

Start on an empty/unfragmented disk and just run one at a time and all will be fine.

Hth,
Hein.

Hein van den Heuvel · ‎07-09-2006

Lars,

Attached is a Perl script that I use to bulk-edit FDL files. It already had (too?) many options, and I added a 'DENSE' option which should largely suit your purpose.

Use it as:

$perl my_fdl_edit.pl DENSE CLEAN < old.fdl > new.fdl

old.fdl would come from FDL$GENERATE or ANAL/RMS

You could use GROW=-25 to further reduce allocation by a percentage (25 in this example).

Help text below.

hth,
Hein.

Usage: perl MY_FDL_EDIT.pl option [options] < old.fdl > new.fd
Version: 4-Jul-2006
Options:

CLEAN       Remove miscellaneous FDL$GENERATE items
NOSIZE      Remove ALLOCATION and EXTENSION
DF[m]=nn    Data Fill for area [m] or all
IF[m]=nn    Index Fill for area [m] or all
ICMIN=nn    Minimum Key Size for Index Compression
GROW[m]=nn  Percent Change for area [m] allocation
ALIGN       Align data area on cluster boundary
EVEN        Select bucket sizes with 8K factors
DENSE[=nn]  Minimum space [bucket size - default=64 ]
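
For readers without the attachment, the kind of bulk edit discussed in this thread can be sketched in a few lines of Python. This is not the Perl script above and does not reproduce its options; it is a hypothetical illustration that forces the fill factors to 100 and the bucket size to 63 in FDL text, leaving every other line untouched. The function name and the sample FDL fragment are invented.

```python
import re

# Attribute -> value forced by this sketch. The values follow the advice
# given earlier in the thread (full fill, 63-block buckets).
FORCED = {"DATA_FILL": "100", "INDEX_FILL": "100", "BUCKET_SIZE": "63"}

# indent, attribute name, whitespace gap, current value
LINE = re.compile(r"^(\s*)([A-Za-z_]+)(\s+)(\S.*)$")

def densify_fdl(text: str) -> str:
    """Return FDL text with the FORCED attributes overwritten."""
    out = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m and m.group(2).upper() in FORCED:
            indent, name, gap, _old_value = m.groups()
            line = f"{indent}{name}{gap}{FORCED[name.upper()]}"
        out.append(line)
    return "\n".join(out) + "\n"

# Invented FDL fragment for demonstration.
sample = """AREA 0
    ALLOCATION              1000
    BUCKET_SIZE             4
KEY 0
    DATA_FILL               80
    INDEX_FILL              80
"""
print(densify_fdl(sample), end="")
```

A real run would read the FDL produced by ANAL/RMS or FDL$GENERATE, write the edited copy, and hand that to CONVERT.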

Hein van den Heuvel · ‎07-09-2006

Sorry, fat fingered the attach.
Trying again.
Hein.
