
File sizing question

 
Hein van den Heuvel
Honored Contributor

Re: File sizing question

>>> I found one reason: bucket size: the original file had a bucket size of 35 on each area and the new file a bucket size of 3....

Ah! That'll get you. Could that have been caused by an edit/fdl/noint on input data from a small test file? If so, then you just want to change the record count in the input and re-run.

35 is a somewhat suspect number also. It could be just right, of course. It could also be an 'accident' due to a cluster size problem with edit/fdl/nointer. It could have been based on the old VCC IO size cut-off. Worth reviewing!
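
If the FDL did come from a small test file, one way to regenerate it against the real data is roughly this (file names are placeholders; a sketch, not a recipe):

$ ! Capture the real file's statistics, then let EDIT/FDL re-optimize
$ ! the design non-interactively from that analysis.
$ ANALYZE/RMS_FILE/FDL/OUTPUT=BIGFILE_STATS.FDL BIGFILE.IDX
$ EDIT/FDL/ANALYSIS=BIGFILE_STATS.FDL/NOINTERACTIVE BIGFILE_DESIGN.FDL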

Thanks for the follow-up though. The requirement is much clearer now. It brought out that 'history first' suggestion.
I like that a lot, and you may want to put a safe and sound procedure in place around it.

The reason I like it a lot is that it will:
- create a much cleaner file
- give you much more confidence that the final file will be right, and that it will even fit!
- potentially reduce the downtime further
- reduce the total resources significantly.

My cut at Bob's approach:

- Make the conversion program read the original file sequentially (with RRL+NLK, or better still the new NQL option). No need for extra buffers, as sequential $GETs on an indexed file just read a bucket at a time.
There is no need to move the records from the indexed file to a sequential file first.
- Make the conversion program output to a sequential file with a large (120 or 96 or so) block size, WBH, and 2 buffers.
No, scratch that.
Make it write, unshared or with DFW (deferred write), to an indexed file with bucket size 63, all the compression options, and just a primary key, on a temp disk. Just a few (10 - 20) buffers will do.
- Convert the transformed file to the target location (a sketch of the interim FDL and the final convert follows below).
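
A hedged sketch of that interim file and the final convert. The names, record size, allocation, and key layout are all assumptions; adjust them to the real file. TEMP.FDL could look like:

FILE
        ORGANIZATION            indexed

RECORD
        FORMAT                  variable
        SIZE                    200

AREA 0
        ALLOCATION              1000000
        BUCKET_SIZE             63
        EXTENSION               65535

KEY 0
        DUPLICATES              no
        DATA_KEY_COMPRESSION    yes
        DATA_RECORD_COMPRESSION yes
        INDEX_COMPRESSION       yes
        SEG0_LENGTH             8
        SEG0_POSITION           0
        TYPE                    string

Then, since the interim file is already in primary key order, the sort can be skipped:

$ CONVERT/FDL=TARGET.FDL/NOSORT TEMP.IDX TARGET.IDX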

- create a modified transform program that can compare the original with the existing file (a sketch follows below):
o read old, transform
o compare transformed with corresponding record in new file
oo if equal (99%): NEXT record
oo if non-existent: $PUT
oo if different: $UPDATE
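
A minimal DCL sketch of that compare-and-repair loop. Everything here is illustrative: the file names, the 8-byte key at offset 0, and the F$EDIT stand-in for the real transform. A production version would more likely be a 3GL program issuing $GET/$PUT/$UPDATE directly.

$ OPEN/READ OLDF OLD_DATA.IDX
$ OPEN/READ/WRITE NEWF NEW_DATA.IDX
$LOOP:
$ READ/END_OF_FILE=DONE OLDF OLDREC
$ XREC = F$EDIT(OLDREC,"UPCASE")          ! stand-in for the real transform
$ KEY = F$EXTRACT(0,8,XREC)               ! assumed: 8-byte primary key at offset 0
$ READ/KEY="''KEY'"/ERROR=MISSING NEWF NEWREC
$ IF NEWREC .EQS. XREC THEN GOTO LOOP     ! equal (the 99% case): next record
$ WRITE/UPDATE NEWF XREC                  ! different: $UPDATE in place
$ GOTO LOOP
$MISSING:
$ WRITE NEWF XREC                         ! non-existent: $PUT a new record
$ GOTO LOOP
$DONE:
$ CLOSE OLDF
$ CLOSE NEWF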

- The comparison will be fast, as both files are in the same order.
- Updating is generally easier than creating a fresh record.
- Inserting will be easier, as the index structures will be 99% in place.
- Inserting is LIKELY at the 'far right', which is easy.

In the 'new data first' plan, with back-filling of the old data, those inserts will be out of order, causing bucket splits all the time in areas of the file where a dense load is preferred.


Finally... as a general observation, not for the purpose of this exercise...
Would it not be great if some applications could partition their data into an old and a new range? Either with a fixed cutoff (date, item-number) or a dynamic one (look in new first; if not found, look in old). A little bit like a VMS search list works: create new files in the first directory, but look for older files in all directories on the list.
Doing so, you could 'freeze' the bulk of your data (90%? 99%?). That chunk would no longer need to be backed up or converted,
or only at a much lower rate. (AI journalling could further reduce the need for re-backup.)
The new data, which needs more frequent backups and converts, would be much smaller and easier to deal with.
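
For illustration, a search list doing exactly that; the logical and directory names are made up:

$ ! New files are created in the first directory; lookups scan both.
$ DEFINE APP_DATA DISK$FAST:[APP.CURRENT], DISK$ARCHIVE:[APP.HISTORY]
$ ! An open of APP_DATA:ITEM.DAT finds it in CURRENT first, else in HISTORY.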

Cheers,
Hein.
Robert Gezelter
Honored Contributor

Re: File sizing question

Hein,

A footnote on your recent comment. I have seen cases, particularly ones such as this that are extremely IO bound, where increasing BUFFERCOUNT and BUFFERSIZE substantially improves throughput, hence my recommendation.

Even using files with large extents, I have seen this effect.
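
One process-wide way to experiment with those values from DCL, as a starting point only (the numbers are illustrative and need tuning per workload):

$ SET RMS_DEFAULT/INDEXED/BUFFER_COUNT=50
$ SET RMS_DEFAULT/SEQUENTIAL/BLOCK_COUNT=127/BUFFER_COUNT=4
$ SHOW RMS_DEFAULT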

- Bob Gezelter, http://www.rlgsc.com
Willem Grooters
Honored Contributor

Re: File sizing question

Suggestions taken:
* Create an FDL of the file and take a good look at the area sizing.
* Update the sizing parameters of the new file according to these numbers.
* CONVERT/FDL the file.

This did the trick. The convert could be measured in minutes instead of hours.
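
For reference, that sequence in DCL form (file names assumed):

$ ANALYZE/RMS_FILE/FDL/OUTPUT=CURRENT.FDL BIGFILE.IDX   ! capture the real sizing
$ EDIT/FDL CURRENT.FDL                                  ! review/adjust the areas
$ CONVERT/FDL=CURRENT.FDL/STATISTICS BIGFILE.IDX NEWFILE.IDX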

The recommendation has been added to the manuals.

Willem Grooters
OpenVMS Developer & System Manager