- Community Home
- >
- Servers and Operating Systems
- >
- Operating Systems
- >
- Operating System - OpenVMS
- >
- Re: Determine the actual size of data in an indexe...
Categories
Company
Local Language
Forums
Discussions
Forums
- Data Protection and Retention
- Entry Storage Systems
- Legacy
- Midrange and Enterprise Storage
- Storage Networking
- HPE Nimble Storage
Discussions
Forums
Discussions
Discussions
Discussions
Forums
Discussions
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
- BladeSystem Infrastructure and Application Solutions
- Appliance Servers
- Alpha Servers
- BackOffice Products
- Internet Products
- HPE 9000 and HPE e3000 Servers
- Networking
- Netservers
- Secure OS Software for Linux
- Server Management (Insight Manager 7)
- Windows Server 2003
- Operating System - Tru64 Unix
- ProLiant Deployment and Provisioning
- Linux-Based Community / Regional
- Microsoft System Center Integration
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Community
Resources
Forums
Blogs
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-13-2007 04:16 AM
06-13-2007 04:16 AM
I have an indexed that is converted weekly for space purpose. The convert command file does the ANALYZE/RMS/FDL to generate a FDL file and then optimizes the FDL file with the EDIT/FDL/ ANALYSIS=<>/NONINTERACTIVE <>. It seems that the generated FDL in this process is not the best. Especially, I have good reason to suspect that the initial allocation for areas is far too big. My Question is how can I best determine what is the actual space needed for the initial allocation for this file, how can I determine the actual space the data records and the key occupy inside the file. If I do ANALYZE/RMS/STATI on the file what should I notice in particular? Not much data is added to the file, and no extent seemed to happen recently.
Thanks for the answers.
The FDL produced in the above way looks like:
SYSTEM
SOURCE "OpenVMS"
FILE
CONTIGUOUS no
FILE_MONITORING no
GLOBAL_BUFFER_COUNT 0
NAME "NIGHT$WEEK:[CONVERT]CRDAT.TMP;1"
ORGANIZATION indexed
OWNER
PROTECTION (system:RWED, owner:RWED, group:RE, world:)
RECORD
BLOCK_SPAN yes
CARRIAGE_CONTROL carriage_return
FORMAT fixed
SIZE 224
AREA 0
ALLOCATION 2176168
BEST_TRY_CONTIGUOUS yes
BUCKET_SIZE 12
EXTENSION 65535
AREA 1
ALLOCATION 34008
BEST_TRY_CONTIGUOUS yes
BUCKET_SIZE 12
EXTENSION 8508
KEY 0
CHANGES no
DATA_AREA 0
DATA_FILL 75
DATA_KEY_COMPRESSION no
DATA_RECORD_COMPRESSION no
DUPLICATES no
INDEX_AREA 1
INDEX_COMPRESSION no
INDEX_FILL 75
LEVEL1_INDEX_AREA 1
NAME "key1"
NULL_KEY no
PROLOG 1
SEG0_LENGTH 66
SEG0_POSITION 0
TYPE string
Note how the *_COMPRESSION attributes for the key have "no"
Solved! Go to Solution.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-13-2007 05:00 AM
06-13-2007 05:00 AM
			
				
					
						
							Re: Determine the actual size of data in an indexed file
						
					
					
				
			
		
	
			
	
	
	
	
	
if your using this example of fdl
you can take out the allocations. What happens if you just do a simple convert
file1 file2? It will take out empty pointers, bucket splits etc. I have an optimizer script that I can't lay my hands on at the moment. You'll be pleny of replies, theres some pretty handy rms
folk in here! hope this helps -Dean
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-13-2007 01:53 PM
06-13-2007 01:53 PM
Solution> EDIT/FDL/ ANALYSIS=<>/NONINTERACTIVE <>.
"Optimize" in this context is a misnomer. In order to optimize anything you need to know what aspect you wish to optimize. There's no such thing as "optimal for everything". Generally speaking, improving one thing requires something else be sacrificed. The automated FDL tool tries to balance everything out - in some senses the exact opposite of "optimize".
If you know what you want to optimize, you should create an FDL once and use it each time you CONVERT.
From your description it sounds like you want to optimize for space? That is, you want the file to be as small as possible, even if that means sacrificing performance?
Remember that your FDL need only mention the things you're interested in. To minimize space, turn on compression, choose higher bucket sizes with larger fill factors - this will reduce the amount of empty space left over in buckets. However, it will mean that inserting records will tend to be more expensive, as there's a higher chance of a bucket split (100% if fill factor is 100!). Lookups may also cost more as there will be more sequential searching within buckets. Compression also takes more CPU to compress and decompress data (but in some cases that may be more than compensated for by reduced I/O). The other catch with compression, is some data may not compress well.
To test the extreme case, try this FDL:
AREA 1
BUCKET_SIZE 63
DATA_FILL 100
INDEX_FILL 100
DATA_KEY_COMPRESSION yes
DATA_RECORD_COMPRESSION yes
INDEX_COMPRESSION yes
Now CONVERT your file and look at the size of the resulting file. If you have some kind of benchmark for file performance compare this file with your existing file.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-13-2007 02:00 PM
06-13-2007 02:00 PM
			
				
					
						
							Re: Determine the actual size of data in an indexed file
						
					
					
				
			
		
	
			
	
	
	
	
	
Hmmm... then why ask the system to leave 1/4 of every single databucket empty (75% fill) and not use data key compression.So do you still 'gain' space with the comvert? That would suggest the application is causing excessive bucket splits probably due to localized non-eof inserts.
>> The convert command file does the ANALYZE/RMS/FDL to generate a FDL file and
What a HORRIBLE waste of production hours!
Surely you can predict next weeks record counts from last weeks oonvert stats?
Just tweak the 'data record count' number in the analysis section to accomodate next beek needs. underestimate if you want to make sure not to overallocate for some bizarre reason.
>> then optimizes the FDL file with the EDIT/FDL/ ANALYSIS=<>/NONINTERACTIVE <>. It seems that the generated FDL in this process is not the best. Especially, I have good reason to suspect that the initial allocation for areas is far too big.
Share those reasons!
>> My Question is how can I best determine what is the actual space needed for the initial allocation for this file, how can I determine the actual space the data records and the key occupy inside the file.
Just carefully interpret the record counts, mean length and compression stats in the analys section of the anal/rms fdl file.
>>> If I do ANALYZE/RMS/STATI on the file what should I notice in particular?
As per above... too bad you posted the post edit/fdl output, not the file with analyze data. Please attach as a text file if further help is needed.
>> Not much data is added to the file, and no extent seemed to happen recently.
That's goodness. Be happy! Don't bother generating new FDL's until CONVERT/STAT tell you the number of records increased a bunch
>> The FDL produced in the above way looks like:
Wrong FDL file!
Please check my RMS_TOOLS freeware contribution as:
http://h71000.www7.hp.com/freeware/freeware60/rms_tools/
It has an XLS file to help you ocunt, and a 'tune_check' program
( see its -s=-1 -a option) as will as a silly but effective .BAS count tool.
So why no data key compression?
Just do a sample convert with that enabled and see whether it is effective for the particular data distribution in use
Hope this helps,
Hein van den Heuvel ( @ gmail . com )
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-14-2007 08:00 AM
06-14-2007 08:00 AM
			
				
					
						
							Re: Determine the actual size of data in an indexed file
						
					
					
				
			
		
	
			
	
	
	
	
	
Thank you very much.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-14-2007 10:16 AM
06-14-2007 10:16 AM
			
				
					
						
							Re: Determine the actual size of data in an indexed file
						
					
					
				
			
		
	
			
	
	
	
	
	
Anyway, You are leaving us in suspense here!
Show us that analyze data! (or Email if you think it inappropriate for the forum)
Also, I would recommend you create an fdl along the lines John suggested, or just use your old FDL but remove the allocation and mark data key compression as YES.
Then do a CONVER/NOfast/FDL[/SHARE] file tmp.tmp.
After a minute, abort it (^Y) and do the ANAL/RMS/FDL on the tmp output file.
Now look at that FDL to see the [in]effectiveness of data key compression.
Minus is bad, Single digit positive numbers would make it unimportant and double digit number should strongly suggest to enable data key compression.
Cheers,
Hein.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-19-2007 06:25 AM
06-19-2007 06:25 AM
			
				
					
						
							Re: Determine the actual size of data in an indexed file
						
					
					
				
			
		
	
			
	
	
	
	
	
Sorry I left it suspended. I did the ANALYZE/RMS/STATIST on the file and I am pasting here the important part:
STATISTICS FOR KEY #0
Number of Index Levels: 3
Count of Level 1 Records: 172366
Mean Length of Index Entry: 70
Count of Index Blocks: 31812
Mean Index Bucket Fill: 75%
Count of Data Records: 3458883
Mean Length of Data Record: 224
Count of Data Blocks: 2068392
Mean Data Bucket Fill: 75%
Overall Space Efficiency: 68%
This is 1 month after the last CONVERT. We can see that the file is actually empty in 25% , the buckets remained filled in only 75% after 1 month so records are almost not added here. What will happen if I CONVERT without specifying the ALLOCATION attribute for the AREAs? That will create an output file in the size of the actual data size plus the *FILL percentage (and plus the index structure, control bytes, etc.), I guess, but the file will be highly fragmented without specifying initial allocation, how can I avoid this fragmentation? By COPY/CONTIG after the CONVERT?
Thanks,
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-19-2007 08:23 AM
06-19-2007 08:23 AM
			
				
					
						
							Re: Determine the actual size of data in an indexed file
						
					
					
				
			
		
	
			
	
	
	
	
	
and INDEX_FILL 75 in the KEY 0 section.
Make the 100% for data, 90% for index for the next convert and see how you like it.
Gotta run (battery low)
Maybe more later,
Hein.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-19-2007 08:33 AM
06-19-2007 08:33 AM
			
				
					
						
							Re: Determine the actual size of data in an indexed file
						
					
					
				
			
		
	
			
	
	
	
	
	
you can take out the allocation. as john said, leave in only what your interested in. put in Heins suggestions to, try it
and see if you like it.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-19-2007 12:03 PM
06-19-2007 12:03 PM
			
				
					
						
							Re: Determine the actual size of data in an indexed file
						
					
					
				
			
		
	
			
	
	
	
	
	
>Mean Index Bucket Fill: 75%
...
>Mean Data Bucket Fill: 75%
Excellent! This is exactly what you asked for. If you want 100% fill, just say so. As Hein suggests, leaving index fill at 90% shouldn't cost much, and it will protect you from too many index bucket splits if you add new records.
>if I CONVERT without specifying the
>ALLOCATION attribute for the AREAs? That
>will create an output file in the size of...
Yes, it will give minimal size, potentially fragmented into EXTENSION sized chunks (or whatever the default chain determines).
Instead of leaving ALLOCATION and EXTENSION blank, use your knowledge of the file to give an estimate of what you expect the final size to be. You know the current size, and you know "Overall Space Efficiency: 68%". Thus, if you want absolute minimal space, and also minimising fragmentation, make the allocation 60% of the current size, and set the extension to (say) 10% of the allocation.
Check the RMS stats after your next CONVERT and adjust the numbers accordingly.
