Showing results for 
Search instead for 
Did you mean: 

How to find out Required Space

Frequent Advisor

How to find out Required Space

We are going to insert 5.75 million records.
How to estimate an approximate space required for these no of records.

We are using Oracle 8i.

A. Clay Stephenson
Acclaimed Contributor

Re: How to find out Required Space

Well, 5.75 X 10^^6 X Record_Size gets you close but then you have to add overhead for any indices. Index use is a very tricky calculation and could even depend upon the order of record insertion because of the way the B-trees might balance. If you have several indices, the index data usage could actually exceed that of base data.

The best way to answer your question is to do a test insertion of perhaps 5% of your data and measure before and after. Delete the test and then do the real thing.

If it ain't broke, I can fix that.
Jean-Luc Oudart
Honored Contributor

Re: How to find out Required Space

One could say "it depends..."

You can find some very interesting formula on the web (oracle doc...)

My apporach would be to upload 100,000 rows and extrapolate.

Will you be using SQL*Loader ?

fiat lux
Frequent Advisor

Re: How to find out Required Space

Is that possible i can find out the size of a particuler table before and after insertion rather than size of table space (cos some other users also mite be inserting to other tables at same time).

For data and indexes we use 2 seperate tablespaces.So how to handle this?

Jean-Luc Oudart
Honored Contributor

Re: How to find out Required Space

try this

set echo off
set serveroutput on
set verify off
accept v_user prompt 'Schema: '
accept v_object prompt 'Object: '
accept v_object_type prompt 'Object Type: '
VAR1 number;
VAR2 number;
VAR3 number;
VAR4 number;
VAR5 number;
VAR6 number;
VAR7 number;
VAR8 number;
vtable varchar2(30);

VAR2 := (VAR2)/1024/1024;
VAR4 := (VAR4)/1024/1024;
VAR8 := (VAR2-VAR4);
dbms_output.put_line('OBJECT_NAME = '|| upper('&v_object'));
dbms_output.put_line('TOTAL_BLOCKS = '||VAR1);
dbms_output.put_line('TOTAL_MBYTES = '||VAR2);
dbms_output.put_line('UNUSED_BLOCKS = '||VAR3);
dbms_output.put_line('UNUSED_MBYTES = '||VAR4);
dbms_output.put_line('LAST_USED_EXTENT_FILE_ID = '||VAR5);
dbms_output.put_line('LAST_USED_EXTENT_BLOCK_ID = '||VAR6);
dbms_output.put_line('LAST_USED_BLOCK = '||VAR7);
dbms_output.put_line('TOTAL_USED_MBYTES = '||VAR8);
set verify on
set echo on
fiat lux
Frequent Advisor

Re: How to find out Required Space

This is the result.

TOTAL_MBYTES = 771.11328125
UNUSED_MBYTES = 39.23828125
Jean-Luc Oudart
Honored Contributor

Re: How to find out Required Space

Therefore if you run the script for all required objects before and after you will know the space used from this specific upload (assuming nobody else is updating the same table(s)).
If I undertand you want to know what is the space required for the full operation , I would suggest you run a small one (100,000) in a test environment with these scripts run before/after.

If you can then run the full upload in this test environment, the better.
You will know the space usage definition for this process and also an idea for the elapsed time.

fiat lux
Volker Borowski
Honored Contributor

Re: How to find out Required Space


this sounds big.
Several things to account:
- data is stored in blocks
- index-data is stored in blocks
- each block has block-overhead (header+itl-slots)
- each block has freespace to serve updates
- each block has row overhead per row (block-row-directory)

Depending on your application needs, it might be worth to fill a block initially only by 60%-80% to avoid loss of performance for chained/migrated rows during later updates. This will cost space, that you do not need initially, but will populate later.

On the other hand, if you never ever update any record in this table, so it never will grow, if you reserve 20% of space for data-updates you will have 20% loss of performance by design, because you can only read and write block-wise.

Check the ORACLE-Documentation for the BLOCK-design, understand PCT_FREE / PCT_USED / INITRANS and calculate your min. / max. / average row length.

I just designed an EXCEL-Sheet to calculate this, but I have no access to it today. I'll try to attach it later.

If you have a single thread, insert only, never update table, you could go for PCT_FREE=0 and initrans=1 for best and most effective use of blocks.

If you use 8k block size and expect an average row of 500 Bytes to grow up to 600 Bytes, you should set PCT_FREE ~ 17%. (Roughly calculated this way:
~ 7900 Byte for data / 600 Byte = 13 rows will fit into this block when fully updated
(Be aware, that 7900 available bytes for data may differ depending on the value of INITRANS and the number of fields in the table-structure)
13 rows * 500 bytes initial size = 6500 bytes sould be insertable therfore into this block before it goes of the free-list.

6500 / 7900 gives you a ratio of 82% used space. The next insert to this block should be avoided, because otherwise you will have some children leaving home upon later updates. So PCT_FREE = 17 will pull this block of the freelist and give row 14 a new block, and leave enough space for your update to 600 byte in each row.

If you plan for massive parallel DML processing, go for a higher INITRANS value to avoid DEADLOCKs. If you increase INITRANS (24Byte for each slot), keep in mind that your overall space in the block becomes smaller so 20% of PCT_FREE might not be enough (or too big) any more (depending on if the space usage either kicks an additional row out of the block or not).

Index block calculation is roughly the same but just for the index fields and a bigger overhead for the index-tree-administration.

OK this entire stuff is completely theoretically, because in real life you always have a various mix of rowlength and may be structure changes or later added parallel processing. In this case you might not get away with later adjustment and reorganisation.
But somehow you have to start.

Usually nobody cares about block design. Usually it does not make sense for each table just because of the data involved. If a 1MB table is stored with 10% freespace by bad design, who cares.
If you store a 1TB table with 10% freespace by bad design, you surely waste 100 GB of expensive Storage-Disk-Memory which might be mirrored and RAIDed and whatever.

Hope this helps