
sql loader performance

 
Edgar_10
Frequent Advisor

sql loader performance

Hi,

I have a SQL*Loader performance problem: loading 210 files (14,500 rows each) takes approximately 12 hours.
Any ideas how I could bring the load time down?
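
To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above and assumes the whole 12 hours is spent loading.

```python
# Figures quoted in the post above; nothing else is assumed.
files = 210
rows_per_file = 14_500
window_hours = 12

window_seconds = window_hours * 3600
total_rows = files * rows_per_file
rows_per_second = total_rows / window_seconds
seconds_per_file = window_seconds / files

print(f"total rows:       {total_rows:,}")
print(f"rows per second:  {rows_per_second:.1f}")
print(f"seconds per file: {seconds_per_file:.0f}")
```

That works out to roughly 3 million rows at about 70 rows per second, or a little over three minutes per file.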

The target tables are indexed and partitioned, with NOLOGGING enabled; the load method is conventional.

Thanks in advance.
Tom Geudens
Honored Contributor

Re: sql loader performance

Hi Dean,
I think we need more information :
- type of system
- type of database
- ...

For example, for Oracle I would suggest a "direct" load (and rebuilding the indexes afterwards).

Regards,
Tom Geudens
A life ? Cool ! Where can I download one of those from ?
twang
Honored Contributor

Re: sql loader performance

If you want to optimize performance for SQL*Loader here are a few things
to consider (for direct and conventional paths):


o Make logical record processing efficient.

- One-to-one mapping of physical to logical records. Avoid continueif
and concatenate.

- Make it easy for the software to figure out physical record
boundaries. That is, use the file processing option string "FIX
nnn" or "VAR". If you use the default (stream mode), on most
platforms (e.g., UNIX) SQL*Loader has to scan each physical record
for the terminating newline character.


o Make field setting efficient.

Field setting is the process of mapping the "fields" in the datafile
to their corresponding columns in the database. The mapping function
is controlled by the description of the fields in the control file.
Field setting is the biggest consumer of CPU time for most loads.

- Avoid delimited fields; use positional fields. If you use
delimited fields, SQL*Loader has to scan the input data looking
for the delimiter(s)/enclosure(s). If you use positional fields,
SQL*Loader just increments a pointer to get to the next field
(very fast).

- If you are using positional fields, avoid trimming white space.
That is, use PRESERVE BLANKS.


Note that a common theme in the first two points above is to avoid scanning
the input data.
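
A delimited extract with empty fields can still be turned into positional records by padding every field to a fixed width before the load. The following is a minimal Python sketch of such a preprocessing step; it is not part of SQL*Loader. The column names and widths in LAYOUT are hypothetical, and it assumes a single-byte character set so that character counts equal byte counts.

```python
import csv
import io

# Hypothetical layout: column names and widths are placeholders.
LAYOUT = [("id", 8), ("name", 20), ("amount", 12)]

def to_fixed(fields, layout=LAYOUT):
    """Pad each field to its fixed width; empty fields become blanks."""
    if len(fields) != len(layout):
        raise ValueError(f"expected {len(layout)} fields, got {len(fields)}")
    out = []
    for value, (name, width) in zip(fields, layout):
        if len(value) > width:
            raise ValueError(f"{name!r} value {value!r} exceeds width {width}")
        out.append(value.ljust(width))
    return "".join(out)

def position_clauses(layout=LAYOUT):
    """Build POSITION(start:end) field specifications for the same layout."""
    clauses, start = [], 1
    for name, width in layout:
        end = start + width - 1
        clauses.append(f"{name} POSITION({start}:{end}) CHAR")
        start = end + 1
    return clauses

def convert(csv_text, layout=LAYOUT):
    """Convert CSV text into a list of fixed-width records."""
    return [to_fixed(row, layout) for row in csv.reader(io.StringIO(csv_text))]

if __name__ == "__main__":
    # The second row has two empty fields; they are simply padded with blanks.
    for record in convert("1,Smith,10.50\n2,,\n"):
        print(repr(record))
    print(",\n".join(position_clauses()))
```

Whether the extra pass over the data pays for itself depends on the load, so it is worth benchmarking on a few files first.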


o Make conversions efficient.

There are several conversions that SQL*Loader does for you;
character set conversions and datatype conversions.

- Avoid character set conversions if you can. SQL*Loader supports
three character sets:

a) Client character set (NLS_LANG of the sqlldr process.)
b) Server character set.
c) Datafile character set.

Performance is optimized if all three are the same, most importantly
b) and c). Also, memory for character set conversion buffers is not
allocated if these are the same.

- Avoid multi-byte character sets if you can.

- As for datatype conversions (SQL*Loader datatype to database column
datatype), char to char is efficient if the same character set is in
use for the datafile and the server. That is, no conversion is fast.
Therefore, try to minimize the number of conversions that you have
to do.


o If you can, use the "unrecoverable" option on direct path loads.


o Even for conventional path loads, always run SQL*Loader directly on the server rather than across a network.
Edgar_10
Frequent Advisor

Re: sql loader performance

Hi Tom,

The server is a Sun server running Oracle 8.1.7.0.0. I don't want to use direct mode since I have daily partitions and indexes, as well as users querying the tables.

Thanks!
Edgar_10
Frequent Advisor

Re: sql loader performance

Hi Twang,

Thanks for the input. The data to be loaded is extracted from a CSV file and not all the fields within a row are populated, so using positional fields is not viable. Do you know the syntax for specifying "NOLOGGING" for a SQL*Loader load?

Thanks
Yogeeraj_1
Honored Contributor

Re: sql loader performance

hi,

Sounds like

o archives filled up or
o checkpoint not complete (you can look at the alert log to check those, look for cannot allocate new log)
o there is a unique index on the table and something else is playing with the table -- blocking you (they inserted a row you wanted to insert)
o the table is locked (is it a child table in a foreign key relationship without indexes on the fkey?)

Are you using locally managed tablespaces?

hope this helps!

regards
Yogeeraj
No person was ever honoured for what he received. Honour has been the reward for what he gave (Calvin Coolidge)
Edgar_10
Frequent Advisor

Re: sql loader performance

Hi Yogeeraj,

Thanks for the input. The database is running in "NOARCHIVELOG" mode. The tables have non-unique indexes and there are no foreign keys on the tables.

Thanks.
twang
Honored Contributor

Re: sql loader performance

The SQL*Loader operation that can make use of NOLOGGING mode is the direct path load:
% sqlldr userid=/ control=load.ctl direct=true

This is a summary of the features and restrictions of the DIRECT PATH option:
-Cannot load clustered tables.

-There can be no active transactions in the loaded tables.

-There can be no SQL functions used in the control file.

-If the target table is indexed, then no SELECT statements may be issued
against the table during the load.

Tom Geudens
Honored Contributor

Re: sql loader performance

Hi again,
A data warehouse by any chance ;-) ? OK, I understand the "qualifications". It is, however, always a tradeoff. If you're not allowed to use the tricks, your users will have to wait longer before the data is available. If you can use the tricks, there'll be a small downtime for your users.

Here's what I would do (given your options). I believe (check this information) you can set the indexes to "unusable". In that state they are not updated during a load (reducing the overhead of the load). Your users will get "hit" too, since their queries will perform a lot worse. After the loads you rebuild the indexes.

Hope this helps,
Tom Geudens
A life ? Cool ! Where can I download one of those from ?
Edgar_10
Frequent Advisor

Re: sql loader performance

Hi Tom,

Thanks for the input, and yes, it's sort of a data warehouse database that for 1 day could contain approx. 20 million records. I will test the load with the indexes disabled and assess the load time, despite the impact users will experience.

Regards
Tom Geudens
Honored Contributor

Re: sql loader performance

Ok Dean go for it ... and keep us posted !

Regards,
Tom Geudens

P.S. You might want to take this up with your management / end users. We did this and introduced SLAs (Service Level Agreements) that allow both sides (not only the end-user side) to have a say in how things should run.
A life ? Cool ! Where can I download one of those from ?
Edgar_10
Frequent Advisor

Re: sql loader performance

Hi Tom,

There was no performance improvement after altering the indexes to UNUSABLE! So what I did was drop the indexes, and the performance is vastly improved. I will see how long it takes to load an hour's worth of data and how long it takes to create the index.

Thanks!
Yogeeraj_1
Honored Contributor

Re: sql loader performance

hi again,

Concerning the "rebuilding" of indexes:

Note that you can speed up the index rebuild/create with the "parallel" option (and, in an archive log mode database, the "nologging" option).


I would not drop the indexes though! Better set them unusable: alter index t_idx2 unusable;

Benchmark to see for yourself..

If I use rebuild (meaning I set them unusable), I cannot lose them. I cannot accidentally have one "go missing". People will call and say "hey, I'm getting a strange error about index such and such being unusable -- what's up with that".

If I use DROP / CREATE -- what will happen if the DROP succeeds, but the create fails for whatever reason (script bombs out, out of temp, whatever). Index has gone missing -- no one checks the script log -- and then you spend the next day or two trying to figure out why performance has gone down the tubes ("oh they say the next day, we 'lost' an index").

So, if you use unusable/rebuild -- you'll not lose an index, you'll be told your script bombed, you fix it and the system runs smoothly (except for that minor annoyed person who hit your error)

If you use drop/create -- you'll lose an index someday (maybe many days). You'll not be told your script bombed. Your users will just experience slower response times (if they get the answer back at all), you'll be flooded with calls about "slowness" and then hopefully you can actually track down the index that has gone missing this time.


You may also consider:
- increasing the sort_area_size if not already.

- increasing db_file_multiblock_read_count, if not at OS limits already.

- make sure the data you are reading is spread out across multiple devices (else you might just be introducing massive disk contention).

- speed up your disk access -- db file scattered read is sequential IO (believe it or not).


hope this helps too!
regards
Yogeeraj
No person was ever honoured for what he received. Honour has been the reward for what he gave (Calvin Coolidge)
Alexander M. Ermes
Honored Contributor

Re: sql loader performance

Hi there.
Another idea could be like this.
Create a smaller database independent from
your production.
Load the data into it. At this point you can do some checks and correct errors.
Then insert the data into production by using
a sql-script.
Rgds
Alexander M. Ermes
.. and all these memories are going to vanish like tears in the rain! final words from Rutger Hauer in "Blade Runner"
Brian Crabtree
Honored Contributor

Re: sql loader performance

Dean, no points for this.

Yogeeraj,

You will want to be careful with db_file_multiblock_read_count for data warehouses. We had a large SAP system that performed horribly after a migration from Informix. We dropped this parameter from 32 to 12, and performance improved considerably. What I hadn't realized is that this parameter governs the amount of data that can be read at a time, but it also has the added effect of making the CBO lower the cost of full table scans (FTS), especially for small tables. We had always treated it as a "more is better" setting, which probably had detrimental effects in some cases.

Thanks,

Brian
Edgar_10
Frequent Advisor

Re: sql loader performance

Hi Yogee,

I tried disabling the indexes before loading, but there was no improvement in the load time. How do you suggest I disable and enable the indexes in a script (syntax)? Say I have 3 months' worth of data, with daily partitions and indexes.

Thanks in advance!

Yogeeraj_1
Honored Contributor

Re: sql loader performance

hi again,

Brian is right! Be cautious with db_file_multiblock_read_count ...


As for syntax:
If you had a table created as:
create table p (i integer)
partition by range(i)
(partition p1 values less than (10),
partition p2 values less than (20),
partition p3 values less than (maxvalue));

create index l on p(i) local;

then, before the load:

alter index l modify partition p1 unusable;

and after the load (a partitioned index cannot be rebuilt as a whole, so rebuild it one partition at a time):

alter index l rebuild partition p1 nologging parallel 6;
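
With three months of daily partitions, the statements can be generated rather than typed by hand. The following is a minimal Python sketch that only prints SQL text for review; it does not connect to the database. The index name SALES_IDX and the P_YYYYMMDD partition naming scheme are hypothetical, so substitute the real index partition names (for a local index created without explicit names they normally match the table partition names).

```python
from datetime import date, timedelta

def daily_partitions(start, days, prefix="P_"):
    """Return hypothetical partition names such as P_20030101, one per day."""
    names = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        names.append(f"{prefix}{day:%Y%m%d}")
    return names

def unusable_statements(index, partitions):
    """Statements to run before the load."""
    return [f"ALTER INDEX {index} MODIFY PARTITION {p} UNUSABLE;"
            for p in partitions]

def rebuild_statements(index, partitions, degree=6):
    """Statements to run after the load, one rebuild per partition."""
    return [f"ALTER INDEX {index} REBUILD PARTITION {p} NOLOGGING PARALLEL {degree};"
            for p in partitions]

if __name__ == "__main__":
    # SALES_IDX and the start date are placeholders for illustration only.
    parts = daily_partitions(date(2003, 1, 1), days=3)
    print("\n".join(unusable_statements("SALES_IDX", parts)))
    print("\n".join(rebuild_statements("SALES_IDX", parts)))
```

Only the partitions that actually receive data need to be touched, so in practice the list would usually cover the day or days being loaded rather than all three months.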


Are you using "locally partitioned indexes" or "globally partitioned indexes"? could that be one of the sources of the "problem"?

regards
Yogeeraj
No person was ever honoured for what he received. Honour has been the reward for what he gave (Calvin Coolidge)
Edgar_10
Frequent Advisor

Re: sql loader performance

Hi Yogee,

Thanks for the pointers. The indexes are locally partitioned. I will test the unusable index suggestion and get back to you.

Regards!