Simpler Navigation for Servers and Operating Systems
Completed: a much simpler Servers and Operating Systems section of the Community. We combined many of the older boards, so you won't have to click through so many levels to get at the information you need. Check the consolidated boards here as many sub-forums are now single boards.
cancel
Showing results for 
Search instead for 
Did you mean: 

Gather stats (Oracle) question

SOLVED
Go to solution
TwoProc
Honored Contributor

Gather stats (Oracle) question

Re: Statistics on columnar histograms and the refreshing of histograms via subsequent analyze statements that don't explicity spell out histogram details.

Example scenario:

I create some new buckets for my histograms on a column on a table.

I choose 7 buckets b/c there are 7 distinct values on this column.

> select count(*),a from mytable group by a;
...
...
This gives me 7 rows returned with a non-even distribution.

> analyze table mytable compute statistics for columns a size 7;

I look in user_histograms and my 7 buckets for column are in there.

Here's the issue:

I've already got some jobs that are going to kick off at certain times that will gather the statistics on this table via:

analyze table mytable estimate statistics (with some percentage value)...

Question(s):

Will the above command refresh my histograms too?

How about if I used "compute statistics" instead of estimate?

What if I add the clause "for all indexes" in combination with either of the above compute/estimate commands.

I'm going to test all of this in a test database before this week is out, but I wanted to know if anyone had these answers all down pat already.

I believe that it is implicit that when I gather stats on a table (using either compute or estimate) that the stats are done on all indexes for that table. Is this true for all columnar histograms for that table too?

Or to put it another way, do I HAVE to specify histogram details in subsequent analyze statements to get good statistics on a table column after I've already created the histogram via the first analyze statement?
We are the people our parents warned us about --Jimmy Buffett
11 REPLIES
Volker Borowski
Honored Contributor
Solution

Re: Gather stats (Oracle) question

Hi,

yes, you need to refresh the histograms by submitting the "for coulmns" clause to the analyze statement, otherwise the histogram will not be updated.

In addition, check carefully, if your application is able to use histogramms. It requires some efford if bind-variables are used.

May be an easier way to refresh histograms is to use dbms.stats !
Can use parallel execution and
method_opt= > 'for all columns size repeat'
might be the one you want to use.

Best regards
Volker
TwoProc
Honored Contributor

Re: Gather stats (Oracle) question

Volker, thanks for your reply.

I was looking at the "for all columns size clause" , but the example I had specified a SINGLE size. So I'd have to say that all columns had the SAME histogram size for a table? Or, would it only run it for the SPECIFIED size in the clause, and then I'd repeat the command for each different size histogram for any column in the table?

We do use dbms_stats, but we schedule it from package fnd_stats.gather_schema_stats or package fnd_stats.gather_table_stats. In there, it makes a call to dbms_stats. Or we use the Oracle Apps concurrent job to schedule it (makes it easier to let concurrent manager queues manage it for us).
We are the people our parents warned us about --Jimmy Buffett
TwoProc
Honored Contributor

Re: Gather stats (Oracle) question

Re: bound variables

In the examples that I've read on using the histograms (asktom.oracle.com) - it appears that he is saying that the histograms only come into play for literal comparisons (ditto cursor parameters) and doesn't work for bound variables. Makes sense, because the execution plan has to be made up before the statement is run, and it doesn't seem to know the contents of the bound variables until it actually run (looks like a great opportunity for Oracle to create speed improvements there by preanalyzing the bound vars).

What do you mean that it takes some effort to have it use bound variables? I didn't think it could, unless we started converting key statements to dynamic sql (and maybe that's what you meant by "some effort").

Does the cost optimizer consider histograms for JOIN conditions as in the same way as :bound_variables? I believe the cost optimizer really can't consider histograms as at execution planning time, it doesn't know the contents - so it can't possibly use the histogram, same as bound vars and parameter values.

Your replies appreciated.
We are the people our parents warned us about --Jimmy Buffett
Steven E. Protter
Exalted Contributor

Re: Gather stats (Oracle) question

What about the oracle stats pack?

Or is this manipulating data collected by it?

Perhaps I'm confused.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Yogeeraj_1
Honored Contributor

Re: Gather stats (Oracle) question

hi john,

the CBO will use histograms upon the first hard parse to find "the best overall general plan" for any query.

in some cases, the cbo may not realize that an indexed column was heavily skewed and as such in exceptional cases, you may choose not to use Bind variables.

this can be proven by appropriate examples...

kind regards
yogeeraj
No person was ever honoured for what he received. Honour has been the reward for what he gave (clavin coolidge)
TwoProc
Honored Contributor

Re: Gather stats (Oracle) question

Steve,

No, I'm not pulling/manipulating stats packs.

Stats pack is used for analysis of performance after the fact.

These stats are those used by the cost optimizer at query time (well at execution plan time), and how it "knows" about indexes, columns of data, etc. What I'm trying to do is create more histograms when indexes are not being chosen, and they should be. If I can let the cost optimizer know that a column is skewed and has SOME data that should be pulled from an index on occasion (like when a column is being filtered with an string literal or a constant number).

Such as

Select * from mytable
where mytable.cola = 'Y'
and mytable.colb = 'ALT'
and mytable.status =1;

In the above cases cola, colb, and status are not evenenly distributed and even though this table has thousands of rows, the number of truths for test of cola is only 10, colb is only 4 and status is 70.

In this case, the index is highly selective FOR THESE values (not necessarily others), and should be used. If I build the histogram correctly the cost optimizer will choose to use the columnar index to speed up the query.

The question was more about how to keep the statistics of the histograms current through various "analyze/compute statistics" commands.

And, since V. alluded to using them in cases of bound variables, I was asking if there were other uses of them besides in statements where data is filtered with literals (above), and in dynamic sql.

Histograms are great, it's just that they can't be used all the time by the cost optimizer.

Yogeeraj, thanks as always on your Oracle input.
We are the people our parents warned us about --Jimmy Buffett
Volker Borowski
Honored Contributor

Re: Gather stats (Oracle) question

Hi John,

well, "it requires some effort" refers to parameters that (!)can(!) be used to make histograms work on statements with bind-variables. These are "cursor_sharing" and "_optim_peek_user_binds".
But they influence the overall behavior of your database as well, so use care.

In addition there was a CBO bug with histograms. This is roughly translated fron SAP Note 797629:
... up to 9.2.0.5 the was a bug which resulted to favor an index with even a single distinct value by a factor of 100 over (other indexes) if it had (whatever) histogramms on it....

This was used to force the use of certain indexes by creating histograms, but has no more effect in 9.2.0.6 and later.

The "STATUS" selectivity is a quite common and good example for use of histograms, esp. if you have a lot of rows and like to select "active" values out of a lot of "processed" items, because a status is usually not evenly distributed.

On the other hand, if this is a rare and single selection, why bother with administration tasks of histograms at all ?
1) Try to use a hint
2) Analyze this table "normaly" and "set" your own statistic with dbms_stats.set... on the column you need.(I have an example in the office, which I cannot access right now) but for the status-example: If you have i.e. 7 disdinct status values on a 100 Mio record table and you like to favor an index on the status-column, just "set" the statistics for this column to 1 mio distinct values and the optimizer will go for it. Not that elegant, but very efficient :-) and no overhead of histogram administration. Of course this can not be used if this column is needed for other selections as well, which might go the non-selective way..

I would go the "correct" histogram administration way only if I have several tables and statements to be treated this way. Histograms give you additional systemtables that the optimizer needs to consider in all its calculating activity.

Hope this helps
Volker
TwoProc
Honored Contributor

Re: Gather stats (Oracle) question

Volker,

Thanks much for your reply.

I've got Oracle Apps which comes seeded with many histograms on tables, so for the short time, I can assume that the packages which ship with the system for statistics administrations as part of the application take care of the histograms for me. I say that b/c there are about 1000 of them, and I know that in the past when a histogram was created by someone else (no longer here) everyone agreed that it helped the system.

So, if I go after schemas which are part of Oracle Apps, I'm probably OK, but if I go after custom schemas, I've got tackle this part myself.

I'll be asking some folks in the Oracle Apps consulting group if they know for sure.

What I was after was commands to know that custom histograms in custom schemas get updated, and what it took to get it. Of course, I was hoping that it was just the regular analyze command.

Re: the tip on setting the statistic directly; I didn't know that was an option, so I'll be looking it up soon.

As for hinting, I know that I can do that, but that also means I have to find and handle all queries that can use it. If the cost optimizer will pick it up, then it can be used more globally than just what I was working on.

Thanks for the input.
We are the people our parents warned us about --Jimmy Buffett
Jean-Luc Oudart
Honored Contributor

Re: Gather stats (Oracle) question

John,

Could I suggest you go through these documents ?
Note:235101.1 "Oracle Procurement News Volume 3 - May 2003"
Appendix A "TECHNICAL INFORMATION ON HISTOGRAMS"

Note:1031826.6 "Histograms: An Overview"
=> When to Use Histograms
=> When to Not Use Histograms
=> How to Use Histograms

Note:72539.1 " Interpreting Histogram Information "

Regards
Jean-Luc
fiat lux
Volker Borowski
Honored Contributor

Re: Gather stats (Oracle) question

Well,

"setting" the statistcs helps a lot if you have a QA-System with less data than production and you try to track down execution plans on PRD with less data in QA. You select the PRD stats and set them in QA and have a better chance that the behavior is more equal. But you can not set all what influences the optimizer i.e. the index clustering factor might show the optimizer that someone is playing tricks.

If you google around a bit for optimizer and "set statistics" stuff, you'll sure find some presentation-slide of performance-consultants with examples. This is where I got mine (But as said, I am out of office for a while).

I would not go this way (setting stats) in general in production, may be just in single selected cases.

Your described effect that "histogramms helped" might be related to the bug I mentioned! Was that below 9.2.0.6 ?
I doubt that a general calculation of histograms on all tables is of help. They cost in terms of parsing time, because they have to be considered, and select-performance on several systemtables and they are of use only in some special cases, so I would not go and calculate histograms on the entire DB.

Any way good luck in hunting "gets" and "reads"
Volker
TwoProc
Honored Contributor

Re: Gather stats (Oracle) question

Well, after playing with this for some time, I 've found a handful of good queries that have responded well to some histograms, and testing has proven that as Volker said - I need to recalculate my statistics with column specifics to refresh histograms properly, and have put them in jobs to get it done on a regular basis(where needed).

Thanks all for the help, it is much appreciated.
We are the people our parents warned us about --Jimmy Buffett