- Community Home
- >
- Servers and Operating Systems
- >
- Operating Systems
- >
- Operating System - HP-UX
- >
- Density nonsense
Categories
Company
Local Language
Forums
Discussions
Forums
- Data Protection and Retention
- Entry Storage Systems
- Legacy
- Midrange and Enterprise Storage
- Storage Networking
- HPE Nimble Storage
Discussions
Forums
Discussions
Discussions
Discussions
Forums
Discussions
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
- BladeSystem Infrastructure and Application Solutions
- Appliance Servers
- Alpha Servers
- BackOffice Products
- Internet Products
- HPE 9000 and HPE e3000 Servers
- Networking
- Netservers
- Secure OS Software for Linux
- Server Management (Insight Manager 7)
- Windows Server 2003
- Operating System - Tru64 Unix
- ProLiant Deployment and Provisioning
- Linux-Based Community / Regional
- Microsoft System Center Integration
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Discussion Boards
Community
Resources
Forums
Blogs
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-24-2005 02:31 AM
10-24-2005 02:31 AM
Density nonsense
On DSS, large table 840 000 000 records.
I know that there is 2 distinct value for the field BUSINESS_UNIT reparted on a 93% / 7% basis.
Histograms have been calculated with
execute dbms_stats.gather_table_stats(
ownname => user,
tabname => 'T1',
granularity=>'GLOBAL',
estimate_percent => 0.05,
block_sample => TRUE,
method_opt => 'FOR COLUMNS BUSINESS_UNIT SIZE 2',
degree => 6);
select TABLE_NAME,COLUMN_NAME,DENSITY
from USER_TAB_COL_STATISTICS
where table_name='T1'
and column_name='BUSINESS_UNIT';
TA COLUMN_NAME DENSITY
-- ------------------------------ ------------------------------
T1 BUSINESS_UNIT .000000000616598840794179
>>>>>>>>>>>>> density * num_rows
SQL> select 6.1660E-10 * 842276500 from dual;
6.1660E-10*842276500
-------------------------
.5193476899
Metalink :
The density is expressed as a decimal number between 0 and 1.
Values close to 1 indicate that this column is unselective
Values close to 0 indicate that this column is highly selective
The more selective a column, the less rows are likely to be returned by a query referencing this column in its predicate list.
I don't get it ?!?
The field seems to be very selective according to this statistic, but in fact, it is not. Giving a certain value, user get 97% of all records !
SQL> select *
from user_tab_histograms
where table_name='T1'
and column_name='BUSINESS_UNIT';
TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALU
------------------------------ --------------- --------------- -------------- --------------------
T1 BUSINESS_UNIT 7269 3.6963E+35
TMEURST BUSINESS_UNIT 8109 3.8002E+35
Any insight, guess, idea welcomed.
Cheers
Nicolas
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-24-2005 02:38 AM
10-24-2005 02:38 AM
Re: Density nonsense
Rod Hills
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-24-2005 02:44 AM
10-24-2005 02:44 AM
Re: Density nonsense
Welcome back !
As you only have 2 values, why not using bitmap index ?
Regards
Jean-Luc
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-24-2005 03:05 AM
10-24-2005 03:05 AM
Re: Density nonsense
The table is compressed. In order to get the best compression we have to sort the staging table before making direct path insert.
Index maintenance is a cost I'm not willing to pay if it can be avoided (unusable + DBMS_PCLXUTIL is not an option).
1) There 4 other indexes on table
2) One specific BUSINESS_UNIT represents 97% of the data ... and 97% of the query ;) So the index wouldn't be of much help here.
--> how do you interpret the statistics ? That was my point.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-24-2005 03:09 AM
10-24-2005 03:09 AM
Re: Density nonsense
This is a database question in a the "Databases category" thread ... so what's with you ???
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-24-2005 03:12 AM
10-24-2005 03:12 AM
Re: Density nonsense
"I know that there is 2 distinct value for the field BUSINESS_UNIT reparted on a 93% / 7% basis."
So you have 840,000,000 * 0.93 = 781,200,000 rows with one value and 840,000,000 * 0.07 = 58,800,000 rows with another value?
Maybe you should consider creating another table (MT1?) only with those 2 rows that would be referenced by by a table T1 Foreign Key:
create table MT1 (
MT1_ID NUMBER NOT NULL,
BUSINESS_UNIT ...)
You'll need to recreate table T1 (saving all rows to a T1_TEMP temporary table, re-creating T1 and re-loading records from T1_TEMP with MT1_ID but without the BUSINESS_UNIT)...
That's just an ideia...
Best Regards,
Eric Antunes
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-24-2005 03:26 AM
10-24-2005 03:26 AM
Re: Density nonsense
Thanks for you proposition but I fail to understand the point a rebuild the table ... beside loosing a saturday ;)
Again, my question is a on the interpretation of the Oracle stat.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-24-2005 03:43 AM
10-24-2005 03:43 AM
Re: Density nonsense
My point is the same as Oracle point when they created those 2 tables (for example) for Inventory:
MTL_MATERIAL_TRANSACTIONS
MTL_SECONDARY_INVENTORIES
When I want to know all the transactions of one particular inventory it is much more easy and responsive than if I hadn't the MTL_SECONDARY_INVENTORIES table and respective indexes...
Best Regards,
Eric Antunes
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-24-2005 03:52 AM
10-24-2005 03:52 AM
Re: Density nonsense
Regarding the sample (0.05%) would you really expect getting relevant information back from Oracle ?
Regards
Jean-Luc
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-24-2005 04:01 AM
10-24-2005 04:01 AM
Re: Density nonsense
SQL> select blocks * 0.05 from tabs where table_name='T1';
BLOCKS*0.05
-----------
2657667
This is 20 GB sample, not so bad ...
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-24-2005 04:14 AM
10-24-2005 04:14 AM
Re: Density nonsense
During batch, the FK would be most painful, and again, what could possibly be the gain ?
Referencial constraint serve no purpose here.
The index on a 2 value table neither.
Thanks for your help anyway.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-24-2005 04:19 AM
10-24-2005 04:19 AM
Re: Density nonsense
But compared to the overall size of your table, you wonder about the density "nonsense", I would think that the sample size speaks for itself.
The overall distribution is 93% / 7% for 840M records. But 420,000 would you expect this distribution to be uniform ?
From the overview on histograms :
When to Not Use Histograms
--------------------------
Also, be aware that histograms, as well as all optimizer statistics, are
static. If the data distribution of a column changes frequently, it is
necessary to recompute the histogram for a given column. Histograms are not
useful for columns with the following characteristics:
o all predicates on the column use bind variables
o the column data is uniformly distributed
o the column is not used in WHERE clauses of queries
o the column is unique and is used only with equality predicates
Regards
Jean-Luc
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-24-2005 07:40 PM
10-24-2005 07:40 PM
Re: Density nonsense
For what I read in the statistics in 20 GB data sample, Oracle have found 0.51 record with a specific value and that is a nonesense.
And the histograms is studpid as weel with value of a such magnitude 10E35 !
Furthermore, I have 4 others columns with the same kind of crap ...
Thanks all for your post.