https:  ## Message Posted: Wed, 25 Apr 2012 @ 13:57:56 GMT  < Last>>

 Subj: Re: Mutli-Column Stats From: Tewksbury, Kevin

This question was posted some time ago on the masters list and followed up by the response from Carrie. I follow these guidelines put forth from Carrie. I re-read this all of the time to make sure that I get the best stats possible. Hope this helps.

KRT

----------------------------------------

Subject: masters-team Interesting questions on COLLECT STATS

Hi There,

I need some important inputs on the way we collect statistics. Below is the example

I have 4 tables, each with a single column Primary Index (assume all have same data volume ~ 5 million rows). Below are the tables and their PIs

```     Table A (PI=x1)
Table B (PI=y1)
Table C (PI=y1)
Table D (PI=x2)
```

And, here is the query

```     Select A.x1..., B..., C...., D...
from
A, B, C, D
where
A.x1=b.x1
and
A.x2=D.x2
and
A.x3=100
and
A.x4='BMW'
and
B.y1=C.y1
and
b.y2 in ('SFO','LAX')
and
C.z1 = 11
and
C.z2 = '*'
and
D.r1 <> 'PAK'
and
D.r2 = '\$135000';
```

Now, the queries follows

1. In the above query, what is the best way to collect stats? Is it on all columns individually? Like..

```     STATS ON A (x1)
STATS ON A (x2)
STATS ON A (x3) ....
```

2. Is it good to collect stats on combination of joining columns and filter columns separately as given below?

Assume, we have stats on PI already

```     STATS ON A (x1,x2)
STATS ON A (x3,x4)
STATS ON B (y2)
STATS ON C (z1,z2)
STATS on D (r1,r2)
STATS on D (x2) ...
```

3. Or, combination of all columns of a table is good?

```     STATS ON A (x1,x2,x3,x4)
STATS ON B (y1,y2)
STATS on C (z1,z2,y1)
STATS on D (x2,r1,r2)
```

4. Since there is an inequality condition on D.r1 (D.r1 <> 'PAK'), how the stats should be?

5. Incase conditions the columns D.r1 and D.r2 are ORed together, then how the stats should be?

```     ...
and
(
D.r1 <> 'PAK'
OR
D.r2 = '\$135000'
)
```

In short, Is it always good to collect stats on individual columns or combination of columns based on joining and filter conditions? In case of later, what is the best way of grouping the columns?

Regards,

Hari

-------------------------------

Carrie's response to this:

collecting stats - multi column or not

I'd suggest you use the option #2 below, for most, but not all of the examples given. See immediatley below for comments:

2. Is it good to collect stats on combination of joining columns and filter columns separately as given below?

Assume, we have stats on PI already

STATS ON A (x1,x2) -- Don't collect. Multi-column stat won't be used in this case, each column is used in a different join

STATS ON A (x3,x4) -- Yes, both are selection columns in equality condition, multi-column stats are good for correlation between the two

STATS ON B (y2) -- Yes, represents a single column constraint for table B. No multi-column advantage

STATS ON C (z1,z2) -- Yes, both columns expressed in equal condition in the query, multi-column stat will be used.

STATS on D (r1,r2) -- Don't collect. Column r1 used in non-equal condition, therefore the multi-column stat will not be used.

STATS on D (x2) -- Yes, single join column on table D

Look at the explain before and after you add a new statistics to see if the collection on the new statistics made a difference, if the estimate is closer to the actual row count. The real answer to whether or not specific statistics will help comes from recognizing if the plan got better with the stats and query times improved.

In general, queries with multiple columns that frequently appear together as selection criteria or join criteria, and for which values are expressed with an equality operator, will benefit from multi-column statistics.

Thanks, -Carrie  < Last>>

Attachments

Library

Quick Reference

Archives

Sample Index

 2016 2007 2015 2006 2014 2005 2013 2004 2012 2003 2011 2002 2010 2001 2009 2000 2008 1999

2012 Indexes

 Jan Jul Feb Aug Mar Sep Apr Oct May Nov Jun Dec

 Top Home Privacy Feedback 