|
|
Archives of the TeradataForum
Message Posted: Mon, 19 Jan 2004 @ 17:16:00 GMT
Subj: | | Re: V2R5 Statistics |
|
From: | | Victor Sokovin |
Craig,
| So in order to make sure that my table has the best statistics collection for adhoc queries to be run, do I need both types of statistics
to be collected, or will multi column statistics be adequate? | |
These statistics will serve different queries. Theoretically, you could take the largest case (6 columns?) and calculate 6-dim statistics
and all possible projections (from 1- to 5-dim), if users often define 1 to 5 WHERE clauses (or columns in joins) in their ad hoc analysis ot
batch jobs etc. It will be quite a task to collect them all for a large table:
1-dim: 6 statistics;
2-dim: 15;
3-dim: 20;
4-dim: 15;
5-dim: 6;
6-dim: 1.
You can collect as many as you can afford time-wise (up to 63 statistics listed above); all of them might prove useful in busy (OLAP,
data mining ...) applications. If you don't have time for all of them (most people won't) - just collect the multi-column stats for as many
columns as the largest number of where clauses (or columns in joins) in the query you care about. You could also add a few most frequently used
single-column stats (as in R4). Perhaps you will find your optimum somewhere between these two extreme cases.
Victor
| |