Archives of the TeradataForum
Message Posted: Sat, 21 Jul 2007 @ 15:00:42 GMT
Subj: | | Re: Collecting statistics and 'no confidence' |
|
From: | | Anomy Anom |
<-- Anonymously Posted: Saturday, July 21, 2007 08:57 -->
It's generally not enough to have stats on the join columns, you need stats for other predicates as well to get 'high confidence'. Think about
it... if I say 'WHERE c3 > 17' then how many rows will be selected? Maybe 20%? All of them? None of them? Without stats on c3, you have no
clue.
No, you can't collect stats on spool. Teradata does try to 'carry forward' any stats info that is still relevant, though confidence level would
generally be 'low'.
Roughly speaking, 'high' means applicable stats were used and the optimizer thinks the estimate is as good as the stats. (Teradata trusts that
if you bother to collect stats, they are kept accurate and up to date - but that's another story.) 'Low' means an educated guess indirectly based
on some stats (but not the stats the optimizer really wanted to use because those were not collected). 'No' means some rule of thumb was applied
and/or a random sample of data had to be extrapolated as the starting point. Often this works OK if your data is fairly evenly distributed and has
a large number of distinct values, otherwise it should be a red flag.
|