Archives of the TeradataForum

Message Posted: Mon, 29 Nov 2010 @ 21:17:01 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Re: Sampled statistics in a production environment

From:		Ballinger, Carrie

JK, you asked:

On TD13.00, is there a 'guide-line' for stats on more non-unique columns like there was prior to 13 for mostly unique columns - i.e. 90- 95%? For example, it's good to consider when data is at least x% unique - a lower level? We're on TD13.00 and would like to investigate/try out.

I don't believe there are any guidelines. The bottom line is: are you getting good query plans? You may be able to get the same plans you are getting now with full stats by collecting with USING SAMPLE at some lower percent. What that percent will need to be is trial and error, or an educated guess. My suggestion is try USING SAMPLE then look at the HELP STATS output and compare the number of distinct values against what you saw in the HELP STATS with full statistics collected. See how close you are and see how the plans look.

I don't believe that the percent unique is a factor in T 13.0 USING SAMPLE, at least in my experience. If the values of your statistic are relatively evenly distributed, you should do well with USING SAMPLE whether they are almost unique or very non-unique. What you need to be careful about is if there is a lot of skew on the values of the statistic. I haven't done enough investigation to be able to tell you what degree of skew is tolerable, but that is where I expect USING SAMPLE in T 13 to be less useful.

Thanks, -Carrie


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference