|
|
Archives of the TeradataForum
Message Posted: Mon, 29 Nov 2010 @ 21:17:01 GMT
Subj: | | Re: Sampled statistics in a production environment |
|
From: | | Ballinger, Carrie |
JK, you asked:
| On TD13.00, is there a 'guide-line' for stats on more non-unique columns like there was prior to 13 for mostly unique columns - i.e. 90-
95%? For example, it's good to consider when data is at least x% unique - a lower level? We're on TD13.00 and would like to investigate/try
out. | |
I don't believe there are any guidelines. The bottom line is: are you getting good query plans? You may be able to get the same plans you are
getting now with full stats by collecting with USING SAMPLE at some lower percent. What that percent will need to be is trial and error, or an
educated guess. My suggestion is try USING SAMPLE then look at the HELP STATS output and compare the number of distinct values against what you
saw in the HELP STATS with full statistics collected. See how close you are and see how the plans look.
I don't believe that the percent unique is a factor in T 13.0 USING SAMPLE, at least in my experience. If the values of your statistic are
relatively evenly distributed, you should do well with USING SAMPLE whether they are almost unique or very non-unique. What you need to be
careful about is if there is a lot of skew on the values of the statistic. I haven't done enough investigation to be able to tell you what degree
of skew is tolerable, but that is where I expect USING SAMPLE in T 13 to be less useful.
Thanks, -Carrie
| |