Archives of the TeradataForum
Message Posted: Thu, 03 Aug 2006 @ 17:26:44 GMT
Subj: | | Re: Recommendations from Statistics Wizard |
|
From: | | Dieter Noeth |
Martin Barrow:
| Does anyone know where is the sampling rate is held, nothing obvious in the dbs control settings? | |
It's stored in dbscontrol, but it's one of the Internal fields:
46. CollectStatsSample = 2 (percent)
46. The recommended sample size used by Collect Statistics when
run in the sampling mode (USING SAMPLE syntax). Actual size
may be higher if data skew is detected. Size is specified as
as a percentage from 1 to 99.
| I presume that the sampling also has an impact on the min and max values that the stats usual gather for the column? | |
Those values might change, that's why you should compare full/sample stats (and explains), if there are too many differences...
| For info, the table being used is a 6.5 billion row table and the column having the stats amended is a PPI column for the
table. | |
Never ever collect sample stats on the partitioning column. Sample stats are not on a SAMPLE, it simply starts a full table scan. But rows in
a PPI table are sorted by the partitioning column: e.g. if you partition by date, your table holds 2 years and the sample stops after 2 percent,
then only the first 2 weeks where scanned, i've learned that the hard way :-(
At least in V2R6.1 a sample on the pseudo column PARTITION will be silently converted to full stats, but i just checked it on the partitioning
column and it's still doing that stupid sample. I'd prefer an error message, disallowing it.
Dieter
|