Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Thu, 03 Aug 2006 @ 17:26:44 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: Recommendations from Statistics Wizard
 
From:   Dieter Noeth

Martin Barrow:

  Does anyone know where is the sampling rate is held, nothing obvious in the dbs control settings?  


It's stored in dbscontrol, but it's one of the Internal fields:

     46.  CollectStatsSample     = 2 (percent)

     46. The recommended sample size used by Collect Statistics when
          run in the sampling mode (USING SAMPLE syntax). Actual size
          may be higher if data skew is detected. Size is specified as
          as a percentage from 1 to 99.

  I presume that the sampling also has an impact on the min and max values that the stats usual gather for the column?  


Those values might change, that's why you should compare full/sample stats (and explains), if there are too many differences...


  For info, the table being used is a 6.5 billion row table and the column having the stats amended is a PPI column for the table.  


Never ever collect sample stats on the partitioning column. Sample stats are not on a SAMPLE, it simply starts a full table scan. But rows in a PPI table are sorted by the partitioning column: e.g. if you partition by date, your table holds 2 years and the sample stops after 2 percent, then only the first 2 weeks where scanned, i've learned that the hard way :-(

At least in V2R6.1 a sample on the pseudo column PARTITION will be silently converted to full stats, but i just checked it on the partitioning column and it's still doing that stupid sample. I'd prefer an error message, disallowing it.


Dieter



     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023