Archives of the TeradataForum

Message Posted: Thu, 03 Aug 2006 @ 15:19:47 GMT




Subj:   Re: Recommendations from Statistics Wizard
 
From:   John Graas

TeradataForum wrote:

  Does anyone know where the sampling rate is held? There is nothing obvious in the DBS Control settings.  


  I presume that the sampling also has an impact on the min and max values that the stats usually gather for the column?  


  For info, the table being used is a 6.5 billion row table and the column having the stats amended is a PPI column for the table.  


Per the SQL Reference manual,

Indexed: starts at 2 percent, but is increased dynamically to ensure the accuracy of the statistics if significant skew is detected in the data. For large tables, the sample size rarely exceeds 50 percent.

Non-indexed: is fixed at approximately 2 percent.


And, if I read the manual correctly, PPI single column statistics ignore the SAMPLE option.

As for the previous recommendations, the reason to use SAMPLE with COLLECT STATISTICS (CS) is to keep run times manageable. Don't use it for small tables: it doesn't buy you anything, and the statistics will be less accurate because so few rows end up in the sample. SAMPLE against large tables is more accurate because the sample itself is large, statistically speaking. :-)
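
To put a rough number on the statistical point, here is a small Python sketch. It is only an illustration and not how Teradata actually samples: the table sizes, the uniformly distributed column, and the helper name spread_of_sampled_mean are all made up for the example, and the only figure taken from the manual is the roughly 2 percent rate. It draws repeated 2 percent samples from a small and a large table and measures how much the sampled mean moves around from one sample to the next.

```python
import random
import statistics

SAMPLE_FRACTION = 0.02  # the roughly 2 percent rate quoted from the manual


def spread_of_sampled_mean(row_count, trials=100, seed=0):
    """Return the standard deviation of the sampled mean over repeated samples."""
    rng = random.Random(seed)
    # Hypothetical column: uniformly distributed values with no skew.
    column = [rng.random() for _ in range(row_count)]
    sample_size = max(1, int(row_count * SAMPLE_FRACTION))
    # Draw a fresh sample each trial and record the mean it produces.
    estimates = [
        statistics.fmean(rng.sample(column, sample_size))
        for _ in range(trials)
    ]
    return statistics.stdev(estimates)


small_spread = spread_of_sampled_mean(1_000)    # 20 sampled rows per trial
large_spread = spread_of_sampled_mean(200_000)  # 4,000 sampled rows per trial

print(f"small table, spread of sampled mean: {small_spread:.4f}")
print(f"large table, spread of sampled mean: {large_spread:.4f}")
```

The spread for the small table should come out many times larger than for the large one, because the same percentage yields far fewer rows. Real column data with skew would behave worse than this uniform example, which is presumably why the manual describes the indexed sample size as growing when skew is detected.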


jdg

www.jgraas.com



Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023