Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Wed, 06 Mar 2002 @ 13:54:29 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: What information does 'collect statistics' store?
 
From:   Geoffrey Rommel

  When statistics are collected on a column does it store the actual number of rows for each individual column value, or does it store an average number of rows per value?  


It stores the actual number of rows for certain values (including the most frequently occurring value within each percentile), but not for every value.


  e.g. A table has a date column which we have collected statistics on and we then load a new week's data to the table. If we run a query on the table against the new week's data do the statistics say that there will be no rows in the table and create an execution plan accordingly, or will it say that there is an average number of rows and create a different execution plan?  


Neither. I have encountered this very situation. The stats will include the maximum value for the date, so the optimizer will know that according to the stats there are no rows for your new week, but because it cannot rely completely on the stats it will almost certainly do a full-table scan.

In our case, we had a table of over 400GB, and adding a new month's worth of data increased its size by only 1.9%, so the full-table scan was a Bad Thing. The solution was to add "collect stats" to the monthly load job so that the stats on the date column would always be up to date.


--wgr



     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023