Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Mon, 19 Jan 2004 @ 17:16:00 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: V2R5 Statistics
 
From:   Victor Sokovin

Craig,

  So in order to make sure that my table has the best statistics collection for adhoc queries to be run, do I need both types of statistics to be collected, or will multi column statistics be adequate?  



These statistics will serve different queries. Theoretically, you could take the largest case (6 columns?) and calculate 6-dim statistics and all possible projections (from 1- to 5-dim), if users often define 1 to 5 WHERE clauses (or columns in joins) in their ad hoc analysis ot batch jobs etc. It will be quite a task to collect them all for a large table:

1-dim: 6 statistics;
2-dim: 15;
3-dim: 20;
4-dim: 15;
5-dim: 6;
6-dim: 1.


You can collect as many as you can afford time-wise (up to 63 statistics listed above); all of them might prove useful in busy (OLAP, data mining ...) applications. If you don't have time for all of them (most people won't) - just collect the multi-column stats for as many columns as the largest number of where clauses (or columns in joins) in the query you care about. You could also add a few most frequently used single-column stats (as in R4). Perhaps you will find your optimum somewhere between these two extreme cases.


Victor



     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023