 

Archives of the TeradataForum

Message Posted: Wed, 04 Aug 2004 @ 07:22:26 GMT


     


Subj:   Re: Sampling of Pages
 
From:   Stephan Ewen

Hi,

I am not quite sure about the terminology, and "pages" may be the wrong term. I am quite new to Teradata and come from a DB2 background. By "page" I meant the unit that is read from disk in one I/O operation and that usually maps to one sector on the disk. Large tables naturally fill many pages. Because data might be clustered on the pages, a statistically valid sample has to consider more or less all pages, which is why obtaining samples that are statistically valid (i.e. robust to chi-square and spectral tests) does not perform very well. A great reduction in I/Os is possible if you sample the pages instead, but the sample obtained is not always representative. Does this correspond to proportional sampling? As for the AMPs, that sounds like query parallelism to me, rather than something at the system or physical level.
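
To make the trade-off I am describing concrete, here is a small Python sketch. It is only a toy simulation, not how Teradata or DB2 implements sampling: the "table" is a hypothetical list of pages, the data is deliberately clustered so that each page holds a contiguous run of values, and the function names are my own invention.

```python
import random


def build_clustered_table(num_pages, rows_per_page):
    """Hypothetical table: page p holds the contiguous values
    p*rows_per_page .. (p+1)*rows_per_page - 1, i.e. fully clustered data."""
    return [
        [p * rows_per_page + r for r in range(rows_per_page)]
        for p in range(num_pages)
    ]


def row_level_sample(table, sample_size, rng):
    """Pick individual rows uniformly at random.
    Returns the sample and the number of distinct pages that had to be read."""
    rows_per_page = len(table[0])
    total_rows = len(table) * rows_per_page
    picked = rng.sample(range(total_rows), sample_size)
    pages_read = {i // rows_per_page for i in picked}
    sample = [table[i // rows_per_page][i % rows_per_page] for i in picked]
    return sample, len(pages_read)


def page_level_sample(table, sample_size, rng):
    """Pick whole pages at random and take their rows until the sample is full.
    Returns the sample and the number of pages read."""
    rows_per_page = len(table[0])
    pages_needed = -(-sample_size // rows_per_page)  # ceiling division
    page_ids = rng.sample(range(len(table)), pages_needed)
    sample = [value for p in page_ids for value in table[p]][:sample_size]
    return sample, pages_needed


rng = random.Random(42)  # fixed seed so repeated runs behave the same
table = build_clustered_table(num_pages=1000, rows_per_page=100)

row_sample, row_io = row_level_sample(table, 1000, rng)
page_sample, page_io = page_level_sample(table, 1000, rng)

print("row-level sample:  pages read =", row_io)
print("page-level sample: pages read =", page_io)
# Number of distinct clusters (pages) represented in each sample.
print("clusters covered by row sample: ", len({v // 100 for v in row_sample}))
print("clusters covered by page sample:", len({v // 100 for v in page_sample}))
```

Under these assumptions the page-level sample needs only as many page reads as it takes to hold the requested rows, while the row-level sample touches a large share of the table. The price is that the page-level sample only ever sees a handful of clusters, which is the "not always representative" problem.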

Please correct me if I am wrong; as I mentioned, I am new ;)


Thanks,

Stephan



     
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023