Archives of the TeradataForum

Message Posted: Tue, 24 Aug 2004 @ 12:24:37 GMT

Subj:   Re: Getting a random sample (sample or mod)
From:   Victor Sokovin

  One of my colleagues needs to get a random sample of data from a transaction line level table (1 row per item bought per transaction) from transactions made the previous week.  

  Should he use the SAMPLE command? (I've heard this is very inefficient.) My suggestion was to use the MOD command on transaction number to get a sample, but he didn't seem too keen on this (I think he was worried that it may bias the sample somehow, although I can't personally see how this would happen).

Do you mean MOD as in modulo, i.e., you take, for example, every 100th transaction (transaction number MOD 100)? If so, this method is really too simple to even be compared with the one used in Teradata. Geoffrey Rommel once explained how that one works:

www.teradataforum.com/teradata/20030220_133246.htm

If you are concerned about the statistical quality of your samples, go for the built-in SAMPLE. It might be slow at times, but it is likely to be faster than a home-grown alternative, and you can rely on the quality.
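
To make the worry about bias concrete, here is a small Python sketch (not Teradata SQL, and not how SAMPLE is implemented). It assumes a purely hypothetical numbering scheme in which stores take transaction numbers in turn, so the transaction number modulo the number of stores identifies the store. The names make_transactions, mod_sample and random_sample are made up for this illustration, and Python's random.sample merely stands in for a proper random sampler.

```python
import random


def make_transactions(n_txns=10_000, n_stores=10):
    # Hypothetical scheme: stores take transaction numbers in turn,
    # so txn_no % n_stores is the store that made the transaction.
    return [{"txn_no": t, "store": t % n_stores} for t in range(n_txns)]


def mod_sample(rows, k=100):
    # "Every k-th transaction": keep rows whose number is a multiple of k.
    return [r for r in rows if r["txn_no"] % k == 0]


def random_sample(rows, fraction=0.01, seed=42):
    # Simple random sample without replacement (stand-in for a real sampler).
    rng = random.Random(seed)
    return rng.sample(rows, int(len(rows) * fraction))


def stores_seen(rows):
    # Distinct stores represented in a sample, in ascending order.
    return sorted({r["store"] for r in rows})


rows = make_transactions()
mod_rows = mod_sample(rows)
rand_rows = random_sample(rows)

print("MOD sample:   ", len(mod_rows), "rows, stores", stores_seen(mod_rows))
print("random sample:", len(rand_rows), "rows, stores", stores_seen(rand_rows))
```

Under that assumption both samples contain 100 rows, but the MOD sample only ever sees store 0, because every multiple of 100 is also a multiple of 10. Whether real transaction numbers carry a pattern like this depends entirely on how they are generated, which is exactly why a sampler that does not depend on the key values is the safer choice.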



Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023