Archives of the TeradataForum
Message Posted: Tue, 24 Aug 2004 @ 12:24:37 GMT
Subj: Re: Getting a random sample (sample or mod)
From: Victor Sokovin
| One of my colleagues needs to get a random sample of data from a transaction line level table (1 row per item bought per transaction) from
| transactions made the previous week.
| Should he use the sample command? (I've heard this is very inefficient). My suggestion was to use the MOD command on transaction number to
| get a sample, but he didn't seem too keen on this (I think he was worried that it may bias the sample somehow, although I can't personally see how
| this would happen).
Do you mean MOD as in modulo, i.e., you take, for example, every 100th transaction (transaction number mod 100)? If so, that method is too simple
to bear comparison with the one Teradata uses for SAMPLE. Geoffrey Rommel once explained what that one was:
www.teradataforum.com/teradata/20030220_133246.htm
If you are concerned about the statistical quality of your samples, go for the built-in SAMPLE. It might be slow at times, but it is likely to be
faster than a home-grown alternative, and you can rely on the quality.
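
To make the bias concern concrete, here is a small Python sketch (not Teradata code) of one way a MOD-based sample could go wrong. The numbering scheme is purely an assumption for illustration: it supposes the last two digits of the transaction number identify the till. Nothing in the thread says the real table works this way, and the names below are hypothetical.

```python
import random
from collections import Counter

# Hypothetical numbering scheme (an assumption, not from the thread):
# transaction number = sequence * 100 + till_id, so the last two digits
# identify the till that rang up the sale.
TILLS = range(20)
transactions = [seq * 100 + till for seq in range(5000) for till in TILLS]

def mod_sample(txns, n):
    """Keep every transaction whose number is divisible by n."""
    return [t for t in txns if t % n == 0]

def random_sample(txns, fraction, seed=0):
    """Draw a simple random sample without replacement."""
    rng = random.Random(seed)
    return rng.sample(txns, int(len(txns) * fraction))

def tills_seen(sample):
    """Count how many sampled transactions came from each till."""
    return Counter(t % 100 for t in sample)

mod_tills = tills_seen(mod_sample(transactions, 100))
rnd_tills = tills_seen(random_sample(transactions, 0.01))

print("MOD 100 sample covers tills:", sorted(mod_tills))
print("Random 1% sample covers tills:", sorted(rnd_tills))
```

Under this assumed scheme the MOD 100 filter only ever picks up till 0, and it returns 5% of the rows rather than the 1% one might expect, because only 20 of the 100 possible remainders occur. If the transaction numbers carry no such structure, MOD may well behave acceptably, but you would have to verify that yourself.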
Regards,
Victor