Archives of the TeradataForum

Message Posted: Thu, 11 Jan 2001 @ 14:49:50 GMT

Subj:   Re: Limited Select
From:   Eric J. Kohut


As it has been explained to me, the Sample function selects the sample rows using the combination of the Master Index and Cylinder Index until it has met the number of rows or the percentage requested, and then takes only those rows from the table. So the work to do a sample should be much less than a full scan. In my short tests, the work to do a sample on a single table was much less than counting the rows of that table, and vastly less than scanning the rows of that table and then limiting the rows using retlimit and retcancel.
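For illustration, here is a minimal sketch of the alternatives compared above. The table name sales_detail is hypothetical, and the relative cost will depend on your release and data:

```sql
-- Sample by row count: return 1000 randomly chosen rows.
SELECT *
FROM sales_detail
SAMPLE 1000;

-- Sample by fraction: return roughly 1 percent of the rows.
SELECT *
FROM sales_detail
SAMPLE 0.01;

-- For comparison: counting the rows of the same table.
SELECT COUNT(*)
FROM sales_detail;
```

The retlimit / retcancel alternative is set in BTEQ (for example .SET RETLIMIT 1000 and .SET RETCANCEL ON) before running a plain SELECT with no SAMPLE clause.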

Also, to limit the rows in a join, you should do the sample in a derived table and then join the result of the derived table to the other table. This should restrict the rows before joining.
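A sketch of the derived-table approach, assuming two hypothetical tables, sales_detail and store, joined on a hypothetical store_id column:

```sql
-- Take the sample inside the derived table, so only the sampled
-- rows take part in the join.
SELECT s.store_id,
       st.store_name,
       s.sale_amt
FROM (
        SELECT store_id, sale_amt
        FROM sales_detail
        SAMPLE 1000
     ) AS s
INNER JOIN store AS st
        ON st.store_id = s.store_id;
```

Putting the SAMPLE clause on the outer query instead would sample the join result, which means the full join is done first.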

By the way, the function uses a random number generator to decide which rows it takes from which AMPs. Since the data is usually spread randomly across the AMPs (possibly not with a very non-unique NUPI), the sample process yields very random results. In a test on a table with a UPI, the randomness of the results met the strict requirements of several statisticians at the retailer where I am currently working.

Eric J. Kohut
Senior Solutions Consultant - Teradata Solutions Group - Retail
NCR Corp.
