Home Page for the TeradataForum

Archives of the TeradataForum

Message Posted: Sat, 09 Jul 2005 @ 13:31:10 GMT


Subj:   Re: Question regarding Selection limits
From:   Henderson, John

Michael, Sanjaya and Shritypriya,

Thank you all for your input. After I had sent the email I spent some additional time thinking about the problem, and it finally dawned on me - I should use a controlled Cartesian Product. Here's a sample of what I finally came up with:

     /* Gather initial audience */
     Insert into dbs1.tbl1
     Select col1, col2, col3 from dbs1.tbl2
     Where col3 = 'X';

     /* Back-fill to upper limit of required number of records */
     Insert into dbs1.tbl1
     Select col1, col2, col3
     from dbs1.tbl2 a
     join (select count(*) as row_count from dbs1.tbl1) b  /* Used in Qualify
     statement below */
     on b.row_count >= 0  /* Use > 0 if you want this query to return no rows
     when the original query failed or had zero results */
     Where col1 not in (select col1 from dbs1.tbl1)  /* Not already selected */
     Qualify (csum(1, col1) + b.row_count) <= 500000;

Explanation: The second insert performs a join that results in a Cartesian product, because there are no common columns between the outer table (a) and the sub-select (b). That places the row count of the originally selected group on each result row in spool. The Qualify clause then adds that existing row count to a cumulative sum over the new result set, ensuring the total number of rows never exceeds the upper limit needed for the back-fill. My next step is to incorporate a Random function within the set of queries and use that column, instead of one of the data columns, as the ordering column in the cumulative sum, so that the back-filled results in the second select can be truly "random".
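To make the capping logic concrete, here is a small sketch of the same idea in Python rather than Teradata SQL: an existing row count plus a running count over ordered candidates, cut off at the limit. The table contents, the helper name back_fill, and the small limit are all illustrative assumptions; only the logic (the NOT IN filter and the csum-plus-row_count cap) mirrors the queries above.

```python
def back_fill(initial, candidates, limit):
    """Return the initial rows plus back-filled candidates, capped at `limit` rows.

    Rows are (col1, col2, col3) tuples; col1 acts as the key, as in the SQL.
    """
    selected = list(initial)
    existing = {row[0] for row in initial}   # col1 values already selected
    row_count = len(selected)                # plays the role of b.row_count

    # The SQL orders the cumulative sum by col1 via CSUM(1, col1); a random
    # key could be substituted here to make the back-fill random, as the
    # post suggests doing next.
    for row in sorted(candidates, key=lambda r: r[0]):
        if row[0] in existing:               # WHERE col1 NOT IN (...)
            continue
        if row_count + 1 > limit:            # QUALIFY (csum + row_count) <= limit
            break
        selected.append(row)
        row_count += 1
    return selected

initial = [(1, "a", "X"), (2, "b", "X")]
candidates = [(2, "b", "Y"), (3, "c", "Y"), (4, "d", "Y"), (5, "e", "Y")]
print(back_fill(initial, candidates, 4))
# -> [(1, 'a', 'X'), (2, 'b', 'X'), (3, 'c', 'Y'), (4, 'd', 'Y')]
```

The key point, as in the SQL, is that the cap applies to the combined total (existing rows plus new rows), so the back-fill automatically shrinks to zero if the earlier steps already reached the limit.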

I actually tried this out in a set of queries I had to run yesterday afternoon, and it works quite nicely. In one particular set, my customer asked me to first pull one group of records, add a second group, and then back-fill with random records up to 625K rows. I ended up using the second (back-fill) query for both the 2nd and 3rd groups needed to fill the order, since pulling the second group alone could have pushed the result set past the 625K upper limit - in which case the random back-fill would not have been needed at all. Worked like a charm.



Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023