Home Page for the TeradataForum

Archives of the TeradataForum

Message Posted: Sat, 09 Jul 2005 @ 13:31:10 GMT


Subj:   Re: Question regarding Selection limits
From:   Henderson, John

Michael, Sanjaya and Shritypriya,

Thank you all for your input. After I had sent the email I spent some additional time thinking about the problem, and it finally dawned on me - I should use a controlled Cartesian Product. Here's a sample of what I finally came up with:

     /* Gather initial audience */
     Insert into dbs1.tbl1
     Select col1, col2, col3 from dbs1.tbl2
     Where col3 = 'X';

     /* Back-fill to upper limit of required number of records */
     Insert into dbs1.tbl1
     Select col1, col2, col3
     from dbs1.tbl2 a
     join (select count(*) as row_count from dbs1.tbl1) b  /* Used in Qualify
     statement below */
     on b.row_count >= 0  /* Use > 0 if you want this query to return no rows
     when the original query failed or had zero results */
     Where col1 not in (select col1 from dbs1.tbl1)  /* Not already selected */
     Qualify (csum(1, col1) + b.row_count) <= 500000;

Explanation: The second insert performs a join that results in a Cartesian product, because there are no common columns between the outer table (a) and the sub-select (b). That places the row count of the originally selected group on each result row in spool. The Qualify clause then adds that existing row count to a cumulative sum over the new result set, ensuring the total number of rows never exceeds the upper limit needed for the back-fill. My next step is to incorporate a Random function within the set of queries and use that column, instead of one of the data columns, as the ordering column in the cumulative sum, so that the back-filled results in the second select can be truly "random".
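To make the capping logic concrete, here is a small sketch of the same idea in Python rather than Teradata SQL: an existing row count plus a running count over ordered candidates, cut off at the limit. The table contents, the helper name back_fill, and the small limit are all illustrative assumptions; only the logic (the NOT IN filter and the csum-plus-row_count cap) mirrors the queries above.

```python
def back_fill(initial, candidates, limit):
    """Return the initial rows plus back-filled candidates, capped at `limit` rows.

    Rows are (col1, col2, col3) tuples; col1 acts as the key, as in the SQL.
    """
    selected = list(initial)
    existing = {row[0] for row in initial}   # col1 values already selected
    row_count = len(selected)                # plays the role of b.row_count

    # The SQL orders the cumulative sum by col1 via CSUM(1, col1); a random
    # key could be substituted here to make the back-fill random, as the
    # post suggests doing next.
    for row in sorted(candidates, key=lambda r: r[0]):
        if row[0] in existing:               # WHERE col1 NOT IN (...)
            continue
        if row_count + 1 > limit:            # QUALIFY (csum + row_count) <= limit
            break
        selected.append(row)
        row_count += 1
    return selected

initial = [(1, "a", "X"), (2, "b", "X")]
candidates = [(2, "b", "Y"), (3, "c", "Y"), (4, "d", "Y"), (5, "e", "Y")]
print(back_fill(initial, candidates, 4))
# -> [(1, 'a', 'X'), (2, 'b', 'X'), (3, 'c', 'Y'), (4, 'd', 'Y')]
```

The key point, as in the SQL, is that the cap applies to the combined total (existing rows plus new rows), so the back-fill automatically shrinks to zero if the earlier steps already reached the limit.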

I actually tried this out in a set of queries I had to run yesterday afternoon, and it works quite nicely. In one particular set, my customer asked me to first pull one group of records, add a second group, and then back-fill with random records up to 625K rows. I ended up using the second (back-fill) query for both the 2nd and 3rd groups needed to fill the order, since pulling the second group alone could have pushed the result set past the 625K upper limit - in which case the random back-fill would not have been needed at all. Worked like a charm.



Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023