Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Mon, 11 Apr 2005 @ 21:24:31 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: Has any had success with views that UNION Tables
 
From:   Prescott, Kyle R

V2r5.0 introduced some nice enhancements around this feature. It's called Horizontal Partitioning (another way to divide and conquer the data volumes). You split the tables based on a criteria set and have one view to present the data back to the consumer as one logical set.

The v2r5 difference is the use of constraints on the tables to feed the optimizer clues on how the data is split. If the end-consumer query (using your over-arching view) uses the criteria in the constraint, the optimizer drops the partitions that cannot satisfy the query due to the known constraints.

Example (800 million rows equally distributed by month):

     Create table1
     (Keycol integer
      Txn_date date
      Other_data varchar(100))
     Primary index (keycol);

Can be changed to 4 quarterly tables of 200 million rows each

     Create table table1_q1 as (select * from table1 where extract(month from
     txn_date) between 1 and 3) with data ;
     Create table table1_q2 as (select * from table1 where extract(month from
     txn_date) between 4 and 6) with data primary index (keycol);
     Create table table1_q3 as (select * from table1 where extract(month from
     txn_date) between 7 and 9) with data primary index (keycol);
     Create table table1_q4 as (select * from table1 where extract(month from
     txn_date) between 10 and 12) with data primary index (keycol);

Add the constraints to tell the optimizer about the splits on the tables:

     Alter table table1_q1
        Add constraint check_horiz_part
              check ( extract(month from txn_date) between 1 and 3 );

     Alter table table1_q2
        Add constraint check_horiz_part
              check ( extract(month from txn_date) between 4 and 6 );

     Alter table table1_q3
        Add constraint check_horiz_part
              check ( extract(month from txn_date) between 7 and 9 );

      Alter table table1_q4
        Add constraint check_horiz_part
              check ( extract(month from txn_date) between 10 and 12 );

Create the view to present the data as one logical set:

     create view1 as
         locking row for access
         select * from table1_q1
         union all select * from table1_q2
         union all select * from table1_q3
         union all select * from table1_q4;

IF the constraint is used within the query (example: select * from view1 where extract(month from txn_date) = 4) and the user selects month of 04 for April), then tables q1, q3, and q4 will not be referenced or accessed. The explain shows only a retrieve from table1_q2 since the table-level constraints guarantee the optimizer that only the Q2 table satisfies the query.

In this example, you can save scanning 600 million rows out of the 800 million original rows. You really need to do your homework on the user access plans vs. the table splits in order to get the most optimization.

Also, you can mload/tpump up to 5 tables in one script and pass so the four tables can still be loaded by single-pass process.

Note: For this to work however properly the constraint must resolve to a numeric value (similar to PPI rules).


Kyle Prescott
UnumProvident, Data Strategies



     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023