Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Thu, 16 Aug 2007 @ 15:36:49 GMT


     
  <Prev Next>  
<<First
<Prev
Next> Last>>  


Subj:   Only diff in Tables is PI column definition:
 
From:   Anomy Anom

<-- Anonymously Posted: Thursday, August 16, 2007 09:23 -->

Three tables are identical in every respect except one: the unique PI column is defined as Decimal (9, 0) in two cases, as Integer in the third. (NOTE: column spelling is identical.) In each case the PI is demog_key.

Does Teradata (particularly the Optimizer) treat the PI definitions as being identical? Will joins of the tables generate comparable explains/access_paths? I thought that the DEC vs INT would create some significant differences.

I have done some testing with a simple query (below) using a fairly small table (100M rows)and have provided the Explains which are slightly different. Note: the age_months column always contains NULL. The full table contains 56 columns. No Join Indexes.

Any experience or thoughts with this situation?

I have done some testing with approx 4M row table and found that the Perm Space Usage by AMP is identical (same number of rows and values in columns). Note, stats have been collected on the columns present in the query(s) and are identical.

Query 1, demog_key (the PI column) is defined as Integer and Decimal(9,0) respectively.

     locking row for access
     select aa.demog_key, aa.age_months, bb.demog_key, bb.age_months
       from    d_demog_xxxint     aa
                 ,d_demog_xxxdec9    bb
     where aa.demog_key = bb.demog_key
     and aa.demog_key < 1000
     ;

Explanation

Explanation
--------------------------------------------------
 
  1)First, we lock db01.bb for access, and we lock db01.aa for access.  
  2)Next, we do an all-AMPs RETRIEVE step from db01.aa by way of an all-rows scan with a condition of ("db01.aa.DEMOG_KEY < 1000") into Spool 2 (all_amps) (compressed columns allowed), which is built locally on the AMPs. The size of Spool 2 is estimated with low confidence to be 1,003 rows. The estimated time for this step is 4.65 seconds.  
  3)We do an all-AMPs JOIN step from db01.bb by way of a RowHash match scan with no residual conditions, which is joined to Spool 2 (Last Use) by way of a RowHash match scan. db01.bb and Spool 2 are joined using a merge join, with a join condition of ("DEMOG_KEY = db01.bb.DEMOG_KEY"). The input table db01.bb will not be cached in memory, but it is eligible for synchronized scanning. The result goes into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 1,003 rows. The estimated time for this step is 0.17 seconds.  
  4)Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.  
  ->The contents of Spool 1 are sent back to the user as the result of statement 1. No rows are returned to the user as the result of statement 2. The total estimated time is 4.82 seconds.  


Query 2, both versions of table have demog_key (single column PI) defined as Decimal (9, 0)

     locking row for access
     select aa.demog_key, aa.age_months, bb.demog_key, bb.age_months
       from    d_demog            aa                  /**** demog_key is
     Decimal (9, 0)    ****/
                 ,d_demog_xxxdec9    bb
     where aa.demog_key = bb.demog_key
     and aa.demog_key < 1000
     ;

Explanation

Explanation
--------------------------------------------------
 
  1)First, we lock db01.bb for access, and we lock db01.aa for access.  
  2)Next, we do an all-AMPs RETRIEVE step from db01.aa by way of an all-rows scan with a condition of ("db01.aa.DEMOG_KEY < 1000.") into Spool 2 (all_amps) (compressed columns allowed), which is built locally on the AMPs. The size of Spool 2 is estimated with low confidence to be 18,282 rows. The estimated time for this step is 4.66 seconds.  
  3)We do an all-AMPs JOIN step from db01.bb by way of a RowHash match scan with a condition of ("db01.bb.DEMOG_KEY <1000."), which is joined to Spool 2 (Last Use) by way of a RowHash match scan. db01.bb and Spool 2 are joined using a merge join, with a join condition of ("DEMOG_KEY = db01.bb.DEMOG_KEY"). The input table db01.bb will not be cached in memory, but it is eligible for synchronized scanning. The result goes into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 4,640 rows. The estimated time for this step is 2.98 seconds.  
  4)Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.  
  ->The contents of Spool 1 are sent back to the user as the result of statement 1. No rows are returned to the user as the result of statement 2. The total estimated time is 7.64 seconds.  


Query 3, demog_key is defined as Integer and Decimal(9, 0) respectively.

     locking row for access
     select aa.demog_key, aa.age_months, bb.demog_key, bb.age_months
       from    d_demog_xxxint       aa
                 ,d_demog     bb                    /*** Decimal 9, 0 ) ***/
     where aa.demog_key = bb.demog_key
     and aa.demog_key < 1000

Explanation

Explanation
--------------------------------------------------
 
  1)First, we lock DB01.aa for access, and we lock DB01.bb for access.  
  2)Next, we do an all-AMPs RETRIEVE step from DB01.aa by way of an all-rows scan with a condition of ("DB01.aa.DEMOG_KEY < 1000") into Spool 2 (all_amps) (compressed columns allowed), which is built locally on the AMPs. The size of Spool 2 is estimated with low confidence to be 1,003 rows. The estimated time for this step is 4.65 seconds.  
  3)We do an all-AMPs JOIN step from DB01.bb by way of a RowHash match scan with no residual conditions, which is joined to Spool 2 (Last Use) by way of a RowHash match scan. DB01.bb and Spool 2 are joined using a merge join, with a join condition of ("DEMOG_KEY = DB01.bb.DEMOG_KEY"). The input table DB01.bb will not be cached in memory, but it is eligible for synchronized scanning. The result goes into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 1,003 rows. The estimated time for this step is 0.17 seconds.  
  4)Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.  
  ->The contents of Spool 1 are sent back to the user as the result of statement 1. No rows are returned to the user as the result of statement 2. The total estimated time is 4.82 seconds.  


Note: performing this query via self-joins of the tables generates one of the plans above (expected).

All versions of the table hold 3,648,866 rows and have identical space usage (by total and by amp).

I expected Query 1 and 3 to have identical explains, query 2 to have something different.

As always, thanks for sharing any insights.


Thanks



     
  <Prev Next>  
<<First
<Prev
Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 27 Dec 2016