Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Fri, 12 Aug 2005 @ 19:52:25 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: ROWID based join!
 
From:   Ballinger, Carrie

  2. What is a ROWID based join? - I've come across rowhash based and rowkey based join -- but what is a ROWID based join? I just sifted thru the materials i've got and tried some explain but couldnt find out. Is there such a join - anyone?  


Aananth,

A row ID join is the second part of a nested join. The first part, referred to as "nested join" in the explain text, builds a spool file of rowIDs based on joining to the NUSI subtable rows. Those row IDs are input to a subsequent second step which performs the row ID join. The row ID join joins a spool of rowIDs (and other data from upstream)to the actual base table rows.

Here is an example of an explain that illustrates that. The row ID join is in step 5:

     explain
     select * from customer,newline
     where c_newlinekey = n_newlinekey
     and n_name = 'SENIOR'
Explanation
--------------------------------------------------
 
  1)First, we lock a distinct TEST."pseudo table" for read on a RowHash to prevent global deadlock for TEST.customer.  
  2)Next, we lock TEST.customer for read.  
  3)We do a single-AMP RETRIEVE step from TEST.newline by way of the primary index "TEST.newline.N_NAME = 'SENIOR '" with a residual condition of ("TEST.newline.N_NAME = 'SENIOR '") into Spool 2 (all_amps), which is duplicated on all AMPs. Then we do a SORT to order Spool 2 by row hash. The size of Spool 2 is estimated with high confidence to be 20 rows. The estimated time for this step is 0.01 seconds.  
  4)We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an all-rows scan, which is joined to TEST.customer by way of a traversal of index # 4 without accessing the base table extracting row ids only. Spool 2 and TEST.customer are joined using a nested join, with a join condition of ("TEST.customer.C_NEWLINEKEY = N_NEWLINEKEY"). The input table TEST.customer will not be cached in memory. The result goes into Spool 3 (all_amps), which is built locally on the AMPs. Then we do a SORT to order Spool 3 by field Id 1. The size of Spool 3 is estimated with low confidence to be 300,000 rows. The estimated time for this step is 6.29 seconds.  
  5)We do an all-AMPs JOIN step from Spool 3 (Last Use) by way of an all-rows scan, which is joined to TEST.customer by way of an all-rows scan with no residual conditions. Spool 3 and TEST.customer are joined using a row id join, with a join condition of ("(1=1)"). The input table TEST.customer will not be cached in memory. The result goes into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 300,000 rows. The estimated time for this step is 7.52 seconds.  



     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023