|
|
Archives of the TeradataForum
Message Posted: Fri, 12 Aug 2005 @ 19:52:25 GMT
Subj: | | Re: ROWID based join! |
|
From: | | Ballinger, Carrie |
| 2. What is a ROWID based join? - I've come across rowhash based and rowkey based join -- but what is a ROWID based join? I just sifted thru
the materials i've got and tried some explain but couldnt find out. Is there such a join - anyone? | |
Aananth,
A row ID join is the second part of a nested join. The first part, referred to as "nested join" in the explain text, builds a spool file of
rowIDs based on joining to the NUSI subtable rows. Those row IDs are input to a subsequent second step which performs the row ID join. The row
ID join joins a spool of rowIDs (and other data from upstream)to the actual base table rows.
Here is an example of an explain that illustrates that. The row ID join is in step 5:
explain
select * from customer,newline
where c_newlinekey = n_newlinekey
and n_name = 'SENIOR'
Explanation -------------------------------------------------- | |
| 1) | First, we lock a distinct TEST."pseudo table" for read on a RowHash to prevent global deadlock for TEST.customer.
| |
| 2) | Next, we lock TEST.customer for read.
| |
| 3) | We do a single-AMP RETRIEVE step from TEST.newline by way of the primary index "TEST.newline.N_NAME = 'SENIOR '" with a residual condition
of ("TEST.newline.N_NAME = 'SENIOR '") into Spool 2 (all_amps), which is duplicated on all AMPs. Then we do a SORT to order Spool 2 by row hash.
The size of Spool 2 is estimated with high confidence to be 20 rows. The estimated time for this step is 0.01 seconds.
| |
| 4) | We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an all-rows scan, which is joined to TEST.customer by way of a traversal of
index # 4 without accessing the base table extracting row ids only. Spool 2 and TEST.customer are joined using a nested join, with a join
condition of ("TEST.customer.C_NEWLINEKEY = N_NEWLINEKEY"). The input table TEST.customer will not be cached in memory. The result goes into
Spool 3 (all_amps), which is built locally on the AMPs. Then we do a SORT to order Spool 3 by field Id 1. The size of Spool 3 is estimated with
low confidence to be 300,000 rows. The estimated time for this step is 6.29 seconds.
| |
| 5) | We do an all-AMPs JOIN step from Spool 3 (Last Use) by way of an all-rows scan, which is joined to TEST.customer by way of an all-rows scan
with no residual conditions. Spool 3 and TEST.customer are joined using a row id join, with a join condition of ("(1=1)"). The input table
TEST.customer will not be cached in memory. The result goes into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1
is estimated with low confidence to be 300,000 rows. The estimated time for this step is 7.52 seconds.
| |
| |