Re: ROWID based join! [From Ballinger, Carrie: Fri, 12 Aug 2005 @ 19:52 GMT]

Message Posted: Fri, 12 Aug 2005 @ 19:52:25 GMT

Subj:

Re: ROWID based join!

From:

Ballinger, Carrie

2. What is a ROWID based join? - I've come across rowhash based and rowkey based join -- but what is a ROWID based join? I just sifted thru the materials i've got and tried some explain but couldnt find out. Is there such a join - anyone?

Aananth,

A row ID join is the second part of a nested join. The first part, referred to as "nested join" in the explain text, builds a spool file of rowIDs based on joining to the NUSI subtable rows. Those row IDs are input to a subsequent second step which performs the row ID join. The row ID join joins a spool of rowIDs (and other data from upstream)to the actual base table rows.

Here is an example of an explain that illustrates that. The row ID join is in step 5:

explain select * from customer,newline where c_newlinekey = n_newlinekey and n_name = 'SENIOR'

Explanation --------------------------------------------------

1)

First, we lock a distinct TEST."pseudo table" for read on a RowHash to prevent global deadlock for TEST.customer.

2)

Next, we lock TEST.customer for read.

3)

We do a single-AMP RETRIEVE step from TEST.newline by way of the primary index "TEST.newline.N_NAME = 'SENIOR '" with a residual condition of ("TEST.newline.N_NAME = 'SENIOR '") into Spool 2 (all_amps), which is duplicated on all AMPs. Then we do a SORT to order Spool 2 by row hash. The size of Spool 2 is estimated with high confidence to be 20 rows. The estimated time for this step is 0.01 seconds.

4)

We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an all-rows scan, which is joined to TEST.customer by way of a traversal of index # 4 without accessing the base table extracting row ids only. Spool 2 and TEST.customer are joined using a nested join, with a join condition of ("TEST.customer.C_NEWLINEKEY = N_NEWLINEKEY"). The input table TEST.customer will not be cached in memory. The result goes into Spool 3 (all_amps), which is built locally on the AMPs. Then we do a SORT to order Spool 3 by field Id 1. The size of Spool 3 is estimated with low confidence to be 300,000 rows. The estimated time for this step is 6.29 seconds.

5)

We do an all-AMPs JOIN step from Spool 3 (Last Use) by way of an all-rows scan, which is joined to TEST.customer by way of an all-rows scan with no residual conditions. Spool 3 and TEST.customer are joined using a row id join, with a join condition of ("(1=1)"). The input table TEST.customer will not be cached in memory. The result goes into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 300,000 rows. The estimated time for this step is 7.52 seconds.

Attachments

Library

Quick Reference

Archives of the TeradataForum

Message Posted: Fri, 12 Aug 2005 @ 19:52:25 GMT