|
|
Archives of the TeradataForum
Message Posted: Mon, 26 Sep 2005 @ 15:06:28 GMT
Subj: | | Redestributing large tables instead of replicating a small table |
|
From: | | Nomula, Madhukar |
I know this question is asked before, but I did not see the answer.
I am trying to join a large 40 million row table with a small 2600 row table. But the explain plan is re-distributing the large table instead
replicating the small table on all amps.
Here is the query:
SELECT * FROM
CONTACT C
LEFT JOIN
LOCATION D
ON
C.LOC_ID = D.LOC_ID;
The LOC_ID is the NUPI on LOCATION table. The stats are collected on both columns (C.LOC_ID and D.LOC_ID), which can be seen with high
confidence of the optimizer. But still here is explain plan. Isn't it cheaper to replicate the small table.
Explanation -------------------------------------------------- | |
| 1) | First, we lock a distinct BIGB."pseudo table" for read on a RowHash to prevent global deadlock for BIGB.C.
| |
| 2) | Next, we lock a distinct BIGB."pseudo table" for read on a RowHash to prevent global deadlock for BIGB.D.
| |
| 3) | We lock BIGB.C for read, and we lock BIGB.D for read.
| |
| 4) | We do an all-AMPs RETRIEVE step from BIGB.C by way of an all-rows scan with no residual conditions into Spool 2 (all_amps), which is
redistributed by hash code to all AMPs. The input table will not be cached in memory, but it is eligible for synchronized scanning. The result
spool file will not be cached in memory. The size of Spool 2 is estimated with high confidence to be 40,323,499 rows. The estimated time for
this step is 1 minute and 35 seconds.
| |
| 5) | We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an all-rows scan, which is joined to BIGB.D by way of an all-rows scan with
no residual conditions. Spool 2 and BIGB.D are left outer joined using a product join, with a join condition of ("LOC_ID = BIGB.D.LOC_ID"). The
result goes into Spool 1 (group_amps), which is built locally on the AMPs. The result spool file will not be cached in memory. The size of Spool
1 is estimated with index join confidence to be 80,644,379 rows. The estimated time for this step is 1 minute and 32 seconds.
| |
| 6) | Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
| |
| -> | The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 3 minutes and 7 seconds.
| |
Thanks,
Madhukar.
| |