|
|
Archives of the TeradataForum
Message Posted: Wed, 02 Oct 2002 @ 17:52:15 GMT
Subj: | | Re: Left join problem |
|
From: | | Burton, Bruce |
It looks like Teradata uses a hash join with the 'bad' query but a merge join with the 'good' query. Not sure why Teradata decided to
use the different methods for the queries...I assume something with the indexing but not sure. Does the 'good' query using sub-selects
strip off index info?
Here are the explains from the two queries:
Query that works correctly:
---------------------------
Explanation -------------------------------------------------- | |
| 1) | First, we lock a distinct mktg_temp."pseudo table" for read on a RowHash to prevent global deadlock for mktg_temp.p065_snet_input.
| |
| 2) | Next, we lock mktg_temp.p065_snet_input for read, and we lock CMDMDB.Static for access.
| |
| 3) | We execute the following steps in parallel.
| |
| |
| 1) | We do an all-AMPs RETRIEVE step from mktg_temp.p065_snet_input by way of an all-rows scan with no residual conditions into Spool 2,
which is built locally on the AMPs. The size of Spool 2 is estimated with low confidence to be 236,880 rows. The estimated time for this
step is 0.22 seconds.
| | |
| |
| 2) | We do an all-AMPs RETRIEVE step from CMDMDB.Static by way of an all-rows scan with no residual conditions into Spool 1, which is
built locally on the AMPs. The size of Spool 1 is estimated with high confidence to be 26,605,513 rows. The estimated time for this step
is 51.89 seconds.
| | |
| 4) | We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by way of an all-rows scan into Spool 4, which is redistributed by hash code
to all AMPs. Then we do a SORT to order Spool 4 by row hash. The size of Spool 4 is estimated with high confidence to be 26,605,513 rows.
The estimated time for this step is 1 minute and 27 seconds.
| |
| 5) | We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of a RowHash match scan., which is joined to Spool 4 (Last Use). Spool 2
and Spool 4 are left outer joined using a merge join, with a join condition of ("ACCT2 = ACCT"). The result goes into Spool 3, which is
built locally on the AMPs. The size of Spool 3 is estimated with no confidence to be 73,019,005 to 6,302,313,919,440 rows. The estimated
time for this step is 40.78 seconds to 704 hours and 11 minutes.
| |
| 6) | Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
| |
| -> | The contents of Spool 3 are sent back to the user as the result of statement 1. The total estimated time is 2 minutes and 59
seconds.
| |
Query that *does not* work correctly:
-------------------------------------
Explanation -------------------------------------------------- | |
| 1) | First, we lock a distinct mktg_temp."pseudo table" for read on a RowHash to prevent global deadlock for mktg_temp.t1.
| |
| 2) | Next, we lock mktg_temp.t1 for read, and we lock CMDMDB.Static for access.
| |
| 3) | We do an all-AMPs RETRIEVE step from mktg_temp.t1 by way of an all-rows scan with no residual conditions into Spool 2 fanned out into
6 hash join partitions, which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 2 by row hash. The size of Spool
2 is estimated with no confidence to be 236,880 rows.
| |
| 4) | We do an all-AMPs RETRIEVE step from Spool 2 by way of an all-rows scan into Spool 3 fanned out into 6 hash join partitions, which is
duplicated on all AMPs. The size of Spool 3 is estimated with low confidence to be 66,326,400 rows. The estimated time for this step is 1
minute and 16 seconds.
| |
| 5) | We do an all-AMPs RETRIEVE step from CMDMDB.Static by way of an all-rows scan with no residual conditions into Spool 4 fanned out
into 6 hash join partitions, which is built locally on the AMPs. The size of Spool 4 is estimated with high confidence to be 26,605,513
rows. The estimated time for this step is 1 minute and 40 seconds.
| |
| 6) | We do an all-AMPs JOIN step from Spool 3 (Last Use) by way of an all-rows scan, which is joined to Spool 4 (Last Use). Spool 3 and
Spool 4 are joined using a hash join of 6 partitions, with a join condition of ("acct = acct"). The result goes into Spool 5, which is
redistributed by hash code to all AMPs. Then we do a SORT to order Spool 5 by row hash. The size of Spool 5 is estimated with low
confidence to be 316,810 rows. The estimated time for this step is 1 minute and 39 seconds.
| |
| 7) | We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of a RowHash match scan., which is joined to Spool 5 (Last Use). Spool 2
and Spool 5 are left outer joined using a merge join of 6 partitions, with a join condition of ("Field_1 = Field_1"). The result goes into
Spool 1, which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 316,810 rows. The estimated time
for this step is 0.16 seconds.
| |
| 8) | Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
| |
| -> | The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 4 minutes and 35
seconds.
| |
| |