|
|
Archives of the TeradataForum
Message Posted: Mon, 10 Nov 2003 @ 16:07:20 GMT
Subj: | | Re: Count of Distinct Rows |
|
From: | | Claybourne Barrineau |
Teradata is looking into this. Thanks for everyone's input!
I'm attaching an example of the query and a good and bad explain.
Thanks,
Clay
select t1.cma_no
from CMA_TABLES.cma t1
left outer join
CMA_TABLES.cma_org_select t2
on t1.cma_no = t2.cma_no
where t2.cma_no is null
and t1.cma_status = 'A';
********************************************
Without stats on T2's UPI:
The optimizer decides to redistribute T2 based on T1s UPI (CMA_No)
elapsed time=~2 seconds
********************************************
40 AMPs
Explanation -------------------------------------------------- | |
| 1) | First, we lock a distinct CMA_TABLES."pseudo table" for read on a RowHash to prevent global deadlock for CMA_TABLES.t2.
| |
| 2) | Next, we lock a distinct CMA_TABLES."pseudo table" for read on a RowHash to prevent global deadlock for CMA_TABLES.t1.
| |
| 3) | We lock CMA_TABLES.t2 for read, and we lock CMA_TABLES.t1 for read.
| |
| 4) | We do an all-AMPs RETRIEVE step from CMA_TABLES.t2 by way of an all-rows scan with no residual conditions into Spool 2 (all_amps),
which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 2 by row hash. The size of Spool 2 is estimated with low
confidence to be 3,617,040 rows. The estimated time for this step is 7.62 seconds.
| |
| 5) | We do an all-AMPs JOIN step from CMA_TABLES.t1 by way of a RowHash match scan with a condition of ("CMA_TABLES.t1.CMA_STATUS = 'A'"),
which is joined to Spool 2 (Last Use). CMA_TABLES.t1 and Spool 2 are left outer joined using a merge join, with a join condition of
("CMA_TABLES.t1.CMA_NO = Cma_No"). The result goes into Spool 3 (all_amps), which is built locally on the AMPs. The size of Spool 3 is
estimated with low confidence to be 2,761,961 rows. The estimated time for this step is 0.77 seconds.
| |
| 6) | We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan with a condition of ("Cma_No IS NULL") into Spool
1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 2,761,961 rows. The
estimated time for this step is 1.44 seconds.
| |
| 7) | Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
| |
| -> | The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 9.83 seconds.
| |
********************************************
With stats on T2's UPI (which looks like 1 unique value to the optimizer):
The optimizer decides to duplicate T2 to all amps then join to a T1 amp
locally
elapsed time=~1:55 seconds
********************************************
Explanation -------------------------------------------------- | |
| 1) | First, we lock a distinct CMA_TABLES."pseudo table" for read on a RowHash to prevent global deadlock for CMA_TABLES.t2.
| |
| 2) | Next, we lock a distinct CMA_TABLES."pseudo table" for read on a RowHash to prevent global deadlock for CMA_TABLES.t1.
| |
| 3) | We lock CMA_TABLES.t2 for read, and we lock CMA_TABLES.t1 for read.
| |
| 4) | We execute the following steps in parallel.
| |
| |
| 1) | We do an all-AMPs RETRIEVE step from CMA_TABLES.t1 by way of an all-rows scan with a condition of ("CMA_TABLES.t1.CMA_STATUS =
'A'") into Spool 2 (all_amps), which is built locally on the AMPs. Then we do a SORT to order Spool 2 by row hash. The size of Spool 2 is
estimated with high confidence to be 253,925 rows. The estimated time for this step is 0.76 seconds.
| | |
| |
| 2) | We do an all-AMPs RETRIEVE step from CMA_TABLES.t2 by way of an all-rows scan with no residual conditions into Spool 3 (all_amps),
which is duplicated on all AMPs. Then we do a SORT to order Spool 3 by row hash. The result spool file will not be cached in memory. The
size of Spool 3 is estimated with high confidence to be 144,205,840 rows. The estimated time for this step is 6 minutes and 20 seconds.
| | |
| 5) | We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of a RowHash match scan, which is joined to Spool 3 (Last Use). Spool 2
and Spool 3 are left outer joined using a merge join, with a join condition of ("CMA_NO = Cma_No"). The result goes into Spool 4
(all_amps), which is built locally on the AMPs. The size of Spool 4 is estimated with low confidence to be 2,752,879 rows. The estimated
time for this step is 12.36 seconds.
| |
| 6) | We do an all-AMPs RETRIEVE step from Spool 4 (Last Use) by way of an all-rows scan with a condition of ("Cma_No IS NULL") into Spool
1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 2,752,879 rows. The
estimated time for this step is 1.43 seconds.
| |
| 7) | Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
| |
| -> | The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 6 minutes and 34
seconds.
| |
| |