Archives of the TeradataForum
Message Posted: Fri, 02 Jul 2004 @ 11:40:55 GMT
| Subj: || || Duplication followed by Merge Join |
| From: || || David Clough |
Can anyone please 'explain' why the following Explain Plan shows a Duplicate on All Amps followed by a MERGE JOIN?
| || || || |
| ||2)||Next, we execute the following steps in parallel.
| || |
| || 1) ||We do an all-AMPs RETRIEVE step from PRODCOE_T.CONENTRY by way of an all-rows scan with a condition of ("PRODCOE_T.CONENTRY.COU_ID_ACC =
'NL '") into Spool 4 (all_amps), which is built locally on the AMPs. Then we do a SORT to order Spool 4 by row hash. The result spool file will
not be cached in memory. The size of Spool 4 is estimated with high confidence to be 22,574,687 rows. The estimated time for this step is 5
minutes and 11 seconds.
| || |
| || 2) ||We do an all-AMPs RETRIEVE step from PRODRATING_T.PTSCALELVL by way of an all-rows scan with no residual conditions into Spool 5
(all_amps), which is duplicated on all AMPs. Then we do a SORT to order Spool 5 by row hash. The size of Spool 5 is estimated with high
confidence to be 314,928 rows. The estimated time for this step is 0.33 seconds.
| ||3)||We do an all-AMPs JOIN step from Spool 4 (Last Use) by way of a RowHash match scan, which is joined to Spool 5 (Last Use). Spool 4 and
Spool 5 are left outer joined using a merge join, with a join condition of ("(SOS_SEQ_ID = (SOS_SEQ_ID )) AND ((PTS_ID = (PTS_ID )) AND ((PFT_ID =
(PFT_ID )) AND ((DIV_ID = (DIV_ID )) AND (COU_ID_ACC = (COU_ID )))))"). The result goes into Spool 3 (all_amps), which is built locally on the
AMPs. The size of Spool 3 is estimated with low confidence to be 22,568,805 rows. The estimated time for this step is 12.45 seconds.
The Primary Indexes on the two tables are as follows:
PRIMARY INDEX PTSCALELVL_NUPI ( COU_ID ,DIV_ID ,PFT_ID );
PRIMARY INDEX CONENTRY_NUPI ( COU_ID_INV ,INV_NR ,INR_ID );
i.e. certainly NOT co-located.
Also, the Duplication statistic (22,574,687) is based upon 24 Amps.
Also (again), it's Version V2R5.0.
I can understand a Duplication followed by a Product Join or even a Hash Join, but I thought a Merge join was only applicable for the situation
where one (or both) tables have been REDISTRIBUTED, not DUPLICATED.
Database Design Group