Archives of the TeradataForum

Message Posted: Fri, 02 Jul 2004 @ 11:40:55 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Duplication followed by Merge Join

From:		David Clough

Can anyone please 'explain' why the following Explain Plan shows a Duplicate on All Amps followed by a MERGE JOIN?

2) Next, we execute the following steps in parallel.

1) We do an all-AMPs RETRIEVE step from PRODCOE_T.CONENTRY by way of an all-rows scan with a condition of ("PRODCOE_T.CONENTRY.COU_ID_ACC = 'NL '") into Spool 4 (all_amps), which is built locally on the AMPs. Then we do a SORT to order Spool 4 by row hash. The result spool file will not be cached in memory. The size of Spool 4 is estimated with high confidence to be 22,574,687 rows. The estimated time for this step is 5 minutes and 11 seconds.

2) We do an all-AMPs RETRIEVE step from PRODRATING_T.PTSCALELVL by way of an all-rows scan with no residual conditions into Spool 5 (all_amps), which is duplicated on all AMPs. Then we do a SORT to order Spool 5 by row hash. The size of Spool 5 is estimated with high confidence to be 314,928 rows. The estimated time for this step is 0.33 seconds.

3) We do an all-AMPs JOIN step from Spool 4 (Last Use) by way of a RowHash match scan, which is joined to Spool 5 (Last Use). Spool 4 and Spool 5 are left outer joined using a merge join, with a join condition of ("(SOS_SEQ_ID = (SOS_SEQ_ID )) AND ((PTS_ID = (PTS_ID )) AND ((PFT_ID = (PFT_ID )) AND ((DIV_ID = (DIV_ID )) AND (COU_ID_ACC = (COU_ID )))))"). The result goes into Spool 3 (all_amps), which is built locally on the AMPs. The size of Spool 3 is estimated with low confidence to be 22,568,805 rows. The estimated time for this step is 12.45 seconds.

The Primary Indexes on the two tables are as follows:

     PRIMARY INDEX PTSCALELVL_NUPI ( COU_ID ,DIV_ID ,PFT_ID );
     PRIMARY INDEX CONENTRY_NUPI ( COU_ID_INV ,INV_NR ,INR_ID );

i.e. certainly NOT co-located.

Also, the Duplication statistic (22,574,687) is based upon 24 Amps.

Also (again), it's Version V2R5.0.

I can understand a Duplication followed by a Product Join or even a Hash Join, but I thought a Merge join was only applicable for the situation where one (or both) tables have been REDISTRIBUTED, not DUPLICATED.

Thanks.

Dave Clough
Database Designer
Database Design Group


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference