Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Fri, 02 Jul 2004 @ 11:40:55 GMT


     
  <Prev Next>  
<<First
<Prev
Next> Last>>  


Subj:   Duplication followed by Merge Join
 
From:   David Clough

Can anyone please 'explain' why the following Explain Plan shows a Duplicate on All Amps followed by a MERGE JOIN?

       
  2)Next, we execute the following steps in parallel.  
   
  1) We do an all-AMPs RETRIEVE step from PRODCOE_T.CONENTRY by way of an all-rows scan with a condition of ("PRODCOE_T.CONENTRY.COU_ID_ACC = 'NL '") into Spool 4 (all_amps), which is built locally on the AMPs. Then we do a SORT to order Spool 4 by row hash. The result spool file will not be cached in memory. The size of Spool 4 is estimated with high confidence to be 22,574,687 rows. The estimated time for this step is 5 minutes and 11 seconds.
 
   
  2) We do an all-AMPs RETRIEVE step from PRODRATING_T.PTSCALELVL by way of an all-rows scan with no residual conditions into Spool 5 (all_amps), which is duplicated on all AMPs. Then we do a SORT to order Spool 5 by row hash. The size of Spool 5 is estimated with high confidence to be 314,928 rows. The estimated time for this step is 0.33 seconds.
 
  3)We do an all-AMPs JOIN step from Spool 4 (Last Use) by way of a RowHash match scan, which is joined to Spool 5 (Last Use). Spool 4 and Spool 5 are left outer joined using a merge join, with a join condition of ("(SOS_SEQ_ID = (SOS_SEQ_ID )) AND ((PTS_ID = (PTS_ID )) AND ((PFT_ID = (PFT_ID )) AND ((DIV_ID = (DIV_ID )) AND (COU_ID_ACC = (COU_ID )))))"). The result goes into Spool 3 (all_amps), which is built locally on the AMPs. The size of Spool 3 is estimated with low confidence to be 22,568,805 rows. The estimated time for this step is 12.45 seconds.  


The Primary Indexes on the two tables are as follows:

     PRIMARY INDEX PTSCALELVL_NUPI ( COU_ID ,DIV_ID ,PFT_ID );
     PRIMARY INDEX CONENTRY_NUPI ( COU_ID_INV ,INV_NR ,INR_ID );

i.e. certainly NOT co-located.

Also, the Duplication statistic (22,574,687) is based upon 24 Amps.

Also (again), it's Version V2R5.0.

I can understand a Duplication followed by a Product Join or even a Hash Join, but I thought a Merge join was only applicable for the situation where one (or both) tables have been REDISTRIBUTED, not DUPLICATED.


Thanks.

Dave Clough
Database Designer
Database Design Group



     
  <Prev Next>  
<<First
<Prev
Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023