Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Fri, 04 Aug 2006 @ 20:44:10 GMT


     
  <Prev Next>   <<First <Prev
Next>
Last>>
 


Subj:   Re: MSR Vs Union
 
From:   John Graas

TeradataForum wrote:

  Could you please explain me the difference between MSR(multi statement request) and the Union.  


  I want to insert into a table from 3 different queries, should I use MSR or Union (or Union all) to do the job.  


My initial guess was a MSR. But, let's test it:

Note: Into Empty Table

A) Union All (Union has an additional "get rid of duplicates step")
Run time: 2:48

     EXPLAIN for 2 steps:
        4) We do an all-AMPs RETRIEVE step from lineitem by way of an
           all-rows scan with a condition of ("lineitem.L_LINENUMBER =
           1") into Spool 1 (all_amps), which is built locally on the AMPs.
           The input table will not be cached in memory, but it is eligible
           for synchronized scanning.  The size of Spool 1 is estimated with
           no confidence to be 3,001,418 rows.  The estimated time for this
           step is 26.65 seconds.
        5) We do an all-AMPs RETRIEVE step from lineitem by way of an
           all-rows scan with a condition of ("lineitem.L_LINENUMBER =
           2") into Spool 1 (all_amps), which is built locally on the AMPs.
           The input table will not be cached in memory, but it is eligible
           for synchronized scanning.  The size of Spool 1 is estimated with
           no confidence to be 6,002,836 rows.  The estimated time for this
           step is 26.65 seconds.
        6) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by way of
           an all-rows scan into Spool 2 (all_amps), which is built locally
           on the AMPs.  Then we do a SORT to order Spool 2 by row hash.  The
           result spool file will not be cached in memory.  The size of Spool
           2 is estimated with no confidence to be 6,002,836 rows.  The
           estimated time for this step is 39.39 seconds.
        7) We do an all-AMPs MERGE into lineitem_copy from Spool 2
           (Last Use).

B) MSR
Run time: 2:03

     EXPLAIN for 2 steps:
        4) We do an all-AMPs RETRIEVE step from lineitem by way of an
           all-rows scan with a condition of ("lineitem.L_LINENUMBER =
           1") into Spool 1 (all_amps), which is built locally on the AMPs.
           The input table will not be cached in memory, but it is eligible
           for synchronized scanning.  The size of Spool 1 is estimated with
           no confidence to be 3,001,418 rows.  The estimated time for this
           step is 26.65 seconds.
        5) We do an all-AMPs RETRIEVE step from lineitem by way of an
           all-rows scan with a condition of ("lineitem.L_LINENUMBER =
           2") into Spool 1 (all_amps), which is built locally on the AMPs.
           The input table will not be cached in memory, but it is eligible
           for synchronized scanning.  The size of Spool 1 is estimated with
           no confidence to be 3,001,418 rows.  The estimated time for this
           step is 26.65 seconds.
        6) We do a SORT to order Spool 1 by row hash.
        7) We do an all-AMPs MERGE into lineitem_copy from Spool 1
           (Last Use).

Thoughts: looks like that additional redistribution of SPOOL 1 to SPOOL 2 is the deal breaker.

Note: using a populated table as a target resulted in run times of 3:45 and 2:48 for Union and MSR, respectively. This means that both are taking advantage of the no TJ for an empty target table.


jdg

www.jgraas.com



     
  <Prev Next>   <<First <Prev
Next>
Last>>
 
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023