Archives of the TeradataForum

Message Posted: Fri, 04 Aug 2006 @ 20:44:10 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Re: MSR Vs Union

From:		John Graas

TeradataForum wrote:

Could you please explain me the difference between MSR(multi statement request) and the Union.

I want to insert into a table from 3 different queries, should I use MSR or Union (or Union all) to do the job.

My initial guess was a MSR. But, let's test it:

Note: Into Empty Table

A) Union All (Union has an additional "get rid of duplicates step")
Run time: 2:48

     EXPLAIN for 2 steps:
        4) We do an all-AMPs RETRIEVE step from lineitem by way of an
           all-rows scan with a condition of ("lineitem.L_LINENUMBER =
           1") into Spool 1 (all_amps), which is built locally on the AMPs.
           The input table will not be cached in memory, but it is eligible
           for synchronized scanning.  The size of Spool 1 is estimated with
           no confidence to be 3,001,418 rows.  The estimated time for this
           step is 26.65 seconds.
        5) We do an all-AMPs RETRIEVE step from lineitem by way of an
           all-rows scan with a condition of ("lineitem.L_LINENUMBER =
           2") into Spool 1 (all_amps), which is built locally on the AMPs.
           The input table will not be cached in memory, but it is eligible
           for synchronized scanning.  The size of Spool 1 is estimated with
           no confidence to be 6,002,836 rows.  The estimated time for this
           step is 26.65 seconds.
        6) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by way of
           an all-rows scan into Spool 2 (all_amps), which is built locally
           on the AMPs.  Then we do a SORT to order Spool 2 by row hash.  The
           result spool file will not be cached in memory.  The size of Spool
           2 is estimated with no confidence to be 6,002,836 rows.  The
           estimated time for this step is 39.39 seconds.
        7) We do an all-AMPs MERGE into lineitem_copy from Spool 2
           (Last Use).

B) MSR
Run time: 2:03

     EXPLAIN for 2 steps:
        4) We do an all-AMPs RETRIEVE step from lineitem by way of an
           all-rows scan with a condition of ("lineitem.L_LINENUMBER =
           1") into Spool 1 (all_amps), which is built locally on the AMPs.
           The input table will not be cached in memory, but it is eligible
           for synchronized scanning.  The size of Spool 1 is estimated with
           no confidence to be 3,001,418 rows.  The estimated time for this
           step is 26.65 seconds.
        5) We do an all-AMPs RETRIEVE step from lineitem by way of an
           all-rows scan with a condition of ("lineitem.L_LINENUMBER =
           2") into Spool 1 (all_amps), which is built locally on the AMPs.
           The input table will not be cached in memory, but it is eligible
           for synchronized scanning.  The size of Spool 1 is estimated with
           no confidence to be 3,001,418 rows.  The estimated time for this
           step is 26.65 seconds.
        6) We do a SORT to order Spool 1 by row hash.
        7) We do an all-AMPs MERGE into lineitem_copy from Spool 1
           (Last Use).

Thoughts: looks like that additional redistribution of SPOOL 1 to SPOOL 2 is the deal breaker.

Note: using a populated table as a target resulted in run times of 3:45 and 2:48 for Union and MSR, respectively. This means that both are taking advantage of the no TJ for an empty target table.

jdg

www.jgraas.com


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference