Archives of the TeradataForum

Message Posted: Sat, 12 Feb 2005 @ 11:25:42 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Re: Guidelines for using Distinct vs. group by

From:		Dieter Noeth

Michael Larkins wrote:

Today, the optimizer eliminates duplicates without eliminating dups because it is more efficient. The Inclusion merge join means simply that when the join step finds the first match, it stops looking for a secondary match for the table(s) referenced in the subquery. It would then move to the next row from the main table and compare it to the previous matching row from the subquery values. This is the difference between a merge join and an inclusion merge join - eliminates dups without actually doing a redistribution to eliminate dup values. Pretty ingenious!!

Of course, but only for subqueries on the PI.

Just explain a non-PI subquery:

     select * from job
     where job_code in
       (select job_code from employee);

        4) We do an all-AMPs RETRIEVE step from CUSTOMER_SERVICE.employee by
           way of an all-rows scan with no residual conditions into Spool 2
           (all_amps), which is redistributed by hash code to all AMPs.  Then
           we do a SORT to order Spool 2 by row hash and the sort key in
           spool field1 eliminating duplicate rows.  The size of Spool 2 is
           estimated with high confidence to be 26 rows.  The estimated time
           for this step is 0.01 seconds.

Dieter


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference