|
|
Archives of the TeradataForum
Message Posted: Sat, 12 Feb 2005 @ 11:25:42 GMT
Subj: | | Re: Guidelines for using Distinct vs. group by |
|
From: | | Dieter Noeth |
Michael Larkins wrote:
| Today, the optimizer eliminates duplicates without eliminating dups because it is more efficient. The Inclusion merge join means simply
that when the join step finds the first match, it stops looking for a secondary match for the table(s) referenced in the subquery. It would then
move to the next row from the main table and compare it to the previous matching row from the subquery values. This is the difference between a
merge join and an inclusion merge join - eliminates dups without actually doing a redistribution to eliminate dup values. Pretty
ingenious!! | |
Of course, but only for subqueries on the PI.
Just explain a non-PI subquery:
select * from job
where job_code in
(select job_code from employee);
4) We do an all-AMPs RETRIEVE step from CUSTOMER_SERVICE.employee by
way of an all-rows scan with no residual conditions into Spool 2
(all_amps), which is redistributed by hash code to all AMPs. Then
we do a SORT to order Spool 2 by row hash and the sort key in
spool field1 eliminating duplicate rows. The size of Spool 2 is
estimated with high confidence to be 26 rows. The estimated time
for this step is 0.01 seconds.
Dieter
| |