

Archives of the TeradataForum
Message Posted: Fri, 11 Feb 2005 @ 21:20:29 GMT
Subj:   Re: Guidelines for using Distinct vs. group by 

From:   Dieter Noeth 
David A Hough wrote:
 * Does the SQL standard call for a DISTINCT in the subquery, or is it left to the implementers?  
SQL Standard usually tells about expected results, but not how to achieve it.
 * Is the DISTINCT versus GROUP BY implementation difference on the Teradata unique to this platform (perhaps because the data is
partitioned on many nodes)?  
AFAIK most DBMSes show the same plan for both versions...
 * Does anyone have benchmark results for DISTINCT versus GROUP BY performance on the Teradata for different result set
percentages?  
Table populated with random values (10,000,000 rows, 4 AMPs),
CPU Time from DBQL:
Distinct CPU Time CPU Time
Values Distinct Group By
6320690 94,59 207,59
999949 90,61 133,17
100000 81,83 66,56
10000 77,47 16,00
1000 74,08 13,47
100 69,30 12,81
10 52,50 12,44
1 45,94 11,45
10000 38,92 11,52
The last result is for NUPIcolumns
I would be glad to see results for other (large) systems :)
create multiset table distinct_test
(x1 int,
x10 int,
x100 int,
x1000 int,
x10000 int,
x100000 int,
x100000nupi int,
x1000000 int,
x10000000 int
) primary index (x100000nupi)
;
insert into distinct_test
select
1,
r / 100000,
r / 10000,
r / 1000,
r / 100,
r / 10,
random(0,1000000) as r,
r,
random(2000000000,2000000000)
from
(select 1 as x from sys_calendar.calendar
sample with replacement 10000000) dt;
sel count(*) from (sel distinct x10000000 from distinct_test) dt;
sel count(*) from (sel x10000000 from distinct_test group by 1) dt;
sel count(*) from (sel distinct x1000000 from distinct_test) dt;
sel count(*) from (sel x1000000 from distinct_test group by 1) dt;
sel count(*) from (sel distinct x100000 from distinct_test) dt;
sel count(*) from (sel x100000 from distinct_test group by 1) dt;
sel count(*) from (sel distinct x10000 from distinct_test) dt;
sel count(*) from (sel x10000 from distinct_test group by 1) dt;
sel count(*) from (sel distinct x1000 from distinct_test) dt;
sel count(*) from (sel x1000 from distinct_test group by 1) dt;
sel count(*) from (sel distinct x100 from distinct_test) dt;
sel count(*) from (sel x100 from distinct_test group by 1) dt;
sel count(*) from (sel distinct x10 from distinct_test) dt;
sel count(*) from (sel x10 from distinct_test group by 1) dt;
sel count(*) from (sel distinct x1 from distinct_test) dt;
sel count(*) from (sel x1 from distinct_test group by 1) dt;
sel count(*) from (sel distinct x100000nupi from distinct_test) dt;
sel count(*) from (sel x100000nupi from distinct_test group by 1) dt;
Dieter
 