Archives of the TeradataForum

Message Posted: Fri, 15 Sep 2006 @ 09:38:02 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Re: Which method is faster?

From:		Michael Larkins

This is one of the areas I address often. DISTINCT hashs and sends every value as it is read out of the table. Then the values are sorted on the AMPs and duplicates are then eliminated.

GROUP BY does aggregation on each AMP to eliminate duplicates. Then, hashs and sends all the values that are left. After redistribution, each AMP with values does the aggregation again (at most one per AMP) to eliminate the duplicates.

Therefore, if the table contains many duplicate values, GROUP BY is more efficient. However, if the table has no or very few duplicates, DISTINCT is actually faster because it does not spend the time looking for duplicates that don't exist and then redistribute all the values. That is time that DISTINCT does not waste because it redistributes the values immediate.

So, it depends on your data which one is more efficient.

Regards,

Michael Larkins
Certified Teradata Master
Certified Teradata SQL Instructor


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Archives

2016		2007
2015		2006
2014		2005
2013		2004
2012		2003
2011		2002
2010		2001
2009		2000
2008		1999

2006 Indexes

Jan		Jul
Feb		Aug
Mar		Sep
Apr		Oct
May		Nov
Jun		Dec

Last Modified: 15 Jun 2023