Archives of the TeradataForum

Message Posted: Mon, 05 Mar 2007 @ 20:02:43 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Re: Performance of a self join

From:		Curley, David

I wouldn't call that a classic self join - you're summarizing A and then joining to that summary (I'm assuming that A is also in the select list, not just the derived table B). As with any join, if it's not on the PI, it's going to require redistribution or duplication somewhere along the line.

In your example, you mightn't need the derived table at all. Since all it's doing is summarizing by dim1, dim2, dim3, you can probably push all that into the OLAP function. You don't say how you're calculating "Value 2", but something like this might work:

     Select A.*,
     A.value1/ (sum(A.metric1) over (partition by A.dim1, A.dim2, A.dim3)
                /
                sum(A.metric2) over (partition by A.dim1, A.dim2, A.dim3))
     >From A

That should be faster, but it won't be _that_ fast unless A has a PI of (dim1, dim2, dim3). If it doesn't, it'll have to redistribute the table in order to get all rows in each partition on the same AMP. (I'd expect that Teradata would recognize if an OLAP partition clause matches the PI and so not attempt to redistribute, but I don't know if that's true - can anybody else confirm that?) It also may or may not work if you add additional criteria to the where clause, it depends on the population of rows you want to use to calculate the denominator.

If this is for a tactical application or will be run frequently, you might want to look at using a join index to physicalize all or part of the work the query has to do.

Dave


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference