Archives of the TeradataForum
Message Posted: Tue, 30 Jul 2002 @ 13:58:51 GMT
I have a case where a hash join is efficient than a merge join. The problem is, in this case, that the optimizer always wants to perform a merge join. Yes...I have collected all the necessary statistics. My system has 110 amps.
*Note = Logical Primary Key (ColumnA, ColumnB, ColumnC, ColumnD)
Average Record Count Per Amp = ~30,000
TableB= ( ColumnA Integer Not Null ColumnE ColumnF ... ) NUPI (ColumnA)
*Note = Logical Primary Key (ColumnA, ColumnE, ColumnF)
Average Record Count Per Amp = ~7,000
Example of Query =
Given the information that the Optimizer has to work with, I cannot fault its decision to perform a merge join; however, as it turns out, the resulting work load when performing a merge join is highly skewed.
Merge Join - ~# of Joins
Via changes to the DDL of Table A (UPI = Logical Primary Key) and Table B (NUPI = New Column with Value of '1' for every record), I can force a hash join.
Hash Join - ~# of Joins
The hash join query runs faster and the workload on the box is much more evenly distributed (in terms of Spool size and CPU usage).
Does anybody think that it is even possible for the Optimizer to be smart enough to realize that a hash join (in this case) is more reasonable than a merge join?
|Copyright 2016 - All Rights Reserved|
|Last Modified: 28 Jun 2020|