Archives of the TeradataForum
Message Posted: Mon, 09 May 2005 @ 08:56:32 GMT
Sometimes I wonder why a badly skewed join runs that badly. The situation arises in the final step in a Microstrategy report, where several volatile tables are joined to form the final result-table. The volatile tables are not particularly skewed (15% at the most). We've seen situations with around 6 to 8 volatile tables, with something like the following SQL:
select bla, bla from volatile1 left outer join volatile2 left outer join ... volatile8 inner join lookup1 inner join lookup2 ...
For a couple of the dimensions, where the volatile1-table is badly skewed, the optimizer decides to redistribute volatile1-table and do a merge join.
These steps can run for about 30 minutes, almost alone on the system, with only around 500.000 rows in the volatile1-table??
1) As we have 42 amps shouldn't this amount to a (non-skewed) merge join of a table with 22.000.000 rows? (And such a step would run quickly)
2) Is it because; it's an outer join?
3) Is it because volatile tables somehow are handled differently than permanent tables?
4) Why is the optimizer not able to detect the skew through the random Amp sample?
We are on R5.1.2, but we've also seen this behaviour on 5.0.3
Any information that could shed some light is very appreciated.
|Copyright 2016 - All Rights Reserved|
|Last Modified: 28 Jun 2020|