Home Page for the TeradataForum

Archives of the TeradataForum

Message Posted: Mon, 09 May 2005 @ 08:56:32 GMT

  <Prev Next>  
Next> Last>>  

Subj:   Why does a badly skewed join run that badly?
From:   Ole Dunweber


Sometimes I wonder why a badly skewed join runs that badly. The situation arises in the final step in a Microstrategy report, where several volatile tables are joined to form the final result-table. The volatile tables are not particularly skewed (15% at the most). We've seen situations with around 6 to 8 volatile tables, with something like the following SQL:

     select bla, bla
       from volatile1 left outer join
            volatile2 left outer join
            volatile8 inner join
            lookup1   inner join

For a couple of the dimensions, where the volatile1-table is badly skewed, the optimizer decides to redistribute volatile1-table and do a merge join.

These steps can run for about 30 minutes, almost alone on the system, with only around 500.000 rows in the volatile1-table??


1) As we have 42 amps shouldn't this amount to a (non-skewed) merge join of a table with 22.000.000 rows? (And such a step would run quickly)

2) Is it because; it's an outer join?

3) Is it because volatile tables somehow are handled differently than permanent tables?

4) Why is the optimizer not able to detect the skew through the random Amp sample?

We are on R5.1.2, but we've also seen this behaviour on 5.0.3

Any information that could shed some light is very appreciated.

Ole Dunweber
Coop Nordic

  <Prev Next>  
Next> Last>>  
  Top Home Privacy Feedback  
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023