Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Mon, 09 May 2005 @ 08:56:32 GMT


     
  <Prev Next>  
<<First
<Prev
Next> Last>>  


Subj:   Why does a badly skewed join run that badly?
 
From:   Ole Dunweber

Hello,

Sometimes I wonder why a badly skewed join runs that badly. The situation arises in the final step in a Microstrategy report, where several volatile tables are joined to form the final result-table. The volatile tables are not particularly skewed (15% at the most). We've seen situations with around 6 to 8 volatile tables, with something like the following SQL:

     select bla, bla
       from volatile1 left outer join
            volatile2 left outer join
     ...
            volatile8 inner join
            lookup1   inner join
            lookup2
     ...

For a couple of the dimensions, where the volatile1-table is badly skewed, the optimizer decides to redistribute volatile1-table and do a merge join.

These steps can run for about 30 minutes, almost alone on the system, with only around 500.000 rows in the volatile1-table??

Questions:

1) As we have 42 amps shouldn't this amount to a (non-skewed) merge join of a table with 22.000.000 rows? (And such a step would run quickly)

2) Is it because; it's an outer join?

3) Is it because volatile tables somehow are handled differently than permanent tables?

4) Why is the optimizer not able to detect the skew through the random Amp sample?


We are on R5.1.2, but we've also seen this behaviour on 5.0.3

Any information that could shed some light is very appreciated.


Ole Dunweber
Coop Nordic



     
  <Prev Next>  
<<First
<Prev
Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023