Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Fri, 05 Sep 2008 @ 16:49:27 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: Product join without small table being duplcated on all AMPs
 
From:   Michael Larkins

Hi David:

I guess the real question that I would have is: how long it took the SQL to actually run vs the estimate in the EXPLAIN?

Regardless, the optimizer is attempting to always choose the optimal path. One of the unknowns for us is the number of AMPs on your system. We know from the EXPLAIN that your small table has 1632 rows with high confidence. Lets say you have 100 AMPS, which means that 1632 rows would need to be duplicated 100 times for a total of 163,200 rows with high confidence. It seems to think this is less efficient than redistributing both tables.

I would also think that the number and size of columns being selected from the large table might impact the decision of duplication vs redistribution. The more spool needed (compression being used) might make it more likely to refrain from redistributing the large table. At the same time, the "larger" table is assumed to be the larger table because the confidence level is low.

Why are there no statistics on this table? Does it change the execution plan when there are statistics on that table too? With statistics you might see the optimizer choose to duplicate rather than redistribute. It can only do the best that it can with the information available to it.

Of course, a lot of this reply is speculation on my part based on what we are seeing in the EXPLAIN and little known about your system. Therefore, it should not be taken as Gospel on how the optimizer works.


Regards,

Michael Larkins
Certified Teradata Master
Certified Teradata SQL Instructor



     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023