Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Thu, 20 Sep 2012 @ 18:53:06 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: Query performance issue - running long
 
From:   Clark, Dave

Your problem is with the estimates, not with skew. From the DBQLSTEPTBL information you provided we see that the Optimizer estimates off by a huge factor in step 12 ( estimated: 1,195,479 versus actual: 2,071,122,778). But the estimate actually starts to go awry in step 2.

     StepLev1Num   StepLev2Num   StepName   EstRowCount   RowCount
        1             0          MLK           0          648
        1             0          INS           ?          ?
        1             0          INS           ?          ?
        1             0          INS           ?          ?
        2             1          RET           648        31752  <<<<<
        2             2          RET           1          1
        3             0          RET           2          1
        4             0          RET           3          1
        5             0          RET           4          1
        6             0          RET           5          1
        7             0          RET           6          1
        8             0          RET           7          1
        9             0          RET           8          1
       10             1          SRD           0          8
       10             2          JIN           103032     5334336
       11             0          JIN           824256     42674688
       12             0          JIN           1195479    2071122778  <<<<<
       13             0          SUM           1195479    598649948
       14             0          SUM           1272       1049
       15             0          RET           1272       1049
       16             0          Edt           0          648
       17             0          RSF           ?          ?

The corresponding step is:

     Step2   1) We do an all-AMPs RETRIEVE step from
                    PROD_EDW_MIG.DIM_GEOGRAPHY in view VW_CIRCLE by way of an
                    all-rows scan with a condition of (
                    "(PROD_EDW_MIG.DIM_GEOGRAPHY in view VW_CIRCLE.CODE_TYPE =
                    'Circle') AND (((PROD_EDW_MIG.DIM_GEOGRAPHY in view
                    VW_CIRCLE.FINAL_CIRCLE_NAME > 'NATIONAL') AND
                    (PROD_EDW_MIG.DIM_GEOGRAPHY in view
                    VW_CIRCLE.FINAL_CIRCLE_NAME < 'NOT APPLICABLE')) OR
                    ((PROD_EDW_MIG.DIM_GEOGRAPHY in view
                    VW_CIRCLE.FINAL_CIRCLE_NAME < 'NATIONAL') OR
                    (((PROD_EDW_MIG.DIM_GEOGRAPHY in view
                    VW_CIRCLE.FINAL_CIRCLE_NAME > 'NOT APPLICABLE') AND
                    (PROD_EDW_MIG.DIM_GEOGRAPHY in view
                    VW_CIRCLE.FINAL_CIRCLE_NAME < 'NOT AVAILABLE')) OR
                    (PROD_EDW_MIG.DIM_GEOGRAPHY in view
                    VW_CIRCLE.FINAL_CIRCLE_NAME > 'NOT AVAILABLE'))))") into
                    Spool 9 (all_amps), which is duplicated on all AMPs.  The
                    size of Spool 9 is estimated with low confidence to be 648
                    rows (18,792 bytes).  The estimated time for this step is
                    0.03 seconds

This may be an issue with the optimizer but it may also be that your statistics are stale. I suggest you check the statistics on PROD_EDW_MIG.DIM_GEOGRAPHY.

-dave.clark



     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023