Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Tue, 16 Oct 2001 @ 10:32:32 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: Stats always desirable (was Join question)
 
From:   Petr Horsky

The detail just for those who want to go through it:

INSERT INTO ERROR_OVERVIEW
(KEY_ID_1
,Date_Valid
,Branch_Code
,SubBranch_Code
,DI_Application_Code
,Table_Name
,Error_Code
,Value
,Record_Active)

SELECT
  E.Account_Nbr
 ,E.DWE_DATE
 ,P.Acct_Branch
 ,P.Acct_Subbranch
 ,DI.DI_Application_Code
 ,'Loan_Account'
 ,E.Error_Code
 ,Coalesce(E.Value, '')
 ,Case when A.Live_Account='Y'
           then 'Y' else 'N' end as Record_Active

FROM ((DWKB_errors.Loan_Account E

  inner join DWKB.Customer_Account AS P
    on E.Account_Nbr =P.Account_Nbr)

  inner join DWKB.C_DI_Application AS DI
    on P.DI_Application_Nbr=DI.DI_Application_Nbr)

  inner join DWKB.C_Account_Status AS A
    on P.Acct_Status_Code=A.Acct_Status_Code

WHERE E.DWE_DATE= DWKB_META.MD_DWE_DATE.DWE_DATE;

help statistics dwkb_errors.Loan_Account

        Date      Time      Unique Values   Column Names
        01/10/07  20:54:23          8,559   Account_Nbr

help statistics dwkb.Customer_Account

        Date      Time      Unique Values   Column Names
        01/10/07  19:53:16      2,537,822   Account_Nbr
        01/10/07  19:53:34      1,570,692   Party_Id
        01/10/07  19:53:40              8   DI_Application_Nbr
        01/10/07  19:53:48            907   Account_Type
        01/10/07  19:53:55            469   General_Ledger_Code
        01/10/07  19:54:05            102   Acct_Branch
        01/10/07  19:54:12             15   Acct_Subbranch

* In fact, there are stats on Acct_Status_Code missing. I tested it and adding this stats does not improve the bad performance described.

help statistics dwkb.C_DI_Application

        Date      Time      Unique Values   Column Names
        01/10/07  19:54:44             10   DI_Application_Nbr

help statistics dwkb.C_Account_Status

        Date      Time      Unique Values   Column Names
        01/10/07  19:54:29             15   Acct_Status_Code
        01/10/07  19:54:30              2   Used_for_Deposits
        01/10/07  19:54:30              2   Used_for_Loans
        01/10/07  19:54:30              2   Live_Account

help statistics dwkb_meta.MD_DWE_Date

        Date      Time      Unique Values   Column Names
        01/10/15  05:42:57              1   DWE_Date

help statistics dwkb_errors.Error_Overview

        Date      Time      Unique Values   Column Names
        01/10/15  16:26:03        575,494   Key_Id_1

Explain without stats on dwkb_errors.Loan_Account.DWE_Date (26 secs):

       
  1)First, we lock a distinct DWKB_META."pseudo table" for read on a RowHash to prevent global deadlock for DWKB_META.MD_DWE_DATE.  
  2)Next, we lock a distinct DWKB_ERRORS."pseudo table" for write on a RowHash to prevent global deadlock for DWKB_ERRORS.ERROR_OVERVIEW.  
  3)We lock a distinct DWKB_errors."pseudo table" for read on a RowHash to prevent global deadlock for DWKB_errors.E.  
  4)We lock a distinct DWKB."pseudo table" for read on a RowHash to prevent global deadlock for DWKB.P.  
  5)We lock a distinct DWKB."pseudo table" for read on a RowHash to prevent global deadlock for DWKB.A.  
  6)We lock a distinct DWKB."pseudo table" for read on a RowHash to prevent global deadlock for DWKB.DI.  
  7)We lock DWKB_META.MD_DWE_DATE for read, we lock DWKB_ERRORS.ERROR_OVERVIEW for write, we lock DWKB_errors.E for read, we lock DWKB.P for read, we lock DWKB.A for read, and we lock DWKB.DI for read.  
  8)We execute the following steps in parallel.  
   
  1) We do an all-AMPs RETRIEVE step from DWKB_META.MD_DWE_DATE by way of an all-rows scan with no residual conditions into Spool 2, which is duplicated on all AMPs. The size of Spool 2 is estimated with high confidence to be 24 rows. The estimated time for this step is 0.16 seconds.
 
   
  2) We do an all-AMPs RETRIEVE step from DWKB.DI by way of an all-rows scan with no residual conditions into Spool 3, which is duplicated on all AMPs. The size of Spool 3 is estimated with high confidence to be 240 rows. The estimated time for this step is 0.16 seconds.
 
   
  3) We do an all-AMPs RETRIEVE step from DWKB.P by way of an all-rows scan with no residual conditions into Spool 4, which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 4 by row hash. The size of Spool 4 is estimated with high confidence to be 2,537,822 rows. The estimated time for this step is 1 minute and 40 seconds.
 
  9)We execute the following steps in parallel.  
   
  1) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an all-rows scan, which is joined to DWKB_errors.E. Spool 2 and DWKB_errors.E are joined using a product join, with a join condition of ("DWKB_errors.E.DWE_Date = Spool_2.DWE_Date"). The result goes into Spool 5, which is built locally on the AMPs. Then we do a SORT to order Spool 5 by row hash. The size of Spool 5 is estimated with index join confidence to be 59,660 rows. The estimated time for this step is 2.00 seconds.  
 



* It will be about 8,500 rows *

       
   
  2) We do an all-AMPs JOIN step from DWKB.A by way of an all-rows scan with no residual conditions, which is joined to Spool 3 (Last Use). DWKB.A and Spool 3 are joined using a product join, with a join condition of ("(1=1)"). The result goes into Spool 6, which is built locally on the AMPs. Then we do a SORT to order Spool 6 by row hash. The size of Spool 6 is estimated with high confidence to be 150 rows. The estimated time for this step is 0.07 seconds.
 
  10)We do an all-AMPs JOIN step from Spool 4 (Last Use) by way of a RowHash match scan, which is joined to Spool 5 (Last Use). Spool 4 and Spool 5 are joined using a merge join, with a join condition of ("Spool_5.Account_Nbr = Spool_4.Account_Nbr"). The result goes into Spool 7, which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 7 by row hash. The size of Spool 7 is estimated with index join confidence to be 59,660 rows. The estimated time for this step is 11.71 seconds.  
  11)We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of a RowHash match scan, which is joined to Spool 7 (Last Use). Spool 6 and Spool 7 are joined using a merge join, with a join condition of ("(Spool_7.Acct_Status_Code = Spool_6.Acct_Status_Code) AND (Spool_7.DI_Application_Nbr = Spool_6.DI_Application_Nbr)"). The result goes into Spool 1, which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 1 by row hash. The size of Spool 1 is estimated with index join confidence to be 59,660 rows. The estimated time for this step is 1 minute and 6 seconds.  
  12)We do a MERGE into DWKB_ERRORS.ERROR_OVERVIEW from Spool 1 (Last Use).  
  13)Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.  
  -> No rows are returned to the user as the result of statement 1.  


     collect statistics dwkb_errors.Loan_Account column DWE_Date;

     help statistics dwkb_errors.Loan_Account;

     Date      Time      Unique Values   Column Names
     01/10/15  18:09:03            208   DWE_Date
     01/10/07  20:54:23          8,559   Account_Nbr

Explain after adding stats on dwkb_errors.Loan_Account.DWE_Date (one hour 35 minutes):

Explanation
--------------------------------------------------
 
  1)First, we lock a distinct DWKB_META."pseudo table" for read on a RowHash to prevent global deadlock for DWKB_META.MD_DWE_DATE.  
  2)Next, we lock a distinct DWKB_ERRORS."pseudo table" for write on a RowHash to prevent global deadlock for DWKB_ERRORS.ERROR_OVERVIEW.  
  3)We lock a distinct DWKB_errors."pseudo table" for read on a RowHash to prevent global deadlock for DWKB_errors.E.  
  4)We lock a distinct DWKB."pseudo table" for read on a RowHash to prevent global deadlock for DWKB.P.  
  5)We lock a distinct DWKB."pseudo table" for read on a RowHash to prevent global deadlock for DWKB.A.  
  6)We lock a distinct DWKB."pseudo table" for read on a RowHash to prevent global deadlock for DWKB.DI.  
  7)We lock DWKB_META.MD_DWE_DATE for read, we lock DWKB_ERRORS.ERROR_OVERVIEW for write, we lock DWKB_errors.E for read, we lock DWKB.P for read, we lock DWKB.A for read, and we lock DWKB.DI for read.  
  8)We do an all-AMPs RETRIEVE step from DWKB_META.MD_DWE_DATE by way of an all-rows scan with no residual conditions into Spool 2, which is duplicated on all AMPs. The size of Spool 2 is estimated with high confidence to be 24 rows. The estimated time for this step is 0.16 seconds.  
  9)We execute the following steps in parallel.  
   
  1) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an all-rows scan, which is joined to DWKB_errors.E. Spool 2 and DWKB_errors.E are joined using a product join, with a join condition of ("DWKB_errors.E.DWE_Date = Spool_2.DWE_Date"). The result goes into Spool 3, which is built locally on the AMPs. The size of Spool 3 is estimated with low confidence to be 287 rows. The estimated time for this step is 0.88 seconds.  
 



* No, it will be about 8,500 rows *

       
   
  2) We do an all-AMPs RETRIEVE step from DWKB.P by way of an all-rows scan with no residual conditions into Spool 4, which is redistributed by hash code to all AMPs. The input table will not be cached in memory, but it is eligible for synchronized scanning. The size of Spool 4 is estimated with high confidence to be 2,537,822 rows. The estimated time for this step is 59.89 seconds.
 
  10)We execute the following steps in parallel.  
   
  1) We do an all-AMPs JOIN step from Spool 3 (Last Use) by way of an all-rows scan, which is joined to Spool 4 (Last Use). Spool 3 and Spool 4 are joined using a product join, with a join condition of ("Spool_3.Account_Nbr = Spool_4.Account_Nbr"). The result goes into Spool 5, which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 5 by row hash. The size of Spool 5 is estimated with low confidence to be 287 rows. The estimated time for this step is 32.57 seconds.
 
   
  2) We do an all-AMPs RETRIEVE step from DWKB.DI by way of an all-rows scan with no residual conditions into Spool 6, which is duplicated on all AMPs. Then we do a SORT to order Spool 6 by row hash. The size of Spool 6 is estimated with high confidence to be 240 rows. The estimated time for this step is 0.03 seconds.
 
  11)We do an all-AMPs JOIN step from DWKB.A by way of a RowHash match scan with no residual conditions, which is joined to Spool 5 (Last Use). DWKB.A and Spool 5 are joined using a merge join, with a join condition of ("Spool_5.Acct_Status_Code = DWKB.A.Acct_Status_Code"). The result goes into Spool 7, which is built locally on the AMPs. Then we do a SORT to order Spool 7 by row hash. The size of Spool 7 is estimated with index join confidence to be 287 rows. The estimated time for this step is 0.06 seconds.  
  12)We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of a RowHash match scan, which is joined to Spool 7 (Last Use). Spool 6 and Spool 7 are joined using a merge join, with a join condition of ("Spool_7.DI_Application_Nbr = Spool_6.DI_Application_Nbr"). The result goes into Spool 1, which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 1 by row hash. The size of Spool 1 is estimated with index join confidence to be 287 rows. The estimated time for this step is 0.07 seconds.  
  13)We do a MERGE into DWKB_ERRORS.ERROR_OVERVIEW from Spool 1 (Last Use).  
  14)Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.  
  ->No rows are returned to the user as the result of statement 1.  



     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023