Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Thu, 11 Aug 2005 @ 15:18:08 GMT


     
  <Prev Next>  
<<First
<Prev
Next> Last>>  


Subj:   Primary Index vs Single Condition
 
From:   Phil.Oldfield

I have a general question about query performance

We have a large (150 Gb) transaction table with a primary index of transaction_no(decimal(18,0)) - Regular Stats collected

I am trying to retrieve last 13 weeks of data from this table. This is about 25% of all the rows

Query1

Using a working table containing transaction_no's for the last 13 weeks and stats collected

Perform a primary index join to the large tranaction table

- Query takes 90 minutes

Explanation
--------------------------------------------------
 
  4)We do an all-AMPs JOIN step from PRODDB02.Sales_Transaction_Line by way of an all-rows scan with no residual conditions, which is joined to PRODDB99.SALES_TRANS_HEADER_2_WEEKS by way of an all-rows scan with no residual conditions. PRODDB02.Sales_Transaction_Line and PRODDB99.SALES_TRANS_HEADER_2_WEEKS are joined using an inclusion merge join, with a join condition of ("PRODDB02.Sales_Transaction_Line.TRANSACTION_NO = PRODDB99.SALES_TRANS_HEADER_2_WEEKS.TRANSACTION_NO"). The input tables PRODDB02.Sales_Transaction_Line and PRODDB99.SALES_TRANS_HEADER_2_WEEKS will not be cached in memory. The result goes into Spool 1 (all_amps), which is built locally on the AMPs. Then we do a SORT to order Spool 1 by row hash. The result spool file will not be cached in memory. The size of Spool 1 is estimated with low confidence to be 310,739,176 rows. The estimated time for this step is 34 minutes and 44 seconds.  
  4)We do an all-AMPs MERGE into PRODDB99.SALES_TRANS_LINE_2_WEEKS from Spool 1 (Last Use).  

Query 2

Apply a where condition on the Transaction date to the large transaction table

- Query takes 30 Minutes

Explanation
--------------------------------------------------
 
  2)Next, we do an all-AMPs RETRIEVE step from PRODDB02.Sales_Transaction_Line by way of an all-rows scan with a condition of ("(PRODDB02.Sales_Transaction_Line.TILL_TRANSACTION_DATE < DATE) AND (PRODDB02.Sales_Transaction_Line.TILL_TRANSACTION_DATE > ((DATE '2005-08- 11')- 91 ))") into Spool 1 (group_amps), which is built locally on the AMPs. The input table will not be cached in memory, but it is eligible for synchronized scanning. The result spool file will not be cached in memory. The size of Spool 1 is estimated with high confidence to be 232,039,222 rows. The estimated time for this step is 18 minutes and 49 seconds.  
  ->The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 18 minutes and 49 seconds.  


Is it reasonable to assume that a simple condition on a table will retrieve rows quicker than a a primary index join ?

we are on v2r5



     
  <Prev Next>  
<<First
<Prev
Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023