Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Wed, 21 Jan 2004 @ 15:18:32 GMT


     
  <Prev Next>  
<<First
<Prev
Next> Last>>  


Subj:   Misunderstanding of join index
 
From:   Anomy Anom

<-- Anonymously Posted: Wednesday, January 21, 2004 10:00 -->

I everybody,

I'm trying to experimente the join index and when I look in the explain plan, the system never using it. Something is probably missing in my understanding on JI consideration.

Let's explain my JI.

The table contains 500 000 000 rows for 8 years.

I try to do this small JI:

CREATE JOIN INDEX DB.T1_JI AS
SELECT FIELD_1,
             FIELD_2,
             SUM(COST) (NAMED AGR_COST)
FROM DB.T1
WHERE PA_DAT > '2002-12-31'
GROUP BY 1,2
PRIMARY INDEX (FIELD_2);

I collect the STATISTICS on T1_JI.FIELD_1 and T1_JI.FIELD_2. I do my testing with and without STATISTICS on the main table, I though it's going to help the optimiser to use the JI.

I do this SELECT statement :

SELECT FIELD_1,
             FIELD_2,
             SUM(COST) (NAMED AGR_COST)
FROM DB.T1
WHERE PA_DAT > '2002-12-31'
GROUP BY 1,2;

The result returns 54000 rows.

I got this EXPLAIN plan :

Explanation
--------------------------------------------------
 
  1)First, we lock a distinct DB."pseudo table" for read on a RowHash to prevent global deadlock for DONNE_DNOML.T1.  
  2)Next, we lock DB.T1 for read.  
  3)We do an all-AMPs SUM step to aggregate from DB.T1 by way of an all-rows scan with a condition of ("DB.T1.PA_DAT > DATE '2002-12-31'"), and the grouping identifier in field 1. Aggregate Intermediate Results are computed globally, then placed in Spool 3. The input table will not be cached in memory, but it is eligible for synchronized scanning. The aggregate spool file will not be cached in memory. The size of Spool 3 is estimated with high confidence to be 65,716,964 rows. The estimated time for this step is 55 minutes and 8 seconds.  
  4)We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool 1 (group_amps), which is built locally on the AMPs. The result spool file will not be cached in memory. The size of Spool 1 is estimated with high confidence to be 65,716,964 rows. The estimated time for this step is 2 minutes and 58 seconds.  
  5)Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.  


I try several testing and the utilisation of the JI is never append :

SELECT FIELD_1,
             FIELD_2,
             SUM(COST) (NAMED AGR_COST)
FROM DB.T1
WHERE PA_DAT > '2002-12-31'
AND FIELD_2 = 96167
GROUP BY 1,2;
Explanation
--------------------------------------------------
 
  1)First, we lock a distinct DB."pseudo table" for read on a RowHash to prevent global deadlock for DB.T1.  
  2)Next, we lock DB.T1 for read.  
  3)We do an all-AMPs SUM step to aggregate from DB.T1 by way of an all-rows scan with a condition of ("(DB.T1.PA_DAT > DATE '2002-12-31') AND (DB.T1.FIELD_2 = 96167)"), and the grouping identifier in field 1. Aggregate Intermediate Results are computed globally, then placed in Spool 3. The input table will not be cached in memory, but it is eligible for synchronized scanning. The size of Spool 3 is estimated with high confidence to be 20,236 rows. The estimated time for this step is 29 minutes and 2 seconds.  
  4)We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with high confidence to be 20,236 rows. The estimated time for this step is 0.12 seconds.  
  5)Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.  

What's wrong in my understanding????



     
  <Prev Next>  
<<First
<Prev
Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023