|
|
Archives of the TeradataForum
Message Posted: Wed, 21 Jan 2004 @ 15:18:32 GMT
Subj: | | Misunderstanding of join index |
|
From: | | Anomy Anom |
<-- Anonymously Posted: Wednesday, January 21, 2004 10:00 -->
I everybody,
I'm trying to experimente the join index and when I look in the explain plan, the system never using it. Something is probably missing in my
understanding on JI consideration.
Let's explain my JI.
The table contains 500 000 000 rows for 8 years.
I try to do this small JI:
CREATE JOIN INDEX DB.T1_JI AS
SELECT FIELD_1,
FIELD_2,
SUM(COST) (NAMED AGR_COST)
FROM DB.T1
WHERE PA_DAT > '2002-12-31'
GROUP BY 1,2
PRIMARY INDEX (FIELD_2);
I collect the STATISTICS on T1_JI.FIELD_1 and T1_JI.FIELD_2. I do my testing with and without STATISTICS on the main table, I though
it's going to help the optimiser to use the JI.
I do this SELECT statement :
SELECT FIELD_1,
FIELD_2,
SUM(COST) (NAMED AGR_COST)
FROM DB.T1
WHERE PA_DAT > '2002-12-31'
GROUP BY 1,2;
The result returns 54000 rows.
I got this EXPLAIN plan :
Explanation -------------------------------------------------- | |
| 1) | First, we lock a distinct DB."pseudo table" for read on a RowHash to prevent global deadlock for DONNE_DNOML.T1.
| |
| 2) | Next, we lock DB.T1 for read.
| |
| 3) | We do an all-AMPs SUM step to aggregate from DB.T1 by way of an all-rows scan with a condition of ("DB.T1.PA_DAT > DATE '2002-12-31'"), and
the grouping identifier in field 1. Aggregate Intermediate Results are computed globally, then placed in Spool 3. The input table will not be
cached in memory, but it is eligible for synchronized scanning. The aggregate spool file will not be cached in memory. The size of Spool 3 is
estimated with high confidence to be 65,716,964 rows. The estimated time for this step is 55 minutes and 8 seconds.
| |
| 4) | We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool 1 (group_amps), which is built locally on the
AMPs. The result spool file will not be cached in memory. The size of Spool 1 is estimated with high confidence to be 65,716,964 rows. The
estimated time for this step is 2 minutes and 58 seconds.
| |
| 5) | Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
| |
I try several testing and the utilisation of the JI is never append :
SELECT FIELD_1,
FIELD_2,
SUM(COST) (NAMED AGR_COST)
FROM DB.T1
WHERE PA_DAT > '2002-12-31'
AND FIELD_2 = 96167
GROUP BY 1,2;
Explanation -------------------------------------------------- | |
| 1) | First, we lock a distinct DB."pseudo table" for read on a RowHash to prevent global deadlock for DB.T1.
| |
| 2) | Next, we lock DB.T1 for read.
| |
| 3) | We do an all-AMPs SUM step to aggregate from DB.T1 by way of an all-rows scan with a condition of ("(DB.T1.PA_DAT > DATE '2002-12-31') AND
(DB.T1.FIELD_2 = 96167)"), and the grouping identifier in field 1. Aggregate Intermediate Results are computed globally, then placed in Spool 3.
The input table will not be cached in memory, but it is eligible for synchronized scanning. The size of Spool 3 is estimated with high confidence
to be 20,236 rows. The estimated time for this step is 29 minutes and 2 seconds.
| |
| 4) | We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool 1 (group_amps), which is built locally on the
AMPs. The size of Spool 1 is estimated with high confidence to be 20,236 rows. The estimated time for this step is 0.12 seconds.
| |
| 5) | Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
| |
What's wrong in my understanding????
| |