|
|
Archives of the TeradataForum
Message Posted: Thu, 18 Jul 2002 @ 14:57:40 GMT
Subj: | | Re: The Optimizer and Multi Inlists |
|
From: | | David Wellman |
Why not try building your list of values in a derived table and then joining the derived table to the table with the multi-part PI. I
tried the following on a table (vedcuser04) which has two columns (col01 and col03) defined as the PI. My requiredment is to find rows
where col01 in ('david','fred') and col04 = 1000102.
I coded it as:
sel *
from vedcuser04
where (col01,col03)
in (select 'david' (char(30) ,character set latin ),1000102 (date)
from sys_calendar.calendar
where calendar_date=date
union all
select 'fred' (char(30) ,character set latin ),1000102 (date)
from sys_calendar.calendar
where calendar_date=date);
The explain shows that the table with the multi-part PI (vedcuser04) is accessed last and is accessed using a merge join on the PI
columns (Step 7). Note that table vedcuser04 is only @4000 rows on a 20-amp system so even with a 'small' table you get the right join
plan:
Explanation -------------------------------------------------- | |
| 1) | First, we lock a distinct VEDB4NODE."pseudo table" for read on a RowHash to prevent global deadlock for VEDB4NODE.vedcuser04.
| |
| 2) | Next, we lock VEDB4NODE.vedcuser04 for read.
| |
| 3) | We do a single-AMP RETRIEVE step from SYS_CALENDAR.CALDATES by way of the unique primary index "SYS_CALENDAR.CALDATES.cdate = DATE"
with no residual conditions into Spool 1, which is built locally on that AMP. The size of Spool 1 is estimated with high confidence to be 1
row. The estimated time for this step is 0.15 seconds.
| |
| 4) | We do a single-AMP RETRIEVE step from SYS_CALENDAR.CALDATES by way of the unique primary index "SYS_CALENDAR.CALDATES.cdate = DATE"
with no residual conditions into Spool 1, which is built locally on that AMP. The size of Spool 1 is estimated with high confidence to be 2
rows. The estimated time for this step is 0.15 seconds.
| |
| 5) | We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by way of an all-rows scan into Spool 3, which is redistributed by hash code
to all AMPs. The size of Spool 3 is estimated with high confidence to be 2 rows. The estimated time for this step is 0.17 seconds.
| |
| 6) | We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool 4, which is built locally on the AMPs.
Then we do a SORT to order Spool 4 by row hash. The size of Spool 4 is estimated with high confidence to be 1 row. The estimated time for
this step is 0.04 seconds.
| |
| 7) | We do an all-AMPs JOIN step from VEDB4NODE.vedcuser04 by way of an all-rows scan with no residual conditions, which is joined to
Spool 4 (Last Use). VEDB4NODE.vedcuser04 and Spool 4 are joined using an inclusion merge join, with a join condition of
("(VEDB4NODE.vedcuser04.COL01 =) AND (VEDB4NODE.vedcuser04.COL03=)"). The result goes into Spool 2, which is built locally on the AMPs.
The size of Spool 2 is estimated with low confidence to be 2 rows. The estimated time for this step is 0.20 seconds.
| |
| 8) | Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
| |
| -> | The contents of Spool 2 are sent back to the user as the result of statement 1. The total estimated time is 0.72 seconds.
| |
To be able to use this in a 'real world' example you need to get at the application code, or you may have to set up your
application to create/populate a volatile table instead of a derived table, but having done one of those you'll probably find that you get
the desired explain plan and therefore performance.
Cheers,
Dave
Ward Analytics Ltd: Information in motion (www.ward-analytics.com)
| |