Archives of the TeradataForum

Message Posted: Sat, 06 Nov 2004 @ 08:11:46 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Re: Can anyone throw some light on the difference...

From:		Walter, Todd A

I found this peculiar difference in the explain for same query but difference in number of values in the IN clause of the query.

If I submit the whole lot of values, the explain shows a ALL AMP Retrieve and the query takes ages, but the same query with about 25-30 odd values does a Single AMP retrieve.

On release V2R5.1 and prior, 254 is the limit of the length of the IN list after which the optimizer will switch to scan mode.

One of the features of V2R6.0 is a feature called IN list optimization. This feature will address the long IN lists by doing what many of you do manually today. It will be capable of creating a spool file from the IN list and then performing a join. It will also be capable of combining multiple IN lists against multiple columns into special new steps or into a spool file. These combined IN lists will also be used, possibly in combination with joins to other tables, to create the cartesian set for a join to the multi-column PI of another table (eg a fact table). There is a new limit of 1024 items or combinations.

     SEL ... FROM DAILY_SALES_TBL SALES, SKU_TBL SKU
     WHERE SALES.LOC_NBR IN (1,2,3) AND
                SALES.DAY_DT IN ('2004-01-15', '2004-01-31') AND
                    SALES.SKU_ID = SKU.SKU_ID AND
                SKU.CATG_NBR = 10;

Step 1) JOIN from sku by way of an all-rows scan with a condition of (sku.CATG_NBR = 10.)") to rows (DAY_DT,LOCN_NBR) values (DATE '2004-01-15', 1), (DATE '2004-01-31', 1), (DATE '2004-01-15', 2), (DATE '2004-01-31', 2), (DATE '2004-01-15',3), (DATE '2004-01-31', 3). sku and rows are joined using a product join, with a join condition of ("(1=1)"). The result goes into = Spool 2, which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 2 by row hash.

Step 2) JOIN from Spool 2 to sales with no residual conditions. Spool 2 and sales are joined using a merge join, with a join condition of ("(sales.SKU_ID = SKU_ID) AND ((sales.DAY_DT = DAY_DT) AND (sales.LOCN_NBR = LOCN_NBR ))").

     SEL ... FROM DAILY_SALES_TBL SALES
     WHERE SALES.LOC_NBR IN (1,2,3) AND
                 SALES.SKU_ID IN (4,5,6);
     -- NUSI on SALES.SKU_ID

Step 1) INSERT into Spool 2 with rows (SKU_ID, LOCN_NBR) values (4., 1), (5., 1), (6., 1), (4., 2), (5., 2), (6., 2), (4., 3), (5., 3), (6., 3), which is redistributed by hash code to all AMPs.

Step 2) RETRIEVE from Spool 2 by way of an all-rows scan into Spool 3, which is duplicated on all AMPs. Then we do a SORT to order Spool 3 by row hash.

Step 3) JOIN from Spool 3 by way of a RowHash match scan to daily_sales_tbl by way of a traversal of index # 4 without accessing the base table. Spool 3 and daily_sales_tbl are joined using a merge join, with a join condition of ("(sku_id = daily_sales_tbl.sku_id) AND = (locn_nbr= daily_sales_tbl.locn_nbr)"). The result goes into Spool 4 which is built locally on the AMPs.

Step 4) RETRIEVE from daily_sales_tbl by way of row ids from Spool 4 with no residual conditions into Spool 1, which is built locally on the AMPs.


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference