Left Outer Join and Explain Plan [From Aniruddha Mitra: Tue, 29 Jul 2003 @ 19:08 GMT]

Message Posted: Tue, 29 Jul 2003 @ 19:08:05 GMT

Subj:

Left Outer Join and Explain Plan

From:

Aniruddha Mitra

Hi:

We have 2 sqls that eventually returns the same number of rows, however one runs substantially longer than the other. I am little bit confused, as the sql with left outer join return quicker than the sql with normal join. It would be great if someone can explain.

Thanks

ps. Microstrategy engine creates some temp tables on the fly, and do not run stats. SO you might see some selection with low confidence. All other base tables has updated stats.

--------------------------------------------------------------------- Sql with normal join: select a11.COUNTRY_ID COUNTRY_ID, a12.COUNTRY_NAME COUNTRY_NAME, coalesce(pa1.COUNTRY_ID, pa2.COUNTRY_ID) COUNTRY_ID0, coalesce(pa1.DIST_CTR_ID, pa2.DIST_CTR_ID) DIST_CTR_ID, a11.DIST_CTR_NAME DIST_CTR_NAME, pa1.WJXBFS1 WJXBFS1, pa2.WJXBFS1 WJXBFS2 from ZZSP00 pa1 full outer join ZZSP01 pa2 on (pa1.COUNTRY_ID = pa2.COUNTRY_ID and pa1.DIST_CTR_ID = pa2.DIST_CTR_ID) join LU_DIST_CTR a11 on (coalesce(pa1.COUNTRY_ID, pa2.COUNTRY_ID) = a11.COUNTRY_ID and coalesce(pa1.DIST_CTR_ID, pa2.DIST_CTR_ID) = a11.DIST_CTR_ID) join LU_COUNTRY a12 on (a11.COUNTRY_ID = a12.COUNTRY_ID);

Explanation --------------------------------------------------

1)

First, we lock a distinct VMALL."pseudo table" for read on a RowHash to prevent global deadlock for VMALL.a11.

2)

Next, we lock a distinct VMALL."pseudo table" for read on a RowHash to prevent global deadlock for VMALL.a12.

3)

We lock VMALL.a11 for read, and we lock VMALL.a12 for read.

4)

We do an all-AMPs JOIN step from VMALL.pa1 by way of a RowHash match scan with no residual conditions, which is joined to VMALL.pa2. VMALL.pa1 and VMALL.pa2 are full outer joined using a merge join, with a join condition of ("(VMALL.pa1.DIST_CTR_ID = VMALL.pa2.DIST_CTR_ID) AND (VMALL.pa1.COUNTRY_ID = VMALL.pa2.COUNTRY_ID)"). The result goes into Spool 2 (all_amps), which is duplicated on all AMPs. The size of Spool 2 is estimated with low confidence to be 240 rows. The estimated time for this step is 0.15 seconds.

5)

We do an all-AMPs JOIN step from VMALL.a11 by way of an all-rows scan with no residual conditions, which is joined to Spool 2 (Last Use). VMALL.a11 and Spool 2 are joined using a product join, with a join condition of ("((( CASE WHEN NOT ({RightTable}.COUNTRY_ID IS NULL )THEN ({RightTable}.COUNTRY_ID) ELSE ({RightTable}.COUNTRY_ID) END ))= VMALL.a11.Country_Id) AND (((CASE WHEN NOT ({RightTable}.DIST_CTR_ID IS NULL )THEN ({RightTable}.DIST_CTR_ID) ELSE ({RightTable}.DIST_CTR_ID) END ))= VMALL.a11.Dist_Ctr_Id)"). The result goes into Spool 7 (all_amps), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 7 by row hash. The size of Spool 7 is estimated with index join confidence to be 30 rows. The estimated time for this step is 0.18 seconds.

6)

We do an all-AMPs JOIN step from VMALL.a12 by way of a RowHash match scan with no residual conditions, which is joined to Spool 7 (Last Use). VMALL.a12 and Spool 7 are joined using a merge join, with a join condition of ("Country_Id = VMALL.a12.Country_Id"). The result goes into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with index join confidence to be 30 rows. The estimated time for this step is 0.13 seconds.

7)

Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.

->

The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.46 seconds. ------------------------------------------------------------------- Sql with left outer join: select a11.COUNTRY_ID COUNTRY_ID, a12.COUNTRY_NAME COUNTRY_NAME, coalesce(pa1.COUNTRY_ID, pa2.COUNTRY_ID) COUNTRY_ID0, coalesce(pa1.DIST_CTR_ID, pa2.DIST_CTR_ID) DIST_CTR_ID, a11.DIST_CTR_NAME DIST_CTR_NAME, pa1.WJXBFS1 WJXBFS1, pa2.WJXBFS1 WJXBFS2 from ZZSP00 pa1 full outer join ZZSP01 pa2 on (pa1.COUNTRY_ID = pa2.COUNTRY_ID and pa1.DIST_CTR_ID = pa2.DIST_CTR_ID) left outer join LU_DIST_CTR a11 on (coalesce(pa1.COUNTRY_ID, pa2.COUNTRY_ID) = a11.COUNTRY_ID and coalesce(pa1.DIST_CTR_ID, pa2.DIST_CTR_ID) = a11.DIST_CTR_ID) left outer join LU_COUNTRY a12 on (a11.COUNTRY_ID = a12.COUNTRY_ID);

1)

First, we lock a distinct VMALL."pseudo table" for read on a RowHash to prevent global deadlock for VMALL.a11.

2)

Next, we lock a distinct VMALL."pseudo table" for read on a RowHash to prevent global deadlock for VMALL.a12.

3)

We lock VMALL.a11 for read, and we lock VMALL.a12 for read.

4)

We execute the following steps in parallel.

1) We do an all-AMPs RETRIEVE step from VMALL.a11 by way of an all-rows scan with no residual conditions into Spool 2 (all_amps), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 2 by row hash. The size of Spool 2 is estimated with high confidence to be 17 rows. The estimated time for this step is 0.13 seconds.

2) We do an all-AMPs JOIN step from VMALL.pa1 by way of a RowHash match scan with no residual conditions, which is joined to VMALL.pa2. VMALL.pa1 and VMALL.pa2 are full outer joined using a merge join, with a join condition of ("(VMALL.pa1.DIST_CTR_ID = VMALL.pa2.DIST_CTR_ID) AND (VMALL.pa1.COUNTRY_ID = VMALL.pa2.COUNTRY_ID)"). The result goes into Spool 3 (all_amps), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 3 by row hash. The size of Spool 3 is estimated with low confidence to be 30 rows. The estimated time for this step is 0.15 seconds.

5)

We do an all-AMPs JOIN step from VMALL.a12 by way of a RowHash match scan with no residual conditions, which is joined to Spool 2 (Last Use). VMALL.a12 and Spool 2 are right outer joined using a merge join, with a join condition of ("Country_Id = VMALL.a12.Country_Id"). The result goes into Spool 8 (all_amps), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 8 by row hash. The size of Spool 8 is estimated with low confidence to be 17 rows. The estimated time for this step is 0.13 seconds.

6)

We do an all-AMPs JOIN step from Spool 3 (Last Use) by way of a RowHash match scan, which is joined to Spool 8 (Last Use). Spool 3 and Spool 8 are left outer joined using a merge join, with a join condition of ("((( CASE WHEN NOT ({LeftTable}.DIST_CTR_ID IS NULL )THEN ({LeftTable}.DIST_CTR_ID) ELSE ({LeftTable}.DIST_CTR_ID) END ))= Dist_Ctr_Id) AND ((( CASE WHEN NOT ({LeftTable}.COUNTRY_ID IS NULL )THEN ({LeftTable}.COUNTRY_ID) ELSE ({LeftTable}.COUNTRY_ID) END ))= Country_Id)"). The result goes into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 43 rows. The estimated time for this step is 0.17 seconds.

7)

Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.

->

The contents of Spool 1 are sent back to the user as the result of statement 1. No rows are returned to the user as the result of statement 2. The total estimated time is 0.45 seconds.

Attachments

Library

Quick Reference

Archives of the TeradataForum

Message Posted: Tue, 29 Jul 2003 @ 19:08:05 GMT