Re: V2R5.1 dynamic partition elimination [From Wayne Schachtel: Thu, 07 Apr 2005 @ 13:53 GMT]

Message Posted: Thu, 07 Apr 2005 @ 13:53:10 GMT

Subj:

Re: V2R5.1 dynamic partition elimination

From:

Wayne Schachtel

I was lead to believe in the software release below - dynamic partition elimination would work. It appears partitioned values in a predicate still need to be explicitly passed. Can anyone shed any light on this topic?

InfoKey InfoData RELEASE V2R.05.01.02.03 VERSION 05.01.02.05 CREATE SET TABLE prdpos.pos_txn ,NO FALLBACK , NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT ( txn_id INTEGER, str_nbr INTEGER, per_id INTEGER, register_nbr INTEGER, txn_type BYTEINT, txn_nbr INTEGER) PRIMARY INDEX ( txn_id ) PARTITION BY RANGE_N(per_id BETWEEN 1 AND 365 EACH 1 ); CREATE SET TABLE prdpos.dim_time ,NO FALLBACK , NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT ( per_id INTEGER, cal_date DATE FORMAT 'YYYY-MM-DD') UNIQUE PRIMARY INDEX ( per_id ); EXAMPLES 1 - 3 : DYNAMIC PARTITION ELIMINATION NOT WORKING example 1 select a.cal_date, b.txn_id, b.str_nbr, b.register_nbr, b.txn_type, b.txn_nbr from dim_time a join pos_txn b on a.per_id = b.per_id where a.cal_date between '2005-04-03' and '2005-04-04' ; Explanation 1) First, we lock a distinct PRDPOS."pseudo table" for read on a RowHash to prevent global deadlock for PRDPOS.b. 2) Next, we lock a distinct PRDPOS."pseudo table" for read on a RowHash to prevent global deadlock for PRDPOS.a. 3) We lock PRDPOS.b for read, and we lock PRDPOS.a for read. 4) We do an all-AMPs RETRIEVE step from all partitions of PRDPOS.b with a condition of ("NOT (PRDPOS.b.per_id IS NULL)") into Spool 2 (all_amps), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 2 by row hash. The size of Spool 2 is estimated with high confidence to be 28 rows. The estimated time for this step is 0.01 seconds. 5) We do an all-AMPs JOIN step from PRDPOS.a by way of a RowHash match scan with a condition of ("(PRDPOS.a.cal_date <= DATE '2005-04-04') AND (PRDPOS.a.cal_date >= DATE '2005-04-03')"), which is joined to Spool 2 (Last Use) by way of a RowHash match scan. PRDPOS.a and Spool 2 are joined using a merge join, with a join condition of ("PRDPOS.a.per_id = per_id"). The result goes into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 10 rows. The estimated time for this step is 0.03 seconds. 6) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request. -> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.04 seconds. example 2 select a.cal_date, b.txn_id, b.str_nbr, b.register_nbr, b.txn_type, b.txn_nbr from dim_time a join pos_txn b on a.per_id = b.per_id where b.per_id in (select per_id from dim_time where cal_date between '2005-04-03' and '2005-04-04' ) Explanation 1) First, we lock a distinct PRDPOS."pseudo table" for read on a RowHash to prevent global deadlock for PRDPOS.b. 2) Next, we lock a distinct PRDPOS."pseudo table" for read on a RowHash to prevent global deadlock for PRDPOS.ws_dim_time. 3) We lock PRDPOS.b for read, and we lock PRDPOS.ws_dim_time for read. 4) We execute the following steps in parallel. 1) We do an all-AMPs RETRIEVE step from all partitions of PRDPOS.b with a condition of ("NOT (PRDPOS.b.per_id IS NULL)") into Spool 2 (all_amps), which is built locally on the AMPs. Then we do a SORT to order Spool 2 by row hash. The size of Spool 2 is estimated with high confidence to be 28 rows. The estimated time for this step is 0.01 seconds. 2) We do an all-AMPs RETRIEVE step from PRDPOS.ws_dim_time by way of an all-rows scan with a condition of ( "(PRDPOS.ws_dim_time.cal_date <= DATE '2005-04-04') AND (PRDPOS.ws_dim_time.cal_date >= DATE '2005-04-03')") into Spool 3 (all_amps), which is duplicated on all AMPs. Then we do a SORT to order Spool 3 by row hash. The size of Spool 3 is estimated with no confidence to be 28 rows. The estimated time for this step is 0.01 seconds. 5) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an all-rows scan, which is joined to Spool 3 (Last Use) by way of an all-rows scan. Spool 2 and Spool 3 are joined using an inclusion merge join, with a join condition of ("per_id = per_id"). The result goes into Spool 4 (all_amps), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 4 by row hash. The size of Spool 4 is estimated with index join confidence to be 10 rows. The estimated time for this step is 0.01 seconds. 6) We do an all-AMPs JOIN step from PRDPOS.a by way of a RowHash match scan with no residual conditions, which is joined to Spool 4 (Last Use) by way of a RowHash match scan. PRDPOS.a and Spool 4 are joined using a merge join, with a join condition of ( "PRDPOS.a.per_id = per_id"). The result goes into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with index join confidence to be 10 rows. The estimated time for this step is 0.03 seconds. 7) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request. -> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.04 seconds. example 3 select a.cal_date, b.txn_id, b.str_nbr, b.register_nbr, b.txn_type, b.txn_nbr from dim_time a join pos_txn b on a.per_id = b.per_id where b.per_id >= (select per_id from dim_time where cal_date = '2005-04-04') and b.per_id <= (select per_id from dim_time where cal_date = '2005-04-05'); Explanation 1) First, we lock a distinct PRDPOS."pseudo table" for read on a RowHash to prevent global deadlock for PRDPOS.b. 2) Next, we lock a distinct PRDPOS."pseudo table" for read on a RowHash to prevent global deadlock for PRDPOS.ws_dim_time. 3) We lock PRDPOS.b for read, and we lock PRDPOS.ws_dim_time for read. 4) We do an all-AMPs RETRIEVE step from PRDPOS.ws_dim_time by way of an all-rows scan with a condition of ( "PRDPOS.ws_dim_time.cal_date = DATE '2005-04-04'") into Spool 2 (all_amps), which is built locally on the AMPs. The size of Spool 2 is estimated with no confidence to be 1 row. The estimated time for this step is 0.03 seconds. 5) We do an all-AMPs SUM step to aggregate from Spool 2 by way of an all-rows scan. Aggregate Intermediate Results are computed globally, then placed in Spool 3. 6) We execute the following steps in parallel. 1) We do an all-AMPs ABORT test from Spool 3 (Last Use) by way of an all-rows scan with a condition of ("Field_2 > 1"). 2) We do an all-AMPs RETRIEVE step from PRDPOS.ws_dim_time by way of an all-rows scan with a condition of ( "PRDPOS.ws_dim_time.cal_date = DATE '2005-04-05'") into Spool 5 (all_amps), which is built locally on the AMPs. The size of Spool 5 is estimated with no confidence to be 1 row. The estimated time for this step is 0.03 seconds. 7) We do an all-AMPs SUM step to aggregate from Spool 5 by way of an all-rows scan. Aggregate Intermediate Results are computed globally, then placed in Spool 6. 8) We execute the following steps in parallel. 1) We do an all-AMPs ABORT test from Spool 6 (Last Use) by way of an all-rows scan with a condition of ("Field_2 > 1"). 2) We do an all-AMPs RETRIEVE step from Spool 5 (Last Use) by way of an all-rows scan into Spool 8 (all_amps), which is duplicated on all AMPs. The size of Spool 8 is estimated with high confidence to be 28 rows. The estimated time for this step is 0.03 seconds. 9) We do an all-AMPs JOIN step from Spool 8 (Last Use) by way of an all-rows scan, which is joined to all partitions of PRDPOS.b with a condition of ("NOT (PRDPOS.b.per_id IS NULL)"). Spool 8 and PRDPOS.b are joined using a product join, with a join condition of ("PRDPOS.b.per_id <= per_id"). The result goes into Spool 9 (all_amps), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 9 by row hash. The size of Spool 9 is estimated with no confidence to be 9 rows. The estimated time for this step is 0.02 seconds. 10) We do an all-AMPs JOIN step from PRDPOS.a by way of a RowHash match scan with no residual conditions, which is joined to Spool 9 (Last Use) by way of a RowHash match scan. PRDPOS.a and Spool 9 are joined using a merge join, with a join condition of ( "PRDPOS.a.per_id = per_id"). The result goes into Spool 10 (all_amps), which is duplicated on all AMPs. The size of Spool 10 is estimated with no confidence to be 280 rows. The estimated time for this step is 0.03 seconds. 11) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an all-rows scan, which is joined to Spool 10 (Last Use) by way of an all-rows scan. Spool 2 and Spool 10 are joined using a product join, with a join condition of ("per_id >= per_id"). The result goes into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 3 rows. The estimated time for this step is 0.04 seconds. 12) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request. -> The contents of Spool 1 are sent back to the user as the result of statement 1. EXAMPLE 4 - PARTITION ELIMINATION DOES WORK select a.cal_date, b.txn_id, b.str_nbr, b.register_nbr, b.txn_type, b.txn_nbr from dim_time a join pos_txn b on a.per_id = b.per_id where b.per_id between 1 and 3 ; Explanation 1) First, we lock a distinct PRDPOS."pseudo table" for read on a RowHash to prevent global deadlock for PRDPOS.b. 2) Next, we lock a distinct PRDPOS."pseudo table" for read on a RowHash to prevent global deadlock for PRDPOS.a. 3) We lock PRDPOS.b for read, and we lock PRDPOS.a for read. 4) We do an all-AMPs RETRIEVE step from 3 partitions of PRDPOS.b with a condition of ("(NOT (PRDPOS.b.per_id IS NULL )) AND ((PRDPOS.b.per_id <= 3) AND (PRDPOS.b.per_id >= 1 ))") into Spool 2 (all_amps), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 2 by row hash. The size of Spool 2 is estimated with high confidence to be 28 rows. The estimated time for this step is 0.01 seconds. 5) We do an all-AMPs JOIN step from PRDPOS.a by way of a RowHash match scan with a condition of ("(PRDPOS.a.per_id >= 1) AND (PRDPOS.a.per_id <= 3)"), which is joined to Spool 2 (Last Use) by way of a RowHash match scan. PRDPOS.a and Spool 2 are joined using a merge join, with a join condition of ("PRDPOS.a.per_id = per_id"). The result goes into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 32 rows. The estimated time for this step is 0.03 seconds. 6) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request. -> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.04 seconds.

Wayne Schachtel
Wayne Schachtel Consulting

Attachments

Library

Quick Reference

Archives of the TeradataForum

Message Posted: Thu, 07 Apr 2005 @ 13:53:10 GMT