Re: Avoiding an all-row scan [From mjrodriguezp: Tue, 23 Feb 2010 @ 09:13 GMT]

Message Posted: Tue, 23 Feb 2010 @ 09:13:24 GMT

Subj:

Re: Avoiding an all-row scan

From:

mjrodriguezp

Thanks a lot!!

That was it. Very usefull.

I had de collects on the biggest table but no in the smallest, so I did, and now one of the querys works fine.

HELP STATS BDD_TABLAS.FIN_BORRAR_1; HELP STATS BDP_TABLAS.PCL_HIS_CONTRATOS;

With this done, the inner join querys works in a short time (not the other one but thatB4s fine with me):

--15 s SELECT * FROM BDP_tABLAS.PCL_HIS_CONTRATOS a INNER JOIN BDD_TABLAS.FIN_BORRAR_1 B ON (A.fec_vigencia = B.fec_vigencia ); --Too long SELECT * FROM BDP_tABLAS.PCL_HIS_CONTRATOS a WHERE fec_vigencia IN (SELECT fec_vigencia FROM bdd_tablas.fin_borrar_1) ; Also, I tried with the view instead of the table, and worked fine too: CREATE VIEW FIN_BORRAR_2 AS SELECT DATE '2009-12-31' AS fec_vigencia; --15 s SELECT * FROM BDP_tABLAS.PCL_HIS_CONTRATOS a INNER JOIN BDD_TABLAS.FIN_BORRAR_2 B ON (A.fec_vigencia = B.fec_vigencia );

If you want to have a look to the explains, here they are:

SELECT * FROM BDP_tABLAS.PCL_HIS_CONTRATOS a INNER JOIN BDD_TABLAS.FIN_BORRAR_1 B ON (A.fec_vigencia = B.fec_vigencia ); Explanation 1) First, we lock a distinct BDP_tABLAS."pseudo table" for read on a RowHash to prevent global deadlock for BDP_tABLAS.a. 2) Next, we lock a distinct BDD_TABLAS."pseudo table" for read on a RowHash to prevent global deadlock for BDD_TABLAS.B. 3) We lock BDP_tABLAS.a for read, and we lock BDD_TABLAS.B for read. 4) We do an all-AMPs RETRIEVE step from BDD_TABLAS.B by way of an all-rows scan with a condition of ("NOT (BDD_TABLAS.B.fec_vigencia IS NULL)") into Spool 2 (all_amps) (compressed columns allowed), which is duplicated on all AMPs. Then we do a SORT to partition by rowkey. The size of Spool 2 is estimated with high confidence to be 28 rows (1,372 bytes). The estimated time for this step is 0.01 seconds. 5) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an all-rows scan, which is joined to BDP_tABLAS.a. Spool 2 and BDP_tABLAS.a are joined using a product join, with a join condition of ("BDP_tABLAS.a.fec_vigencia = fec_vigencia") enhanced by dynamic partition elimination. The input table BDP_tABLAS.a will not be cached in memory, but it is eligible for synchronized scanning. The result goes into Spool 1 (group_amps), which is built locally on the AMPs. The result spool file will not be cached in memory. The size of Spool 1 is estimated with low confidence to be 5,634,724 rows (1,831,285,300 bytes). The estimated time for this step is 6.05 seconds. 6) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request. -> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 6.06 seconds. SELECT * FROM BDP_tABLAS.PCL_HIS_CONTRATOS a WHERE fec_vigencia IN (SELECT fec_vigencia FROM bdd_tablas.fin_borrar_1) Explanation 1) First, we lock a distinct BDP_tABLAS."pseudo table" for read on a RowHash to prevent global deadlock for BDP_tABLAS.a. 2) Next, we lock a distinct bdd_tablas."pseudo table" for read on a RowHash to prevent global deadlock for bdd_tablas.fin_borrar_1. 3) We lock BDP_tABLAS.a for read, and we lock bdd_tablas.fin_borrar_1 for read. 4) We execute the following steps in parallel. 1) We do an all-AMPs RETRIEVE step from BDP_tABLAS.a by way of an all-rows scan with no residual conditions into Spool 2 (all_amps) (compressed columns allowed), which is built locally on the AMPs. Then we do a SORT to order Spool 2 by the hash code of (BDP_tABLAS.a.fec_vigencia). The input table will not be cached in memory, but it is eligible for synchronized scanning. The result spool file will not be cached in memory. The size of Spool 2 is estimated with high confidence to be 445,143,188 rows (125,085,235,828 bytes). The estimated time for this step is 31 minutes and 40 seconds. 2) We do an all-AMPs RETRIEVE step from bdd_tablas.fin_borrar_1 by way of an all-rows scan with no residual conditions into Spool 4 (all_amps), which is built locally on the AMPs. Then we do a SORT to order Spool 4 by the sort key in spool field1 (bdd_tablas.fin_borrar_1.fec_vigencia) eliminating duplicate rows. The size of Spool 4 is estimated with high confidence to be 1 row (25 bytes). The estimated time for this step is 0.01 seconds. 5) We do an all-AMPs RETRIEVE step from Spool 4 (Last Use) by way of an all-rows scan into Spool 3 (all_amps) (compressed columns allowed), which is duplicated on all AMPs. Then we do a SORT to order Spool 3 by the hash code of ( bdd_tablas.fin_borrar_1.fec_vigencia). The size of Spool 3 is estimated with high confidence to be 28 rows (700 bytes). 6) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an all-rows scan, which is joined to Spool 3 (Last Use) by way of an all-rows scan. Spool 2 and Spool 3 are joined using an inclusion merge join, with a join condition of ("fec_vigencia = fec_vigencia"). The result goes into Spool 1 (group_amps), which is built locally on the AMPs. The result spool file will not be cached in memory. The size of Spool 1 is estimated with low confidence to be 5,299,324 rows (1,531,504,636 bytes). The estimated time for this step is 6.35 seconds. 7) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request. -> The contents of Spool 1 are sent back to the user as the result of statement 1. SELECT * FROM BDP_tABLAS.PCL_HIS_CONTRATOS a INNER JOIN BDD_TABLAS.FIN_BORRAR_2 B ON (A.fec_vigencia = B.fec_vigencia ); Explanation 1) First, we lock a distinct BDP_tABLAS."pseudo table" for read on a RowHash to prevent global deadlock for BDP_tABLAS.a. 2) Next, we lock BDP_tABLAS.a for read. 3) We do an all-AMPs RETRIEVE step from a single partition of BDP_tABLAS.a with a condition of ("BDP_tABLAS.A.fec_vigencia = DATE '2009-12-31'") with a residual condition of ( "BDP_tABLAS.A.fec_vigencia = DATE '2009-12-31'") into Spool 2 (group_amps), which is built locally on the AMPs. The result spool file will not be cached in memory. The size of Spool 2 is estimated with high confidence to be 5,909,692 rows ( 1,731,539,756 bytes). The estimated time for this step is 4.88 seconds. 4) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request. -> The contents of Spool 2 are sent back to the user as the result of statement 1. The total estimated time is 4.88 seconds.

Attachments

Library

Quick Reference

Archives of the TeradataForum

Message Posted: Tue, 23 Feb 2010 @ 09:13:24 GMT