Re: Too much time spent on aggregate function [From Simard Rudel: Mon, 23 Feb 2009 @ 15:20 GMT]

Message Posted: Mon, 23 Feb 2009 @ 15:20:26 GMT

Subj:

Re: Too much time spent on aggregate function

From:

Simard Rudel

I did some text with exemple you show me and unfortunately MAX function run faster than the proposition.

Interesting to see the difference between explain plan.

I included the explain on the query and the number I/O and total CPU time from DBQL :

explain SELECT NO_VIV ,COD ,D_PER ,PER FROM TADR qualify rank() over (partition by NO_VIV, COD, D_PER order by PER)= 1; Explanation 1) First, we lock a distinct donne."pseudo table" for read on a RowHash to prevent global deadlock for donne.TADR. 2) Next, we lock donne.TADR for read. 3) We do an all-AMPs STAT FUNCTION step from donne.TADR by way of an all-rows scan with no residual conditions into Spool 5 (Last Use), which is redistributed by hash code to all AMPs. The result rows are put into Spool 1 (group_amps), which is built locally on the AMPs. 4) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request. -> The contents of Spool 1 are sent back to the user as the result of statement 1. Number I/O : 298 431 Total CPU time : 187,89 SELECT NO_VIV ,COD ,D_PER ,PER FROM TADR qualify PER = max(PER) over (partition by NO_VIV, COD, D_PER); Explanation 1) First, we lock a distinct donne."pseudo table" for read on a RowHash to prevent global deadlock for donne.TADR. 2) Next, we lock donne.TADR for read. 3) We do an all-AMPs STAT FUNCTION step from donne.TADR by way of an all-rows scan with no residual conditions into Spool 5 (Last Use), which is assumed to be redistributed by value to all AMPs. The result rows are put into Spool 1 (group_amps), which is built locally on the AMPs. 4) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request. -> The contents of Spool 1 are sent back to the user as the result of statement 1. Number I/O : 303 650,00 Total CPU time : 197,10 Explain SELECT NO_VIV ,COD ,D_PER ,MAX(PER) FROM TADR group by NO_VIV, COD, D_PER ; Explanation 1) First, we lock a distinct donne."pseudo table" for read on a RowHash to prevent global deadlock for donne.TADR. 2) Next, we lock donne.TADR for read. 3) We do an all-AMPs SUM step to aggregate from donne.TADR by way of an all-rows scan with no residual conditions, and the grouping identifier in field 1. Aggregate Intermediate Results are computed locally, then placed in Spool 3. The input table will not be cached in memory, but it is eligible for synchronized scanning. The aggregate spool file will not be cached in memory. The size of Spool 3 is estimated with low confidence to be 22,584,450 rows. The estimated time for this step is 28.42 seconds. 4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 22,584,450 rows. The estimated time for this step is 2.29 seconds. 5) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request. -> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 30.71 seconds. Number I/O : 172 351,00 Total CPU time : 148,83 303 650,00 197,10

Attachments

Library

Quick Reference

Archives of the TeradataForum

Message Posted: Mon, 23 Feb 2009 @ 15:20:26 GMT