|
|
Archives of the TeradataForum
Message Posted: Mon, 23 Feb 2009 @ 15:20:26 GMT
Subj: | | Re: Too much time spent on aggregate function |
|
From: | | Simard Rudel |
I did some text with exemple you show me and unfortunately MAX function run
faster than the proposition.
Interesting to see the difference between explain plan.
I included the explain on the query and the number I/O and total CPU time from DBQL :
explain
SELECT NO_VIV
,COD
,D_PER
,PER
FROM TADR
qualify rank() over
(partition by NO_VIV, COD, D_PER
order by PER)= 1;
Explanation
1) First, we lock a distinct donne."pseudo table" for read on a
RowHash to prevent global deadlock for donne.TADR.
2) Next, we lock donne.TADR for read.
3) We do an all-AMPs STAT FUNCTION step from donne.TADR by way of
an all-rows scan with no residual conditions into Spool 5 (Last
Use), which is redistributed by hash code to all AMPs. The result
rows are put into Spool 1 (group_amps), which is built locally on
the AMPs.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1.
Number I/O : 298 431
Total CPU time : 187,89
SELECT NO_VIV
,COD
,D_PER
,PER
FROM TADR
qualify PER = max(PER) over
(partition by NO_VIV, COD, D_PER);
Explanation
1) First, we lock a distinct donne."pseudo table" for read on a
RowHash to prevent global deadlock for donne.TADR.
2) Next, we lock donne.TADR for read.
3) We do an all-AMPs STAT FUNCTION step from donne.TADR by way of
an all-rows scan with no residual conditions into Spool 5 (Last
Use), which is assumed to be redistributed by value to all AMPs.
The result rows are put into Spool 1 (group_amps), which is built
locally on the AMPs.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1.
Number I/O : 303 650,00
Total CPU time : 197,10
Explain
SELECT NO_VIV
,COD
,D_PER
,MAX(PER)
FROM TADR
group by NO_VIV, COD, D_PER ;
Explanation
1) First, we lock a distinct donne."pseudo table" for read on a
RowHash to prevent global deadlock for donne.TADR.
2) Next, we lock donne.TADR for read.
3) We do an all-AMPs SUM step to aggregate from donne.TADR by way
of an all-rows scan with no residual conditions, and the grouping
identifier in field 1. Aggregate Intermediate Results are
computed locally, then placed in Spool 3. The input table will
not be cached in memory, but it is eligible for synchronized
scanning. The aggregate spool file will not be cached in memory.
The size of Spool 3 is estimated with low confidence to be
22,584,450 rows. The estimated time for this step is 28.42
seconds.
4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
an all-rows scan into Spool 1 (group_amps), which is built locally
on the AMPs. The size of Spool 1 is estimated with low confidence
to be 22,584,450 rows. The estimated time for this step is 2.29
seconds.
5) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 30.71 seconds.
Number I/O : 172 351,00
Total CPU time : 148,83
303 650,00
197,10
| |