|
|
Archives of the TeradataForum
Message Posted: Sun, 10 Sep 2002 @ 18:14:31 GMT
Subj: | | OLAP function Spool Usage |
|
From: | | Jeremy Christiansen |
I am running some queries on a large table in our system and am noticing some seemingly bizarre spool usage. During query execution a
large amount of data is going to just a few amps, creating a "hot amp" situation. Moreover, the total amount of spool being used ranges
from a little over the size of the table being queried to three times that much (at different points during execution). I have narrowed
down the problem to the OLAP functions I am using. Here is an example of a test query that produces the same spool problems as the original
query.
sel A.*
,CSUM(1,A.COLUMN_1,A.COLUMN_2)
from TABLE_1 A
group by COLUMN_1;
Table "A" is not a view and is evenly distributed on column 1. Column 1 is the non-unique primary index. The logical key to the table
is columns 1 and 2 (there is no corresponding index, but if the table were redistributed on these columns, the data would still be perfectly
even).
Okay. Consider this: when I just select all data (no OLAP function) from the table, I get about 215 megs of data per amp (48 amps in
all). When I add the OLAP function, data gets shuffled around such that at some points (during execution) the busiest amp will have 1,000+
megs while the least busy will have close to 18.
I don't get why data is being shuffled around at all. I am grouping by the primary index, so it would seem that each amp has all the
rows it needs before the query even starts. Again, the table is already distributed on column 1 and that is the column on which I am basing
the csum function. It would seem that all each amp would have to do is sort its own data and compute the csum value. Why data is being
passed all over place, and to some places more than others, is beyond me. Also, even if for some strange reason the data was being re-
distributed on the sort columns (which I would argue is unnecessary in this case), there should still be no skew. Yet there is.
I would appreciate any explanatory and/or helpful thoughts. Also, as a side question, can anyone tell me what it means for an explain to
state "which is assumed to be redistributed by value to all AMPs".
Here is the explain:
Explanation -------------------------------------------------- | |
| 1) | First, we lock a distinct beneficiary."pseudo table" for read on a RowHash to prevent global deadlock for beneficiary.a.
| |
| 2) | Next, we lock beneficiary.a for read.
| |
| 3) | We do an all-AMPs RETRIEVE step from beneficiary.a by way of an all-rows scan with no residual conditions into Spool 2, which is
built locally on the AMPs. The input table will not be cached in memory, but it is eligible for synchronized scanning. The result spool
file will not be cached in memory. The size of Spool 2 is estimated with high confidence to be 77,173,397 rows. The estimated time for
this step is 20 minutes and 10 seconds.
| |
| 4) | We do an all-AMPs STAT FUNCTION step from Spool 2 (Last Use) by way of an all-rows scan into Spool 5 (Last Use), which is assumed to
be redistributed by value to all AMPs. The result rows are put into Spool 1, which is built locally on the AMPs.
| |
| 5) | Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
| |
| -> | The contents of Spool 1 are sent back to the user as the result of statement 1.
| |
| |