Archives of the TeradataForum
Message Posted: Thu, 04 Jan 2001 @ 20:26:57 GMT
| Subj: || || Union Conundrum |
| From: || || Jeremy Christiansen |
I am running a "union all" query that merges the result sets of two select statements. The query fails on insufficient spool. I am
surprised by this because the total amount of data that should be returned by the query is about half of my total spool. My total spool is
50 gig and the query should return about 27 gig. Now the first possible cause of this problem that comes to mind is a "hot amp". I checked
into this and it is not the case. I queried the peakspool in dbc.diskspace and the spool usage is more or less even. Furthermore, while
the query was running I checked the currentspool. Before the query failed, I caught it using 43 gigabytes. (Note: this is the only query
running under the userid I am logged on as.) I don't understand how it could be using 43 gigabytes. There's more, though. I ran each of
the select statements individually to see what would happen . Executing alone, each select statement ran fine and used the expected amount
of diskspace. In particular, the first select produced an answer set of 2.3 gig and the second 25 gig. I have included a copy of the
explain below. Essentially the query is very simple. The first step is to join two tables, by way of the same primary index, and then
build the result locally. The second step is to do a retrieve step from a table and then build that locally. That's it! I doubt that
anyone can help with this, given the limited amount of information I've provided. Nonetheless, I'm writing this in the hope that some
Teradata genius will point out an arcane feature of Teradata's internal machinery that might explain why this problem is occurring.
| ||1)||First, we lock a distinct SURSDB."pseudo table" for read on a RowHash to prevent global deadlock for SURSDB.Line.
| ||2)||Next, we lock a distinct SURSDB."pseudo table" for read on a RowHash to prevent global deadlock for SURSDB.Base.
| ||3)||We lock SURSDB.Line for read, and we lock SURSDB.Base for read.
| ||4)||We do an all-AMPs JOIN step from SURSDB.Base by way of a RowHash match scan with no residual conditions, which is joined to
SURSDB.Line. SURSDB.Base and SURSDB.Line are joined using a merge join, with a join condition of ("(SURSDB.Base.claimreferencenumber =
SURSDB.Line.claimreferencenumber) AND ((SURSDB.Base.claimlinenumber = SURSDB.Line.claimlinenumber) AND (SURSDB.Base.adjustmentcode =
SURSDB.Line.adjustmentcode ))"). The input table SURSDB.Base will not be cached in memory, but it is eligible for synchronized scanning.
The result goes into Spool 1, which is built locally on the AMPs. The size of Spool 1 is estimated to be 5,546,400 rows. The estimated
time for this step is 12 minutes and 40 seconds.
| ||5)||We do an all-AMPs RETRIEVE step from SURSDB.Base by way of an all-rows scan with no residual conditions into Spool 1, which is built
locally on the AMPs. The input table will not be cached in memory, but it is eligible for synchronized scanning. The result spool file
will not be cached in memory. The size of Spool 1 is estimated to be 51,657,984 rows. The estimated time for this step is 30 minutes and
| ||6)||Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
| ||->||The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0 hours and 43 minutes
and 34 seconds.