|
|
Archives of the TeradataForum
Message Posted: Fri, 29 Jun 2007 @ 23:09:49 GMT
Subj: | | How to get confidence on derived spool |
|
From: | | Kopuuravuri, Ramu |
Hi,
I have a major volume of data in history (DBviews.stmnt_sub) tables with partition on date range.
I have a scenario to take 3 partitions data into 2 different spools. And do the left outer join between these two spools.
I'm getting no confidence on Join spool. Pls provide best practice solution to active this prob.
Query :
sel a.id, a.cyc_dt, b.bill_yy_mm, date_rank, b.curr_ind
from DBviews.stmnt_sub a
left join
(
sel id, bill_yy_mm
, curr_ind
, RANK() OVER (partition by id ORDER BY bill_yy_mm desc) as date_rank
from
(
sel id, substr(cyc_dt,1,7) as bill_yy_mm, max(curr_ind) as curr_ind
from DBviews.stmnt_sub
where confirm_ind='Y'
and cyc_dt between cast('2007-03-01' as DATE FORMAT 'YYYY-MM-DD')
and cast('2007-05-31' as DATE FORMAT 'YYYY-MM-DD')
and id > 0
group by 1,2
) a
QUALIFY SUM(1) Over(
PARTITION BY id, bill_yy_mm
ORDER BY bill_yy_mm
ROWS UNBOUNDED PRECEDING
) = 1
) b
on a.id=b.id
and a.cyc_dt between cast('2007-03-01' as DATE FORMAT 'YYYY-MM-DD')
and cast('2007-05-31' as DATE FORMAT 'YYYY-MM-DD')
and a.id > 0
where
a.cyc_dt between ADD_MONTHS(a.cyc_dt, -4) and current_date
and a.id > 0
and substr(a.cyc_dt,1,7)=b.bill_yy_mm
Explanation
Explanation -------------------------------------------------- | |
| 1) | First, we lock DB.stmnt_sub for access.
| |
| 2) | Next, we do an all-AMPs SUM step to aggregate from 3 partitions of DB.stmnt_sub with a condition of ( "(DB.stmnt_sub.cyc_dt >= DATE '2007-
03-01') AND ((DB.stmnt_sub.cyc_dt <= DATE '2007-05-31') AND ((DB.stmnt_sub.confirm_ind = 'Y') AND (DB.stmnt_sub.id > 0 )))"), and the grouping
identifier in field 1. Aggregate Intermediate Results are computed locally, then placed in Spool 4. The input table will not be cached in
memory, but it is eligible for synchronized scanning. The size of Spool 4 is estimated with low confidence to be 135,646,877 rows. The estimated
time for this step is 5.74 seconds.
| |
| 3) | We do an all-AMPs RETRIEVE step from Spool 4 (Last Use) by way of an all-rows scan into Spool 1 (all_amps), which is built locally on the
AMPs. The size of Spool 1 is estimated with low confidence to be 135,646,877 rows. The estimated time for this step is 1.72 seconds.
| |
| 4) | We do an all-AMPs STAT FUNCTION step from Spool 1 (Last Use) by way of an all-rows scan into Spool 9 (Last Use), which is assumed to be
redistributed by value to all AMPs. The result rows are put into Spool 7 (all_amps), which is built locally on the AMPs.
| |
| 5) | We do an all-AMPs STAT FUNCTION step from Spool 7 (Last Use) by way of an all-rows scan into Spool 12 (Last Use), which is assumed to be
redistributed by value to all AMPs. The result rows are put into Spool 11 (all_amps), which is built locally on the AMPs.
| |
| 6) | We execute the following steps in parallel.
| |
| |
| 1) | We do an all-AMPs RETRIEVE step from Spool 11 (Last Use) by way of an all-rows scan with a condition of ("Field_6 = 1") into Spool 2
(all_amps), which is built locally on the AMPs. The size of Spool 2 is estimated with no confidence to be 135,646,877 rows. The estimated time
for this step is 1.82 seconds.
| | |
| |
| 2) | We do an all-AMPs RETRIEVE step from 3 partitions of DB.stmnt_sub with a condition of ( "(DB.stmnt_sub.cyc_dt >=
(ADD_MONTHS((DB.stmnt_sub.cyc_dt ),-4 ))) AND ((DB.stmnt_sub.cyc_dt <= DATE) AND ((DB.stmnt_sub.id > 0) AND ((DB.stmnt_sub.cyc_dt >= DATE '2007-
03-01') AND ((DB.stmnt_sub.cyc_dt <= DATE '2007-05-31') AND (DB.stmnt_sub.id > 0 )))))") locking for access into Spool 16 (all_amps), which is
built locally on the AMPs. The size of Spool 16 is estimated with no confidence to be 105,109,185 rows. The estimated time for this step is 2.90
seconds.
| | |
| 7) | We do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by way of an all-rows scan into Spool 17 (all_amps), which is redistributed by hash
code to all AMPs. Then we do a SORT to order Spool 17 by row hash. The size of Spool 17 is estimated with no confidence to be 135,646,877 rows.
The estimated time for this step is 6.16 seconds.
| |
| 8) | We do an all-AMPs JOIN step from Spool 16 (Last Use) by way of a RowHash match scan, which is joined to Spool 17 (Last Use) by way of a
RowHash match scan. Spool 16 and Spool 17 are joined using a merge join, with a join condition of ("(id = id) AND ((SUBSTR(cyc_dt (VARCHAR(10),
CHARACTER SET LATIN, NOT CASESPECIFIC, FORMAT 'YYYY-MM-DD'),1 ,7 ))= BILL_YY_MM)"). The result goes into Spool 15 (group_amps), which is built
locally on the AMPs. The size of Spool 15 is estimated with no confidence to be 135,646,877 rows. The estimated time for this step is 2.54
seconds.
| |
| 9) | Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
| |
| -> | The contents of Spool 15 are sent back to the user as the result of statement 1.
| |
Step 6 have no confidence, how to improve confidence level to at least low.
Appreciate for your suggestions.
Regards,
Ramu.
| |