|
|
Archives of the TeradataForum
Message Posted: Thu, 16 Aug 2007 @ 15:36:49 GMT
Subj: | | Only diff in Tables is PI column definition: |
|
From: | | Anomy Anom |
<-- Anonymously Posted: Thursday, August 16, 2007 09:23 -->
Three tables are identical in every respect except one: the unique PI column is defined as Decimal (9, 0) in two cases, as Integer in the
third. (NOTE: column spelling is identical.) In each case the PI is demog_key.
Does Teradata (particularly the Optimizer) treat the PI definitions as being identical? Will joins of the tables generate comparable
explains/access_paths? I thought that the DEC vs INT would create some significant differences.
I have done some testing with a simple query (below) using a fairly small table (100M rows)and have provided the Explains which are slightly
different. Note: the age_months column always contains NULL. The full table contains 56 columns. No Join Indexes.
Any experience or thoughts with this situation?
I have done some testing with approx 4M row table and found that the Perm Space Usage by AMP is identical (same number of rows and values in
columns). Note, stats have been collected on the columns present in the query(s) and are identical.
Query 1, demog_key (the PI column) is defined as Integer and Decimal(9,0) respectively.
locking row for access
select aa.demog_key, aa.age_months, bb.demog_key, bb.age_months
from d_demog_xxxint aa
,d_demog_xxxdec9 bb
where aa.demog_key = bb.demog_key
and aa.demog_key < 1000
;
Explanation
Explanation -------------------------------------------------- | |
| 1) | First, we lock db01.bb for access, and we lock db01.aa for access.
| |
| 2) | Next, we do an all-AMPs RETRIEVE step from db01.aa by way of an all-rows scan with a condition of ("db01.aa.DEMOG_KEY < 1000") into Spool 2
(all_amps) (compressed columns allowed), which is built locally on the AMPs. The size of Spool 2 is estimated with low confidence to be 1,003
rows. The estimated time for this step is 4.65 seconds.
| |
| 3) | We do an all-AMPs JOIN step from db01.bb by way of a RowHash match scan with no residual conditions, which is joined to Spool 2 (Last Use)
by way of a RowHash match scan. db01.bb and Spool 2 are joined using a merge join, with a join condition of ("DEMOG_KEY = db01.bb.DEMOG_KEY").
The input table db01.bb will not be cached in memory, but it is eligible for synchronized scanning. The result goes into Spool 1 (group_amps),
which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 1,003 rows. The estimated time for this step is
0.17 seconds.
| |
| 4) | Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
| |
| -> | The contents of Spool 1 are sent back to the user as the result of statement 1. No rows are returned to the user as the result of
statement 2. The total estimated time is 4.82 seconds.
| |
Query 2, both versions of table have demog_key (single column PI) defined as Decimal (9, 0)
locking row for access
select aa.demog_key, aa.age_months, bb.demog_key, bb.age_months
from d_demog aa /**** demog_key is
Decimal (9, 0) ****/
,d_demog_xxxdec9 bb
where aa.demog_key = bb.demog_key
and aa.demog_key < 1000
;
Explanation
Explanation -------------------------------------------------- | |
| 1) | First, we lock db01.bb for access, and we lock db01.aa for access.
| |
| 2) | Next, we do an all-AMPs RETRIEVE step from db01.aa by way of an all-rows scan with a condition of ("db01.aa.DEMOG_KEY < 1000.") into Spool
2 (all_amps) (compressed columns allowed), which is built locally on the AMPs. The size of Spool 2 is estimated with low confidence to be 18,282
rows. The estimated time for this step is 4.66 seconds.
| |
| 3) | We do an all-AMPs JOIN step from db01.bb by way of a RowHash match scan with a condition of ("db01.bb.DEMOG_KEY <1000."), which is joined
to Spool 2 (Last Use) by way of a RowHash match scan. db01.bb and Spool 2 are joined using a merge join, with a join condition of ("DEMOG_KEY =
db01.bb.DEMOG_KEY"). The input table db01.bb will not be cached in memory, but it is eligible for synchronized scanning. The result goes into
Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 4,640 rows. The estimated
time for this step is 2.98 seconds.
| |
| 4) | Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
| |
| -> | The contents of Spool 1 are sent back to the user as the result of statement 1. No rows are returned to the user as the result of
statement 2. The total estimated time is 7.64 seconds.
| |
Query 3, demog_key is defined as Integer and Decimal(9, 0) respectively.
locking row for access
select aa.demog_key, aa.age_months, bb.demog_key, bb.age_months
from d_demog_xxxint aa
,d_demog bb /*** Decimal 9, 0 ) ***/
where aa.demog_key = bb.demog_key
and aa.demog_key < 1000
Explanation
Explanation -------------------------------------------------- | |
| 1) | First, we lock DB01.aa for access, and we lock DB01.bb for access.
| |
| 2) | Next, we do an all-AMPs RETRIEVE step from DB01.aa by way of an all-rows scan with a condition of ("DB01.aa.DEMOG_KEY < 1000") into Spool 2
(all_amps) (compressed columns allowed), which is built locally on the AMPs. The size of Spool 2 is estimated with low confidence to be 1,003
rows. The estimated time for this step is 4.65 seconds.
| |
| 3) | We do an all-AMPs JOIN step from DB01.bb by way of a RowHash match scan with no residual conditions, which is joined to Spool 2 (Last Use)
by way of a RowHash match scan. DB01.bb and Spool 2 are joined using a merge join, with a join condition of ("DEMOG_KEY = DB01.bb.DEMOG_KEY").
The input table DB01.bb will not be cached in memory, but it is eligible for synchronized scanning. The result goes into Spool 1 (group_amps),
which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 1,003 rows. The estimated time for this step is
0.17 seconds.
| |
| 4) | Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
| |
| -> | The contents of Spool 1 are sent back to the user as the result of statement 1. No rows are returned to the user as the result of
statement 2. The total estimated time is 4.82 seconds.
| |
Note: performing this query via self-joins of the tables generates one of the plans above (expected).
All versions of the table hold 3,648,866 rows and have identical space usage (by total and by amp).
I expected Query 1 and 3 to have identical explains, query 2 to have something different.
As always, thanks for sharing any insights.
Thanks
| |