Archives of the TeradataForum
Message Posted: Fri, 24 Jan 2003 @ 13:47:02 GMT
Working on a 10 Node, 110 AMP 5150 system. I recently ran a query which involved the following steps:
... 4) We do an all-AMPs RETRIEVE step from XEIWTMPT.DIMALLC by way of an all-rows scan with no residual conditions into Spool 3 fanned out into 5 hash join partitions, which is built locally on the AMPs. The size of Spool 3 is estimated with high confidence to be 3,493,605 rows. The estimated time for this step is 23.15 seconds. 5) We do an all-AMPs RETRIEVE step from XEIWTMPT.ELEMSMRY by way of an all-rows scan with no residual conditions into Spool 4 fanned out into 5 hash join partitions, which is duplicated on all AMPs. The input table will not be cached in memory, but it is eligible for synchronized scanning. The result spool file will not be cached in memory. The size of Spool 4 is estimated with high confidence to be 118,494,640 rows. The estimated time for this step is 2 hours and 4 minutes. 6) We do an all-AMPs JOIN step from Spool 3 (Last Use) by way of an all-rows scan, which is joined to Spool 4 (Last Use). Spool 3 and Spool 4 are joined using a hash join of 5 partitions, with a join condition of ("WBS_ELEM_CD = WBS_ELEM_CD"). The result goes into Spool 2, which is built locally on the AMPs. The result spool file will not be cached in memory. The size of Spool 2 is estimated with low confidence to be 141,273,890 rows. The estimated time for this step is 40 minutes and 48 seconds. ...
In summary, I'm joining 2 tables together on a column called 'Wbs_Elem_Cd.' With a little bit of prodding, I've coercing the Optimizer to duplicate my ELEM_SMRY (1,077,224 records) table to all AMPs via 5 hash join paritions. My other table ,DIM_ALLC (3,493,605 records), is fanned into 5 hash join partitions AMP locally. The resulting 2 spool files are joined together via a hash join of 5 partitions.
I know that this final join should be almost perfectly distrubted (based upon results of queries I executed testing the distribution of the respective Primary Keys and join columns.)
However, as I watch this query, I noticed that 11 of the 110 amps have a spool file ~2x the size of all the other AMPs' spool files. Is this a result of the 'Hash Join of 5 partitions.' If so, this seems like a serious problem with the way hash joins are executed.
Any help would be greatly appreciated.
|Copyright 2016 - All Rights Reserved|
|Last Modified: 28 Jun 2020|