|
Archives of the TeradataForumMessage Posted: Fri, 09 Dec 2011 @ 16:26:20 GMT
Hi, The problem may not be the PI, but out of date statistics and whatever the next join step is in your query. If a join is occurring between a relatively small table or the selection criteria from the table is such that the estimated number of rows is small. The optimizer will choose to retrieve this data first. Then depending on the join condition, rather than try to distribute the resulting rows across the AMPs, the optimizer will just duplicate the data set across all AMPs to have the data available for the join result. For example, I have seen this happen with Calendar tables where the PI is date. The Calendar is then joined to several different date columns in a transaction table. The calendar table is small, is read and distributed across all AMPs. Now all the possible dates from the calendar are available on each AMP to join to the different transaction dates appearing on all AMPs due to the distribution of the PI on the transaction table. If statistics are way off for the distributed data, it might be leading the optimizer to distribute a large data set as opposed to a small data set rather than selecting an alternate solution. Hope this helps. Nathan
| ||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||
Copyright 2016 - All Rights Reserved | ||||||||||||||||||||||||||||||||||||||||||||||||
Last Modified: 15 Jun 2023 | ||||||||||||||||||||||||||||||||||||||||||||||||