Archives of the TeradataForum
Message Posted: Fri, 03 Jun 2005 @ 17:40:57 GMT
One warning on collecting statistics on PPI. I have seen these things happen when the PPI is a date column. I have not verified that this happens in all cases.
1) Make sure the statistics do not go stale. For dates you might have to collect after each load. Case in point. Stats are collected on PPI column Request_Date (date datatype) on 2005-06-02. No data for 2005-06-02 existed when stats were collected. Data for 2005-06- 02 was loaded early in morning on 2005-06-03. Query is run using filter of Request_date = '2005-06-02' and join to other tables. The optimizer looks at the stats and thinks no data exists for 2005-06-02. So chooses a suboptimal plan.
IF you recollect the stats on Request_dt the plan is much more optimal. I also tested the situation with no stats on Request_dt plan was more optimal. I have seen the same thing happen on non-PPI date columns.
2) using Sampled stats on PPI column request_dt caused much the same situation as in case 1. The data volume was fairly uniform across all partitions.
My recommendations for stats on PPI date columns:
1) Collect stats after each load of with new dates or 10% growth on existing date values. OR don't collect stats at all on the PPI date column.
2) don't use sampled stats on PPI date columns.
Don't take this as gospel. It has just been our experiences so far on V2R5.1 and below.
|Copyright 2016 - All Rights Reserved|
|Last Modified: 23 Jun 2019|