|
|
Archives of the TeradataForum
Message Posted: Wed, 06 Mar 2002 @ 13:54:29 GMT
Subj: | | Re: What information does 'collect statistics' store? |
|
From: | | Geoffrey Rommel |
| When statistics are collected on a column does it store the actual number of rows for each individual column value, or does it store
an average number of rows per value? | |
It stores the actual number of rows for certain values (including the most frequently occurring value within each percentile), but not
for every value.
| e.g. A table has a date column which we have collected statistics on and we then load a new week's data to the table. If we
run a query on the table against the new week's data do the statistics say that there will be no rows in the table and create an execution
plan accordingly, or will it say that there is an average number of rows and create a different execution plan? | |
Neither. I have encountered this very situation. The stats will include the maximum value for the date, so the optimizer will know that
according to the stats there are no rows for your new week, but because it cannot rely completely on the stats it will almost certainly do a
full-table scan.
In our case, we had a table of over 400GB, and adding a new month's worth of data increased its size by only 1.9%, so the full-table scan
was a Bad Thing. The solution was to add "collect stats" to the monthly load job so that the stats on the date column would always be up to
date.
--wgr
| |