Home Page for the TeradataForum

Archives of the TeradataForum

Message Posted: Fri, 23 Aug 2002 @ 08:00:07 GMT

  <Prev Next>   <<First <Prev Next> Last>>  

Subj:   Re: PI and statistics
From:   Hiroshi Takenaka


After my last posting, some people gave me advices. I also did further experiments and figured out:

* If I didn't collect stats on selection columns, the optimizer seems to choose 10% of total records as default estimated # of rows selected.

(The optimizer knows the exact number of total records if I collect stats on PI, or it does random sampling if there is no stats on PI)

* If I collect stats on the selection column, the optimizer seems to choose / <# of different values in the col> instead of 10% of total records as default guess. Everytime when I selected a non-existent value, the optimizer used that value.

* Even if I select a value which actually exists in the column, somehow the optimizer occasionally falls back to the default.

In my test data, with 83 out of 143 different values the optimizer estimates # of selected rows almost accurately, so it seems to use stats on that column. But with other 60 values, it falls back to the default: / <# of different values>.

Those 60 values seem to be values with small # of records, but not the 60 values with least records. It seems that those values are choosed somewhat randomly (depending on its PI?).

I'll be happy if the optimizer will make a bit better guess than /<# of different values>. Any further info will appreciate.


Hiroshi Takenaka

  <Prev Next>   <<First <Prev Next> Last>>  
  Top Home Privacy Feedback  
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 28 Jun 2020