Archives of the TeradataForum
Message Posted: Wed, 20 Aug 2002 @ 17:33:45 GMT
I have a question on PI and statistics.
From a small table 'orig_tbl' (ca. 700,000 rows) with 3 columns col1, col2 and col3, I created two new tables with different PIs as follows:
create table tblA as (select * from orig_tbl) with data primary index (col1);
collect statistics on tblA column col3;
create table tblB as (select * from orig_tbl) with data primary index (col2);
collect statistics on tblB column col3;
Then I did:
explain select * from tblA where col3 = valueX;
My Teradata V2R4.1 system said:
[snip] 3) We do an all-AMPs RETRIEVE step from tblA by way of an all-rows scan with a condition of (tblA.col3 = valueX) into Spool 1, which is built locally on the AMPs. The size of Spool 1 is estimated with high confidence to be 1 row. [snip]
But when I ran the same EXPLAIN on tblB, it estimated the size of Spool 1 to be 4,855 rows.
The difference may come from the fact that col2 has 700,000 different values (almost unique) while col1 has only 150 different values. But I can't imagine what is actually happening.
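For reference, the distinct-value counts quoted above, and the statistics the optimizer holds for each copy, could be checked with something along these lines. This is only a sketch using the table and column names from this post; the exact layout of the HELP STATISTICS output is not shown here because it varies by release.

```sql
-- Count the distinct values in the two candidate PI columns of the source table.
select count(distinct col1) as col1_values,
       count(distinct col2) as col2_values
from orig_tbl;

-- List the statistics collected on each copy (both should show col3).
help statistics tblA;
help statistics tblB;
```

If both tables report the same statistics on col3, the differing estimates would have to come from something other than the collected column statistics themselves.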
Could someone please explain why?
|Last Modified: 28 Jun 2020|