Archives of the TeradataForum
Message Posted: Fri, 12 Nov 2010 @ 23:13:44 GMT
Here is another noob question. The concept of "Partitioned Primary Indexes" (PPI) conflicts with my general understanding of indexing, specifically the claim that the advantage of a PPI over a PI (i.e. NPPI) is that it avoids full table scans.
My understanding of indexing in general is that some kind of (B-)tree is involved: once a field is indexed, searching for a record descends the tree from top to bottom. Now,
For tables with a PPI, Teradata utilizes a 3-level partitioning scheme to distribute and later locate the data. The 3 levels are:
1. Rows are distributed across all AMPs (and accessed via the Primary Index) based upon DSW portion of the Row Hash.
2. At the AMP level, rows are first ordered by their partition number.
3. Within the partition, data rows are logically stored in Row ID sequence.
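As a rough illustration of the three levels above, a toy Python model might look like the following. This is only a sketch under stated assumptions: the CRC32 hash, the modulo AMP mapping, the four-AMP system, and the monthly partitioning expression are all made up for the example and are not Teradata's actual internals, and the uniqueness part of the Row ID is replaced by the order id as a tie-breaker.

```python
import zlib
from collections import defaultdict
from datetime import date

NUM_AMPS = 4  # assumption: a tiny system, for illustration only


def row_hash(pi_value):
    # Stand-in hash (CRC32). Teradata's real hashing algorithm is different.
    return zlib.crc32(str(pi_value).encode())


def amp_for(pi_value):
    # Level 1: the row hash decides which AMP owns the row.
    # A plain modulo is a simplification of the real hash-bucket mapping.
    return row_hash(pi_value) % NUM_AMPS


def partition_for(order_date):
    # Level 2: hypothetical partitioning expression, one partition per month.
    return order_date.year * 12 + order_date.month


def place_rows(rows):
    """rows: iterable of (order_id, order_date); order_id plays the PI role."""
    amps = defaultdict(list)
    for order_id, order_date in rows:
        key = (partition_for(order_date), row_hash(order_id), order_id)
        amps[amp_for(order_id)].append(key + (order_date,))
    for amp_rows in amps.values():
        # Levels 2 and 3: on each AMP, order by partition number first,
        # then by row hash (with order_id as a simplified tie-breaker).
        amp_rows.sort()
    return amps
```

In this model the sorted list on each AMP stands for the data rows themselves; there is no separate tree structure in the sketch.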
About the 3rd point: is it talking about the actual data rows of the table, or about the index itself? If it is the actual data rows, how can randomly inserted rows be kept in such an organized order? And what happens when a row is deleted?
"A query that requests "order information" (with a WHERE condition that specifies a range of dates) will result in a full table scan of the NPPI table"
For the above statement, does "full table scan" mean traversing the whole table, or traversing only the index tree?
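One possible reading of the quoted statement about "order information" can be sketched in Python for a single AMP. Everything here is hypothetical: the row layout `(order_id, order_date)`, the monthly partitioning, and the helper names are assumptions made for the example, not Teradata behaviour taken from the documentation. The sketch only counts how many data rows are examined for a date-range condition in each layout.

```python
from collections import defaultdict
from datetime import date


def month_no(d):
    # Hypothetical partitioning expression: one partition per calendar month.
    return d.year * 12 + d.month


def scan_nppi(rows, lo, hi):
    # Without partitioning, the date plays no part in how rows are ordered,
    # so every row has to be examined to evaluate the range condition.
    hits = [r for r in rows if lo <= r[1] <= hi]
    return hits, len(rows)


def build_ppi(rows):
    # Group the same rows by partition number.
    parts = defaultdict(list)
    for r in rows:
        parts[month_no(r[1])].append(r)
    return parts


def scan_ppi(parts, lo, hi):
    # Only partitions that can contain dates in [lo, hi] are read;
    # all other partitions are skipped without looking at their rows.
    hits, examined = [], 0
    for p in range(month_no(lo), month_no(hi) + 1):
        for r in parts.get(p, []):
            examined += 1
            if lo <= r[1] <= hi:
                hits.append(r)
    return hits, examined
```

With twelve months of rows and a one-month date range, `scan_nppi` examines every row while `scan_ppi` examines only the rows of the matching month; both return the same hits.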
Thanks a lot.