|
Archives of the TeradataForumMessage Posted: Mon, 28 Mar 2016 @ 12:41:04 GMT
Anomy.Anom wrote:
Because you want fast queries, when you implement all PK as UPI you get a perfectly distributed system with horrible performance :) Criteria for Choosing the Primary Index Access = Maximize single-AMP operations - Choose the column(s) most frequently used for JOIN (and WHERE conditions), usually a Primary or Foreign Key Distribution = Optimize parallel processing - Choose the column(s) that provides good distribution - Better when more unique values and less rows per value - Adding more columns might result in better distribution but less usability: The RowHash can only be calculated when all values are exactly known Volatility = Reduce maintenance resource overhead (I/O) - Choose a column with stable data values - An UPDATE of the PI column effectively results in: The Primary Key of a table is always a perfect fit based on distribution and volatility, as it's unique and stable. But access is the most important criterion and a PK might not be accessed at all. So data distribution has to be balanced with access usage in choosing a PI. <\cut & paste> Many PKs only exist to guarantee uniqueness in your logical model, but are hardly used in WHERE or JOIN-conditions. I usually don't consider implementing them as USI as it's just CPU/IO/Perm overhead and your ETL process has to check for PK-violations anyway. In fact those rules are quite similar to choosing a Clustered Index in other DBMSes. Dieter
| |||||||||||||||||||||||||||||||||||||||||||||||||||
https: | |||||||||||||||||||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||||||||||||||||||
Copyright 2016 - All Rights Reserved | |||||||||||||||||||||||||||||||||||||||||||||||||||
Last Modified: 24 Jul 2020 | |||||||||||||||||||||||||||||||||||||||||||||||||||