Archives of the TeradataForum

Message Posted: Sat, 12 Oct 2002 @ 18:28:45 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Re: Surrogate keys in Teradata

From:		Doug Drake

There are many tradeoffs to be considered. Let's assume the term NK stands for all of the columns of the natural key while SK stands for a single column surrogate key. While there are other design considerations I have assumed that the NK is NOT included in any tables other than a single cross reference table (which indicates the relationship of the SK to NK). Let me summarize my thoughts.

The pros -

1.) Often improves performance of a large set of queries due to less I/O (assuming the NK is excluded from all tables). More rows fit in each block.

2.) Improves performance on a large set of queries due to reduced join complexity. The impact of both this issue and #1 above can be significant. It is generally proportional to the number of bytes associated with the NK relative to the entire row returned in the intermediate/final result set. Some tests we have performed suggest I/O and CPU savings from 10-50%.

3.) Improves ease of joins between tables since this is typically done via one column

4.) Reduces the probability that the user might construct a "killer query" (product join) because less predicates may be necessary to perform joins. This issue alone can over shadow all of the other issues.

5.) Eases transition to historical continuity - that is to say that one can more easily associate different values of the NK's to SK's. This may be important when any part of the NK may change over time.

6.) Reduces storage in many tables where only the SK is used. Another table of the cross reference between NK's and SK's are maintained and therefore increase storage requirements. Overall, the storage reduction can be significant.

7.) Insolates the impact of change to the columns of the NK. Should any column of the NK change this can be easily performed against a single table rather than the multitudes of tables where the NK may have been used as a primary key or foreign key reference.

The cons -

1.) SK's must be assigned by some process. However, this can often be performed by a two step process within Teradata to maximize performance.

2.) May add a layer of obscurity of now having to use the SK where the business may be accustomed to the NK

3.) May impede performance on a small set of queries where the columns of the NK may provide selectivity for a secondary index on the columns of the NK

With all the above said let me state that some of the issues above are impacted by the need to carry the NK along with SK in all/some of the tables.

I would not suggest that one use SK's as a substitute for each and every primary key in the logical model. SK's bring a lot of maintenance and therefore should generally be applied only when the benefit far outweighs the cost. I may not use a SK to replace a single or small two column NK. It really depends on what can be gained coupled along with other requirements.

I hope that this information will help you get started with the process.

Regards,

Doug Drake


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference