Archives of the TeradataForum
Message Posted: Thu, 03 Jun 2010 @ 22:56:18 GMT
<-- Anonymously Posted: Thursday, June 03, 2010 18:54 -->
I forwarded your question to one of the Teradata subject matter experts. Please see her response below.
There are a couple of things here.
First, having nulls in the key of any table is not good database design, and certainly not a recommended approach.
All rows with null (or a dummy value) as a PI will hash to the same AMP, so that can create a data distribution imbalance if there are lot of such rows, whether nulls or a dummy value is used.
Also, if they ever change from a null or dummy value in the PI to something else, that will cause extra overhead because (called a complex join) because a row will have to be deleted and then inserted, instead of a simple update happening.
However, any join involving this dimension table will filter out the rows with null in the join column, so those rows with nulls will not participate in the join, which could be helpful to them. Those rows will be dropped out before the join happens, so that could make nulls a slightly better choice than a dummy value in the case of joins.
Whether or not there will be massive skew when there is a join between the fact table and the dimension table will depend on how many rows carry the same value (null or dummy) and also how many rows from the dimension table remain after single-table selection is applied. If there are a small percent of rows left, then performance issues will be a lot less than if the entire dimension table is eligible for the join.
These are just a few impressions based on limited information.
Hope this information helps.
|Copyright 2016 - All Rights Reserved|
|Last Modified: 23 Jun 2019|