Archives of the TeradataForum

Message Posted: Thu, 03 Jun 2010 @ 22:56:18 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Re: Dummy or Null Dimension Key

From:		Anomy Anom

<-- Anonymously Posted: Thursday, June 03, 2010 18:54 -->

Hi Srikanth,

I forwarded your question to one of the Teradata subject matter experts. Please see her response below.

There are a couple of things here.

First, having nulls in the key of any table is not good database design, and certainly not a recommended approach.

All rows with null (or a dummy value) as a PI will hash to the same AMP, so that can create a data distribution imbalance if there are lot of such rows, whether nulls or a dummy value is used.

Also, if they ever change from a null or dummy value in the PI to something else, that will cause extra overhead because (called a complex join) because a row will have to be deleted and then inserted, instead of a simple update happening.

However, any join involving this dimension table will filter out the rows with null in the join column, so those rows with nulls will not participate in the join, which could be helpful to them. Those rows will be dropped out before the join happens, so that could make nulls a slightly better choice than a dummy value in the case of joins.

Whether or not there will be massive skew when there is a join between the fact table and the dimension table will depend on how many rows carry the same value (null or dummy) and also how many rows from the dimension table remain after single-table selection is applied. If there are a small percent of rows left, then performance issues will be a lot less than if the entire dimension table is eligible for the join.

These are just a few impressions based on limited information.

Hope this information helps.

Thanks.


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference