Archives of the TeradataForum

Message Posted: Fri, 28 Oct 2005 @ 17:33:18 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Re: Skewed tables and VARCHAR column

From:		McBrideM

I found page 63 of Brian Marshall's book, "The Teradata Database Implementation for Performance", to be quite helpful in this area of addressing skewed data.

To paraphrase the text, you can determine the degree of skew for any given column or group of columns on a table by using the following query:

     select (count(*) (FLOAT)) /
            count(DISTINCT HASHROW (col1, col2...coln)) from TableA;

The close the return value is 1, the less skewed the data (1 essentially means no skew) The higher the value the higher the skew.

Marshall goes on to state that skewed data is always bad, "However, if the table is large and the row distribution. is particularly uneven, possible remedial activity might include:

Changing the Primary Index to a more unique column.

Adding a column to the Primary Index to improve uniqueness.

Changing the data type of one or more Primary Index columns.

I might suggest a fourth alternative would be to add an identity column or some other uniquely assigned surrogate key if no natural unique key exists.

The nice thing about the above sql (query) is that you can plug in different columns or combination of columns on the existing table to move the value closer to 1 thereby providing real analysis for assessing at least the first two of Marshall's suggestion.

Michael E. McBride, MSCIS


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference