|
Archives of the TeradataForumMessage Posted: Fri, 28 Oct 2005 @ 17:33:18 GMT
I found page 63 of Brian Marshall's book, "The Teradata Database Implementation for Performance", to be quite helpful in this area of addressing skewed data. To paraphrase the text, you can determine the degree of skew for any given column or group of columns on a table by using the following query: select (count(*) (FLOAT)) / count(DISTINCT HASHROW (col1, col2...coln)) from TableA; The close the return value is 1, the less skewed the data (1 essentially means no skew) The higher the value the higher the skew. Marshall goes on to state that skewed data is always bad, "However, if the table is large and the row distribution. is particularly uneven, possible remedial activity might include: Changing the Primary Index to a more unique column. Adding a column to the Primary Index to improve uniqueness. Changing the data type of one or more Primary Index columns. I might suggest a fourth alternative would be to add an identity column or some other uniquely assigned surrogate key if no natural unique key exists. The nice thing about the above sql (query) is that you can plug in different columns or combination of columns on the existing table to move the value closer to 1 thereby providing real analysis for assessing at least the first two of Marshall's suggestion. Michael E. McBride, MSCIS
| ||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||
Copyright 2016 - All Rights Reserved | ||||||||||||||||||||||||||||||||||||||||||||||||
Last Modified: 15 Jun 2023 | ||||||||||||||||||||||||||||||||||||||||||||||||