Archives of the TeradataForum

Message Posted: Wed, 26 Jan 2011 @ 16:22:57 GMT



Subj:		Compressing infrequent values

From:		Attila_Finta

Hi all. I have a simple question about value compression: is it better to compress all known values or only the most frequent ones?

Simple example: Currency Code in a transactional table has 50+ distinct values, but only 9 of them each account for more than 1% of rows. All possible values for the column are known, and there are fewer than 255 of them, but most occur in less than 1% of rows.

     Col_Val Val_Pct
     ------- -------
     USD       59.1%
     EUR        7.6%
     CNY        6.3%
     JPY        5.0%
     BRL        3.7%
     GBP        3.3%
     CAD        2.9%
     AUD        2.6%
     INR        2.5%

I can add the other 40 values to the value list. But would that make things more or less efficient overall? I suppose the simple answer is: slightly less efficient loads and slightly more efficient retrieval. In general we want to optimize for retrieval. So would you advise adding all possible values, or only the most frequent ones?
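To put rough numbers on the storage side of the question, here is a back-of-the-envelope sketch in Python. It is only an estimate under stated assumptions, none of which come from the actual table definition: the column is taken to be CHAR(3), there are taken to be 49 distinct values in total, n compressed values are taken to cost ceil(log2(n + 1)) presence bits on every row, and a compressed value is taken to use no space in the row itself. It ignores how presence bits are packed into bytes shared with other columns, and it ignores the space the value list takes in the table header.

```python
import math

# Frequencies of the 9 most common values, in percent of rows (from the table above).
top9_pct = [59.1, 7.6, 6.3, 5.0, 3.7, 3.3, 2.9, 2.6, 2.5]

COL_BYTES = 3       # assumption: Currency Code is CHAR(3)
TOTAL_VALUES = 49   # assumption: 9 frequent values plus the other 40


def presence_bits(n_values):
    """Assumed cost model: n compressed values need ceil(log2(n + 1)) bits per row."""
    return math.ceil(math.log2(n_values + 1))


def net_bytes_saved_per_row(n_values, covered_pct):
    """Average bytes saved per row: column bytes avoided minus presence-bit overhead."""
    saved = COL_BYTES * covered_pct / 100.0
    cost = presence_bits(n_values) / 8.0
    return saved - cost


covered_top9 = sum(top9_pct)  # share of rows covered by the 9 frequent values
top9 = net_bytes_saved_per_row(9, covered_top9)
all_values = net_bytes_saved_per_row(TOTAL_VALUES, 100.0)

print(f"top 9 only : {presence_bits(9)} bits/row, net {top9:.2f} bytes/row saved")
print(f"all values : {presence_bits(TOTAL_VALUES)} bits/row, net {all_values:.2f} bytes/row saved")
```

Under these assumptions the two options come out almost equal, about 2.29 bytes per row saved for the top 9 versus about 2.25 for all values. The last 7% of rows save roughly as much as the two extra presence bits cost, so the storage argument alone does not clearly favor either choice.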

Thanks.

Attila Finta