 

Archives of the TeradataForum

Message Posted: Wed, 26 Jan 2011 @ 16:22:57 GMT


     


Subj:   Compressing infrequent values
 
From:   Attila_Finta

Hi all. I have a simple question about value compression: is it better to compress all known values or only the most frequent ones?

Simple example: Currency Code in a transactional table has 50+ distinct values, but only 9 of them occur more than 1% of the time. All possible values for the column are known, and there are fewer than 255 of them, but most occur less than 1% of the time.

     Col_Val Val_Pct
     ------- -------
     USD       59.1%
     EUR        7.6%
     CNY        6.3%
     JPY        5.0%
     BRL        3.7%
     GBP        3.3%
     CAD        2.9%
     AUD        2.6%
     INR        2.5%

I can add the other 40 or so values to the value list. But would that make things more efficient operationally, or less? I suppose the simple answer is: slightly less efficient loads and slightly more efficient retrieval. In general we want to optimize for retrieval. So do you advise adding all possible values, or only the most frequent ones?
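
To put rough numbers on the storage side of this trade-off, here is a minimal sketch in Python. It rests on several assumptions that are not stated in the post: the column is a NOT NULL CHAR(3) (24 bits when stored uncompressed), each row carries just enough presence bits to distinguish the compressed values plus one "not compressed" state, and exactly 50 distinct values exist. The helper names are hypothetical. It ignores byte alignment of the presence bits, the table header space taken by the value list, and any CPU or load cost, so treat it as a back-of-the-envelope estimate rather than a measurement.

```python
# Shares (in percent) of the nine most frequent codes, taken from the table above.
TOP_SHARES = {
    "USD": 59.1, "EUR": 7.6, "CNY": 6.3, "JPY": 5.0, "BRL": 3.7,
    "GBP": 3.3, "CAD": 2.9, "AUD": 2.6, "INR": 2.5,
}

def presence_bits(n_values: int) -> int:
    """Bits needed to encode n compressed values plus one 'not compressed' state.

    Assumption: this mirrors how a multi-value compression bit field grows
    (1 value -> 1 bit, up to 255 values -> 8 bits).
    """
    return n_values.bit_length()

def avg_bits_per_row(n_compressed: int, covered_pct: float, value_bits: int = 24) -> float:
    """Average bits per row for this column: presence bits on every row,
    plus the full value on rows whose value is not in the compress list."""
    uncompressed_fraction = 1.0 - covered_pct / 100.0
    return presence_bits(n_compressed) + uncompressed_fraction * value_bits

top9_coverage = sum(TOP_SHARES.values())          # share of rows covered by the top 9
top9_cost = avg_bits_per_row(9, top9_coverage)    # compress only the frequent values
all50_cost = avg_bits_per_row(50, 100.0)          # compress all 50 (assumed count)

print(f"top 9 cover {top9_coverage:.1f}% of rows")
print(f"top 9 only : {top9_cost:.2f} bits/row on average")
print(f"all 50     : {all50_cost:.2f} bits/row on average")
```

Under these assumptions the two options land close together (roughly 5.7 versus 6 bits per row), because the extra presence bits paid on every row offset the space saved on the rare values. Whether that holds on a real table depends on how the presence bits for all compressed columns pack into bytes, so it is worth checking against actual table sizes.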


Thanks.

Attila Finta



     
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023