Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Tue, 02 Oct 2007 @ 20:13:03 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: Float data type compressions
 
From:   william.gregg

Assuming the table is large enuff to merit MVC (those large tables that together account for 95% of the space on a DB are worthy of consideration), run a select and group by query on each of the columns under consideration, collecting the TOP N (below 256) values that each contribute at least 1/2 percent to the total number of rows in the table. (I like the latter approach; U may miss a few flat histogram columns, but this an 80-20 type of task.) Also, avoid volatile columns or columns with several decimal places of precision (such as a measurement of exchange rates to 6 significant figures).

Apply the algorithm in the manual (take into account the cost of additional bytes and bits). Be aware of the Table header constraint and the limitation on the number of characters (~8K) that may be compressed for one column.

If the analysis of potential savings is a close call, do not compress. If MVC does not save at least 15% of a table's volume, it is not worth taking the time to implement -- IMHO.

Example (u have the table-size in rows available upfront or u table-ize it)

-- Table has 100M rows

     Select col_16, count(*)
     ...
     Group by col_16
     Having count(*) > table-size-in-rows / 200  (in this case 50K rows)

(the query returns the values in a column that account for at least .5% of the rows.)

U can further refine this query by using PERCENT_RANK() over ( partition by etc).

Your thought about standard values such as zero, 1.0 and -1.0 often proves to be the case for compressing numeric values.

Bottom line, go for the low hanging fruit; do it once a year or so.


Good Luck,

Bill Gregg
Thomson Healthcare
Teradata Team



     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023