Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Fri, 20 Feb 2004 @ 15:27:41 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: Break even point for Multi value compression
 
From:   Glen Blood

-----------------------
Admin Comment: The attachment for this post can be found in the Attachment area of the TeradataForum website:

www.teradataforum.com/attach.htm

-----------------------


Yes.

Colby Guilbeau (NCR) did quite a bit of work, I built on his work and built some macros and stored procedures to make it work.

Right now it does not handle:

Is there a benefit to converting VARCHAR to CHAR and compressing the CHAR column.

Have the demographics changed over time sufificiently to warrent a reanalysis.

Automatically select the best compression.


BAsically,

Using a stored procedure, I build a table with all of the values for a column in a table and how many records have that value. The clals to the sotred procedure are generated in a macro. This is the only piece that takes time.

We then use the results in this table to find out what the compression at each of the break points would save us. The breakpoints are 2**n -1 where n in {1,2,.8}. We don't look between the breakpoints (unless there are no values), because it makes no sense to compress 2 values, if you have 3 to compress. Each level requires one additional bit for the compression.

The basic equation is:

SAVINGS in bytes = number of records compressed * COLUMNLENGTH - number of columns *n/8


where n in {1...8}

We then look at the difference in savings between level n and level n-1 to pick out the level of compression we want. We usually put this into a spreadsheet for easy viewing.

We then execute another macro to get the compression list from our value tables.

If desired, we could probably provide the instructions, table structures, and macro and stored procedure definitions in the Forum libnrary, with the understanding that this is a work in progress.

Glen



     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023