Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Fri, 08 Jul 2005 @ 20:45:54 GMT


     
  <Prev Next>  
<<First
<Prev
Next> Last>>  


Subj:   Primary Index selection on new tables
 
From:   Duell, Bob

I have a pretty basic question (well, basic anyway. I'm not very pretty).

What's the best way to create a table? Suppose I have a table with several "dimension" columns and several "fact" columns, such that any combination of the "dimension" columns uniquely selects one row. I "want" to define the primary index as a composite (all the "dimensions"), but I'm not sure that's best. I read somewhere that a "surrogate" key would be better, supported by multiple secondary indexes (NUSI).

In this particular case, I want to optimize query performance; I don't care about loading. My project involves loading the result table once a month. But, would that make a difference in which technique to use?

For example, I tried something like this:

     create multiset table MKT_SUMMARY
       , no fallback
       , no before journal
       , no after journal
       , checksum = default
            (  market      SMALLINT
             , channel_id  SMALLINT
             , payment_id  SMALLINT
             , status_id   SMALLINT
             , month_end   DATE
             , fact1       INTEGER
             , fact2       DECIMAL(15,2)
             , fact3       DECIMAL(15,2)
             , fact4       DECIMAL(15,2)
             , fact5       DECIMAL(15,2)
            )
       unique primary index
           ( market_id, channel_id, payment_id, status_id,  month_end )

When I built and loaded this table, the distribution was not very good. I read about hash bucketing, so I think I understand why (a very large proportion of the rows is in one market; there are 40 different markets. Today, there are only two values for month_end, but over time many months will accumulate).

I suppose I could generate a random number as a surrogate key during each load, but does it need to unique within the table or do you create a non-unique primary index and make the surrogate unique within each load. The example has 5 "dimensions"; my real application has 15.

I'd be happy to read any technical documentation or publications on the subject, so if you have a reference, let me know.


Thanks!

Bob Duell
Manager, Systems and Integration
Cingular Wireless; CRM Database Marketing



     
  <Prev Next>  
<<First
<Prev
Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023