Home Page for the TeradataForum

Archives of the TeradataForum

Message Posted: Sat, 09 Jul 2005 @ 08:33:29 GMT

  <Prev Next>   <<First <Prev

Subj:   Re: Primary Index selection on new tables
From:   Matthew Winter


Based on what you said about the dimensional data and that they are all less than 40 values, with a very high skew, I would recommend that you denormalise the dimensional data onto the MKT_SUMMARY fact table and then use the multi-value compression feature on all of the dimensional columns to elimate the additional storage requirements.

I would also then partition the table on the month_end column to maximise query performance.

As for the final PI value, I would add an additional sequence number to ensure no skewing of the data.

We have done the above many times and found this to be a much better option when the table is highly skewed like what you have mentioned.

Also as an aside. The design below I am guessing has been done to try and utilise the Star-Join optimisation within Teradata. Which as you all know performs a cartesian-product join of the dimensional tables before joining it to the main fact table. In our data we have found that number of combinations a cartesian-product join would generate is far in excess of the actual combinations used. So to improve the overall performance and minimise the spool requirements we have performed the optimisation ourselves as part of the load process, merging all of the dimensional tables together into a single reference/dimensional table, but based on the actual combinations used, then assigning a surrogate key to the table. This key then replaces all of the individual surrogate keys on the base fact table.

Anyway. Hope some of the above helps.

Matthew Winter

  <Prev Next>   <<First <Prev
  Top Home Privacy Feedback  
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023