Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Mon, 26 Feb 2001 @ 22:58:36 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: Why not use star schema
 
From:   Eric J. Kohut

Jeff,

You'll have to be a little more specific of your definition of Star Schema.

Some people differentiate between Star and Snowflake. A Star requires all of the levels of each dimensional hierarchy stored in the base (FACT) table and a Snowflake requires only the lowest level of each dimension hierarchy stored in the appropriate base (FACT) table.

Depending on the level of data used for analysis, many times a Snowflake is equivalent to a 3rd Normal form for certain industries. For example for the Day, Item, Location level of detail data in a Retail Customer, the base 3rd Normal Form table is equivalent to the Snowflake table. However, when that retailer wants to start doing product or customer transaction analysis, that table becomes a summary table of the transactions in the TLOG based transaction Header, Detail, Method of Payment, etc... other tables which define the detail transactions.

I think that for the most part we consultants at Teradata see 4 basic options for physical design. A pure STAR, Snoflake, Physical Denormalization 3rd Normal Form, and Pure 3rd Normal form. However, you might want to think of this as more of a continuum, so that every denormalization is understood and justified.

I don't think we ever agree that a Pure STAR should be implemented except through a view. Although it can be and could perform extremely well, both as a view and as a physical table. A view will save you a lot of space and a physical table will not always out perform a view. If fact based upon some if not most workloads, I can prove that a view will often outperform a prejoined physical table. In order to reverse this statement, you may have to build a lot of secondary indexes which would be unneeded if you used a view in the first place. As a result, you can often save space and provide equal or better response time. This proof is not something that can be explained in this memo.

I think we consultants try to keep your database as close to 3rd Normal Form to minimize the complex cross table updates and the duplication of data issues which can be pretty complex and costly. As a result, Snowflake and Physical De-norms are mostly for tuning purposes or to support a specific tools set.

The 3rd Normal Form design presents the least issues in the long run since, as Jim stated, it is the most extensible and easily enhanced as long as it meets your performance criteria.

The alternative, is a 3rd Normal Form design with a larger system. Remember there are design trade-offs and these denormalizations will bite you later as you try to extend / enhance your Warehouse's capabilities. We know this from 15 + years of experience with long term customers. As a result, we recommend the best long term strategy. You the customer must make these design trade-offs based upon budget, future growth timelines, amount of application development resources, etc.. and then live with the lasting effects of your decisions or the re-work involved to redesign and then change your applications for the new model as you progress.

My opinion only, everyone has one,

Good Luck,

Eric

EJK
Eric J. Kohut
Senior Solutions Consultant - Teradata Solutions Group - Retail
Certified Teradata Master
NCR Corp.



     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023