Archives of the TeradataForum

Message Posted: Tue, 13 Nov 2002 @ 00:36:28 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Re: 3NF vs Dimensional modelling

From:		Neil Raden

Ken,

This is one of the most misleading, ill-informed and flat-out wrong posts on this subject I've ever seen. First of all, what is a "cube"?? In a Dimensional Model (DM), there are dimension tables and fact tables (there are some others too, but they're incidental), there are no cubes.

I can query the main database directly, that's the whole point. I don't need downstream data marts or cubes.

To me, a "cube" refers to some processed data put in a multidimensional form, such as Essbase or Powerplay, or, for the EDW/CIF crowd, DM data marts. So tell me how this works. If I use an EDW/CIF approach, such as you advocate, and the business changes, don't I have to disassemble and reassemble all of these cubes? In fact, I have to do that in an EDW/CIF data warehouse every single time I add data to that OLTP design (I refuse to call this a 3NF design, because dimensional designs can also be 3NF).

No one queries the main database directly, unless they are in a Teradata environment, and even then, not always. So I think you have this backwards, cube maintenance is a much bigger problem in an OLTP design.

The second point, adding and selling divisions, the DM model handles easily. Take a look at Kimball's Slowly Changing Dimensions methodology.

In fact, the beauty of the DM is its simplicity compared to the tortuous, circular schema in a typical EDW/CIF. It can actually be explained to business people because it reflects the way they think. Take that 300-table OLTP schema and try to validate it with the VP- Marketing. I've tried it. What's more, in a DM DW, there is only one schema. In an EDW, there are at least three: ODS, EDW, DM (and probably multiple versions of these).

You also said in the second point, "...again data can be added/filtered out far more easily when in a relational database." A DM is in a relational database. Are you suggesting that only certain schema are "relational"? I'll say it again, a conformed, multi-fact-table DM with snowflaked (normalized) dimensions in in 3NF. Now Kimball and his followers hate normalized dimensions, so their models are at best 2NF, but there is a large contingent of developers of there, myself included, who normalize the dimensions and hence, deliver a Dimensional Model that is in perfect 3NF. But it still looks like a lot of star schemas. And it has not problem passing the ACID test.

Here's some info - we've begun to use them for real-time update. Hybrid transactional/operational systems. You will never be able to do that with the Inmon/Imhoff method because of its fatal flaw - you have to do all this ugly post-processing to get it to perform, managing "cubes" and other aberrations.

Next point - multiple fact tables. First, if a metric is dimensioned by dimensions A, B, C, D, E and F, then I fail to see how using those keys is a problem. That metric belongs in that table. If we have additional metrics with the same dimensionality (grain), we generally add them to that table, too. If another metric is dimensioned by dimensions A, C, E, F and M, then M will be a new dimension table and the others will be used as is, or "conformed" in Kimball's lingo. We don't make copies of the conformned dimensions, why would we?? Further, that this "replication" is why we would use DM for data marts, not a DW, I fail to see the difference. What is the point of having detailed data if the containers that people can use (data marts) are summarized? The whole idea makes no sense (except to a data modeler, but DW are built for people who use information).

Adding aggregate data: we NEVER put aggregated data in the same fact table as unaggregated data. In fact, we never mix levels of aggregation period. If we create an aggregate table, we maintain it in one of two ways. First, we drop the table and recreate it each time from a GROUP BY of a lower level table. This is the old way, but it still works. Second, ETL tools now provide smart aggregation, so as we insert new rows into the detailed fact tables, the deltas are calculated and updated. It isn't a problem, and since NONE of the smart ETL tools knows how to do this with an OLTP schema, you're stuck.

I agree completely with your next-to-last point, but again, this is something that can be done easily in a DM. New external demographic attributes? All we do is add those attributes to a single table (say, Customer) and we're done. Where did you ever get the idea this was difficult?

The last point, dunno, you lost me. Are you implying that people who use DM's underestimate business analysts? I hadn't heard that before.

-Nr

Neil Raden
Hired Brains, Inc.
www.hiredbrains.com


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference