Archives of the TeradataForum
Message Posted: Sat, 09 Nov 2002 @ 17:18:21 GMT
Michael McBride wrote:
If you admit to being bigoted, why not keep those opinions to yourself? Where I grew up, bigotry was considered something to avoid, not be proud of.
I think it's fair to make that claim, but I don't think it's valid. True, Kimball came out of Metaphor, where he learned about the star schema and ROLAP, then started Red Brick which, by the way, was one fine piece of technology. I have clients still using it, years after the company vanished inside Informix and then IBM. And Ralph also tied up with Sagent for a while, clearly a company in the DM camp.
But you should give the thousands of us out there who use this stuff a little credit - we (or at least many of us, including me) profit not at all. In fact, if I went the CIF route, I'd make a helluva lot more money in my consulting practice, because it would take millions to do what we do on hundreds of thousands.
And think this one over - don't you think there are influencial people who have been profitting for years selling their own and/or endorsed products by pitching a 3NF, CIF architecture? Start at the top of the list of DW luminaries and work your way down.
Let me tell you what my ulterior motive is - to help my clients be successful by designing and implementing the best solutions. I've been at this for 20 years, and I know how to design any kind of model. I do what works.
Please show me a benchmark. And while you're at it, consider this: a DM with snowflaked dimensions IS IN THIRD NORMAL FORM!!! So when you start doing that 3NF vs DM thing, try to keep that in mind, because it's a non sequiter.
Also, "get into the DW faster" has to include the objects that people query. Also, the performance hit they take on each query because of views, lack of join indexes, etc.
Totally wrong, misleading and unsubstantiated. I've been doing these benchmarks for years and I can assure you the opposite is true. Besides, rarely do people actually get to query these beasts, so what exactly do you mean "perform"?
I don't even know where to start here. What is the correct business model? In a data warehouse, it's the one that supports the reporting and analysis needs. All of those circular relationships in a transactional are useful for capturing the atomic transactions of an operational system, but when you integrate data from multiple sources, those models are irrelevant. Dimensional models are the ones that reflect the business FROM THE ANALYSTS POINT OF VIEW. And the reason that no one can query these schema effectively is because they lack compound indexes and have too many tables. It isn't philosophy, it isn't rocket science, it just works.
The query optimizers in most databases simply can't handle analytical queries against a 3NF schema, they never could, that's we started using a DM in first place. Wouldn't it be a wonderful world if we could use one schema? And to make matters worse, all of the database vendors other than Teradata (Oracle, Microsoft, IBM UDB, IBM Informix) have been working hard for the last five to improve their optimizer technology FOR DM SCHEMA. So I don't expect to see any big improvement query performance against 3NF schema anytime soon in those products.
The 3NF model can't support intra-day reporting because it needs the ODS (or I should say, the CIF architecture does, of course you could insert real-time records into a 3NF design). The ODS stinks. It's a bad idea. Can you insert intra-day records into a 3NF schema? Sure, but you still can't query it, you have to move it out to data marts, and that's what's wrong with the whole idea.
My guess is you replaced some poorly-implemented ones, lord knows there are plenty of those out there. Plus, re-do's always involve new upgraded hardware, etc., so I'm not sure this an apples-to-apples observation. And data availability is a tricky issue. Just having data in a database doesn't make it available. Every 3NF EDW I've seen relied on views, aggregating and subsetting the data into data marts for queries, and only batch queries for the SAS rocket scientists in the basement were allowed against the main DB.
Again, I know Teradata resolves queries against OLTP schema more efficiently than other databases, so we can make an exception there.
Your bigotry is manifest. Consider that there may be some artistry in DM modeling, too. It's also easier to mask all sorts of things in a design with 600 tables. Do you have any idea why Codd used the phrase "normalization?" If you don't, the answer is going to disturb you.
Actually, I think it's only as good as the customers think it is, and they want availability, completeness, understandabilty and performance. Im my practice, I find the overhead of a dead database too much to deal with.
I don't love it, I find it frustrating. And I also don't like the characterization that I have some religious attachment to the idea. I don't. It just works. But when I talk to 3NF'rs, inevitably they appeal to some faith that this is the only true path and you gotta have faith.
Well, I don't. normalization was a technique to get relational data bases to perform as OLTP platforms, because they were too slow. When Codd devised what we now call normalization, he didn't have a word for it.
It was 1973. And you know what he said? He said, "Nixon is normalizing relations with China, so I guess that's what I'm doing, normalizing my relations."
So there you have it, the NORMAL form is an expression that was totally arbitrary. He could have called it Third Chickenfat Form.
|Copyright 2016 - All Rights Reserved|
|Last Modified: 28 Jun 2020|