Archives of the TeradataForum
Message Posted: Tue, 31 Jan 2006 @ 10:03:34 GMT
You don't seem to get the point of why multiset tables are useful, or at least you don't discuss it. I'll try one more time to highlight at least two good reasons for using multiset tables (others may add more).

Reason 1: SET tables have a performance overhead (the duplicate-row check), which is *very* costly with large tables. If the ETL process takes care of deduplication, then that overhead buys nothing. Hence, MULTISET. Not a UPI or USI, because a UPI is not always possible in the model, and USIs, well, they probably should not even be mentioned in this context at all.

Reason 2: Sometimes duplicates are submitted to the DWH, and there can be a reason for that (sometimes even a valid business reason). You want to have the data in your staging area as it arrived, and sometimes you don't want to bother with surrogate keys, which won't help anyway and which are always costly in terms of maintenance. Hence, MULTISET.

The argument that multiset tables introduce an additional risk is irrelevant, provided you delegate the responsibility to ETL and QA. Set tables also introduce risks, don't they?

Of course, relational theory teaches us that referential integrity must be handled by the database itself. Great point, of course, but you seem to have been in this business long enough to notice that this has never been implemented as yet, so we have to move on and look for alternative solutions that work today.

Regards,
Victor
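To make Reason 1 concrete, here is a rough Python sketch of the idea, not Teradata's actual implementation. It assumes a table whose rows are grouped by a non-unique primary index value, and that a SET-style insert compares each new row with the rows already stored under the same value. The names `load`, `key` and `check_duplicates` are made up for illustration.

```python
from collections import defaultdict


def load(rows, key, check_duplicates):
    """Toy loader: group rows by a non-unique index value.

    check_duplicates=True  mimics a SET table: each new row is compared with
    the rows already stored under the same key, and exact duplicates are dropped.
    check_duplicates=False mimics a MULTISET table: rows are appended as-is.
    Returns (number_of_rows_stored, number_of_row_comparisons).
    """
    buckets = defaultdict(list)
    comparisons = 0
    for row in rows:
        bucket = buckets[key(row)]
        if check_duplicates:
            duplicate = False
            for existing in bucket:
                comparisons += 1
                if existing == row:
                    duplicate = True
                    break
            if duplicate:
                continue
        bucket.append(row)
    stored = sum(len(b) for b in buckets.values())
    return stored, comparisons


def by_index(row):
    # Hypothetical index column: the first field of the row.
    return row[0]


# 1000 distinct rows that all share the same index value (a skewed NUPI).
distinct = [(1, n) for n in range(1000)]

set_stored, set_cmp = load(distinct, by_index, check_duplicates=True)
multi_stored, multi_cmp = load(distinct, by_index, check_duplicates=False)
print(set_stored, set_cmp)      # 1000 499500
print(multi_stored, multi_cmp)  # 1000 0

# ETL-side deduplication before a MULTISET-style load: dict.fromkeys keeps
# the first occurrence of each row and preserves input order.
incoming = distinct + distinct[:10]
deduped = list(dict.fromkeys(incoming))
etl_stored, etl_cmp = load(deduped, by_index, check_duplicates=False)
print(etl_stored, etl_cmp)      # 1000 0
```

In this toy model the SET-style load does about n*(n-1)/2 row comparisons for n distinct rows under one index value, while the MULTISET-style load does none; the duplicates are removed once, upstream, by the ETL step.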
Copyright 2016 - All Rights Reserved
Last Modified: 15 Jun 2023