Archives of the TeradataForum
Message Posted: Fri, 17 Aug 2007 @ 09:45:52 GMT
Subj: Re: Working with duplicate rows
From: Victor Sokovin
Is there a BUSINESS reason to care about the duplicate rows? Multiset tables are worse than useless, unless you're trying to avoid the
duplicate row check on a load (thereby saving load time). And designing for LOAD is usually a bad business decision. But Teradata had to add it
so that people could have their duplicate rows, in order to feel more secure at night (personally, I prefer a Teddy Bear. Or my wife.) We have a
20+ terabyte warehouse and nary a multiset table in the bunch.
Thanks. You are probably not using interface tables for external data providers. Some of them "like" to insert duplicates, so you just
have to let them do it and decide later what their duplicates actually mean and what you need to do with them. And it is not always just deduping.
Difficult, yes, but c'est la vie.
I know it's a well-beaten track, but I'll nevertheless mention that I like MULTISET tables. There is nothing wrong with them really, especially if
you are coming to Teradata with other RDBMS experience (I am not aware of any other RDBMS with a concept of SET tables - let me know if you know of one!)
and know how to handle duplicates. The big advantage is that *you* can decide what, when and how you check, which, with a bit of luck, saves a lot
of CPU effort.
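To make the "you decide what, when and how you check" point concrete, here is a rough sketch of one way it could look. All table and column names (stg_provider_feed, provider_feed_clean, record_key and so on) are made up for illustration, and the policy in step 2 is just one possible choice, not a recommendation for every feed.

```sql
-- Hypothetical interface table: MULTISET, so the provider's duplicate rows
-- are accepted as-is and no duplicate row check runs during the load.
CREATE MULTISET TABLE stg_provider_feed (
    provider_id INTEGER NOT NULL,
    record_key  VARCHAR(30) NOT NULL,
    amount      DECIMAL(12,2)
) PRIMARY INDEX (provider_id, record_key);

-- Hypothetical target table: one row per distinct combination,
-- plus a count of how many copies the provider sent.
CREATE SET TABLE provider_feed_clean (
    provider_id     INTEGER NOT NULL,
    record_key      VARCHAR(30) NOT NULL,
    amount          DECIMAL(12,2),
    copies_received INTEGER NOT NULL
) PRIMARY INDEX (provider_id, record_key);

-- Step 1: after the load, look at which rows arrived more than once.
SELECT provider_id, record_key, amount, COUNT(*) AS copies
FROM stg_provider_feed
GROUP BY provider_id, record_key, amount
HAVING COUNT(*) > 1;

-- Step 2 (one possible policy): collapse the duplicates, but keep the
-- copy count so the information is not lost if it turns out to matter.
INSERT INTO provider_feed_clean (provider_id, record_key, amount, copies_received)
SELECT provider_id, record_key, amount, COUNT(*)
FROM stg_provider_feed
GROUP BY provider_id, record_key, amount;
```

The check in step 1 runs once, at a time of your choosing, instead of on every insert, and step 2 can be swapped for whatever the duplicates turn out to mean for that provider.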
Regards,
Victor