
Archives of the TeradataForum

Message Posted: Fri, 17 Aug 2007 @ 09:45:52 GMT


     
 


Subj:   Re: Working with duplicate rows
 
From:   Victor Sokovin

  Is there a BUSINESS reason to care about the duplicate rows? Multiset tables are worse than useless, unless you're trying to avoid the duplicate row check on a load (thereby saving load time). And designing for LOAD is usually a bad business decision. But Teradata had to add it so that people could have their duplicate rows, in order to feel more secure at night (personally, I prefer a Teddy Bear. Or my wife.) We have a 20+ terabyte warehouse and nary a multiset table in the bunch.  


  Ok, rant over.  


Thanks. You are probably not using interface tables for external data providers. Some of them "like" to insert duplicates, so you just have to let them do it and decide later what their duplicates actually mean and what you need to do with them. And it is not always just deduping. Difficult, yes, but c'est la vie.

I know it's well-trodden ground, but I'll nevertheless mention that I like MULTISET tables. Nothing wrong with them really, especially if you are coming to Teradata with other RDBMS experience (I am not aware of any other RDBMS with a concept of SET tables; let me know if you are!) and know how to handle duplicates. The big advantage is that *you* can decide what, when and how you check, which, with a bit of luck, saves a lot of CPU effort.
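To make the "let them in, decide later" idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for Teradata: an ordinary SQLite table without a key accepts duplicate rows, roughly the way a MULTISET table does. The table names (stg_orders, orders), the columns and the sample rows are all made up for illustration, and a plain DISTINCT is only one of several possible ways to treat the duplicates.

```python
import sqlite3

# SQLite is only a stand-in for Teradata here. A keyless SQLite table
# accepts duplicate rows, roughly like a MULTISET table would.
conn = sqlite3.connect(":memory:")

# Hypothetical interface (staging) table fed by an external provider.
conn.execute(
    "CREATE TABLE stg_orders (order_id INTEGER, customer TEXT, amount REAL)"
)

# Made-up sample feed containing exact duplicate rows.
feed = [
    (1, "acme", 10.0),
    (1, "acme", 10.0),
    (2, "globex", 25.5),
    (3, "initech", 7.0),
    (3, "initech", 7.0),
    (3, "initech", 7.0),
]
conn.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", feed)

# Step 1: look at the duplicates first, so you can decide what they mean.
dupes = conn.execute(
    """
    SELECT order_id, customer, amount, COUNT(*) AS copies
    FROM stg_orders
    GROUP BY order_id, customer, amount
    HAVING COUNT(*) > 1
    ORDER BY order_id
    """
).fetchall()

# Step 2: one possible treatment, a plain dedupe into the target table.
conn.execute(
    "CREATE TABLE orders AS "
    "SELECT DISTINCT order_id, customer, amount FROM stg_orders"
)
deduped = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()

print(dupes)
print(deduped)
```

The point is only that the duplicate check runs when and how you choose, after the load, instead of on every inserted row.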


Regards,

Victor



     
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023