Archives of the TeradataForum

Message Posted: Mon, 30 Jan 2006 @ 20:27:05 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Re: Erwin is not mi amigo again...

From:		Dieter Noeth

Anomy.Anom wrote:

USI's are the KISS OF DEATH. They make up for the lack of a PK

UPI *are* implemented to support the PK.

Is there any DBMS which supports a PK without implementing it as a Unique Index? Would be nice to know how it's done (fast and efficient)...

at a very high price of dozens and even 100s times more cost to do processing.

Really 100 times?

I don't use them at all. NO USIS or MULTISETS allowed in my Datawarehouse. Nein!

At least for stage tables MULTISETS are quite useful (keyword "cat food argument").

I started working with RDBMS since 1989 and have not had a need to have duplicate rows in any of my database designs table. I used INGRES, SYBASE, ORACLE, DB2, and now TERADATA. Multisets can get one into a pickle of data corruption.

Full ACK.

That's why the Relational Model proposed set tables :-)

I have seen my fair share of databases w/o data models riddled with data corruptions and RI problems. No one has a blueprint to properly maintain the database schema, so they add and remove columns at will.

It's bad enough TD does not have the concept of a PK that does not cost a first born to use,

Again, how is RI implemented (fast and efficient) without a kind of Unique Index?

so now enter Multisets as another tool in the arsenal to make the disaster happen faster and get one into the pickle of data integrity problems.

At least you get a choice in Teradata. They implemented (relational) SET tables first (with an implicit PK across all columns) and added MULTISET later to support Standard SQL.

E.g. in Oracle and MS SQL Server the default is a Multiset table and you always have to add a PK/Unique Index. They don't even know about the concept of a set table.

Most MULTISETS can be eliminated by using due diligence. I have a "user" with a 2B row table and an USI that took 2 hrs to update. I asked them to get rid of the USI but the duplicate check on the PI still made it last almost 1 hour. Their solution was to go MULTISET to avoid the dup check but what to do about RI?????

Even if the table is a SET table, there still might be a violation of the PK without a Unique Index, because the PK is usually not across all columns.

I suggested a PPI and now it takes 15 minutes and many of their queries take advantage of dynamic partition elimination. A little outside the box thinking went a long way.

How does a PPI help in maintaining RI? It might help in a set table if there are less dup checks, because the new 10-byte RowKey tends to be more unique than the old 8-byte RowId.

If you must use a MULTISET use them with plenty of caution.

Full ACK

Dieter


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference