Archives of the TeradataForum
Message Posted: Fri, 28 Apr 2006 @ 22:12:35 GMT
Subj: Re: Question about UPI Vs NUPI
From: Walter, Todd
| If you use a Set table then while loading it has to do the dup checking to make sure the entire record is unique. Time consuming. If you
| go with a multiset table and a UPI then it would just check for the PI to be unique.
This seems to be a confusing and poorly understood topic...
IF there is a UPI, then no duplicate row checking is ever required or performed by Teradata. Only the UPI value of a new row needs to be
checked against other existing UPI values, and only within the bounds of the row hash to which the new row hashes. Thus MULTISET is never
required and is in fact redundant on a table with a UPI.
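A minimal DDL sketch of that point (database, table, and column names here are made up for illustration):

    CREATE SET TABLE sandbox.cust
    ( cust_id    INTEGER NOT NULL
    , cust_name  VARCHAR(100)
    )
    UNIQUE PRIMARY INDEX (cust_id);

    -- The UPI on cust_id already rejects any new row whose cust_id duplicates an
    -- existing one, and it checks only within that one row hash. Declaring this
    -- table MULTISET instead of SET would change nothing about that check.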
IF there is a NUPI and the table is a SET table, then Teradata must compare the full contents of the row against other rows with the same NUPI
value in order to enforce the unique row constraint. IF there are a LOT of rows with the same NUPI value, then this process can get expensive. A
"lot" is a little fuzzy, but the rule of thumb is more duplicates than fit in a few data blocks. Take your row size, divide it into the average
block size for the table (use 48K if you don't know), and multiply by 3 to get a rough boundary to use. For narrow rows this will be a couple
thousand duplicates; for wide rows it can be substantially less. If your average NUPI value has a frequency greater than that rule of thumb, or
if you have some really skewed NUPI values that are way over that ROT even though the average is small, then MULTISET would be indicated.
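To make the arithmetic concrete (numbers assumed for illustration): with 100-byte rows and 48K blocks, 48,000 / 100 = 480 rows per block,
times 3 gives a rough boundary of about 1,440 duplicates per NUPI value. A sketch of a query you might run to compare your actual NUPI
frequencies against that boundary (table and column names hypothetical):

    SELECT cust_region, COUNT(*) AS nupi_freq
    FROM sandbox.cust_hist
    GROUP BY cust_region
    HAVING COUNT(*) > 1440   -- rough boundary: 3 * (48000 / 100-byte rows)
    ORDER BY nupi_freq DESC;

Any NUPI value this returns is a candidate reason to go MULTISET, or to pick a different NUPI altogether.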
The price of MULTISET is that the system allows duplicate records. Fine if that is what you intend, but if that is not the intended behavior,
then you need to monitor for data problems that can creep in because of restarted/rerun loads, broken ETL processes, and so on, which would
otherwise be caught by the dup row check.
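If you do go MULTISET, a periodic full-row duplicate check is one way to watch for those problems; a sketch follows (table and column names
hypothetical, and the GROUP BY must list every column of the table to make it a true full-row check):

    SELECT cust_id, cust_name, COUNT(*) AS dup_cnt
    FROM sandbox.cust_multi
    GROUP BY cust_id, cust_name
    HAVING COUNT(*) > 1;
    -- Any row returned is a duplicate record that a SET table
    -- would have rejected, e.g. from a restarted/rerun load.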