Archives of the TeradataForum

Message Posted: Mon, 30 Jan 2006 @ 22:06:35 GMT
<-- Anonymously Posted: Monday, January 30, 2006 16:38 -->
Teradata embraces the concept of NUPIs to maximize parallel processing and minimize skewing: data is distributed across all vprocs in a manner that is efficient for joins. "Keep like data close to each other on the same AMP, grasshopper," they advise in all the classes NCR imparts.

Unfortunately, a table defined with a NUPI by definition has no Primary Key (unique identifier); therefore a USI is created to "emulate" the PK and help implement RI through explicit or implicit constraints. On the other hand, choosing a UPI to implement the PK and avoid a USI may leave the data distributed in a way that is inefficient for processing. Bad karma may result from choosing UPIs over NUPIs.

Now, choosing a NUPI for data distribution and a USI for PK implementation creates three fundamental problems in Teradata:

1. The kiss of death from USI overhead on mass operations on tables
2. Wasted space to implement the PK
3. Wasted resources to maintain the PK

Before you (or any other reader) doubt this too much, try this experiment (a rough SQL sketch follows below):

1. Create a table with 1,000,000 rows and a NUPI.
2. Update all rows in that table. Note CPU, disk, and wall-clock time.
3. Delete all rows from the table. Note CPU, disk, and wall-clock time.
4. Repeat step 1.
5. Add a USI to the table from step 4.
6. Repeat steps 2 and 3.

On my system I got roughly a 300-to-1 ratio between the two sets of (mass update/delete) tests. I recommend trying it before knocking it.

"In God we trust, all others must have data"

anonym
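Below is a minimal sketch of that experiment in Teradata SQL. All names are illustrative assumptions, not from the post: test_nupi is the test table, cust_id the non-unique NUPI column, txn_id the unique column that receives the USI, and source_txns an assumed pre-populated table of 1,000,000 rows.

    /* Step 1: test table with a NUPI; cust_id is deliberately non-unique
       so "like" rows hash to the same AMP. */
    CREATE TABLE test_nupi (
        cust_id  INTEGER NOT NULL,      -- NUPI column (many rows per value)
        txn_id   INTEGER NOT NULL,      -- unique per row; gets the USI later
        amount   DECIMAL(12,2)
    ) PRIMARY INDEX (cust_id);

    INSERT INTO test_nupi               -- source_txns: assumed 1,000,000-row table
    SELECT cust_id, txn_id, amount
    FROM   source_txns;

    /* Steps 2-3: mass update and mass delete. Capture CPU and I/O from DBQL
       (DBC.QryLogV) if query logging is on, plus wall-clock time. */
    UPDATE test_nupi SET txn_id = -txn_id;   -- negation keeps values unique,
                                             -- assuming source values are positive
    DELETE FROM test_nupi;

    /* Steps 4-5: reload, then add a USI to "emulate" the PK. */
    INSERT INTO test_nupi
    SELECT cust_id, txn_id, amount
    FROM   source_txns;

    CREATE UNIQUE INDEX (txn_id) ON test_nupi;

    /* Step 6: repeat the mass operations. Every row updated or deleted now
       also requires maintenance of its USI subtable row. Compare CPU, disk,
       and wall-clock time against the first run. */
    UPDATE test_nupi SET txn_id = -txn_id;
    DELETE FROM test_nupi;

The point of the sketch is the CREATE UNIQUE INDEX line: the USI lives in its own subtable, hash-distributed on txn_id rather than on the base table's NUPI, so each base row touched in the second run typically implies extra work on a different AMP. That subtable maintenance, not the size of the data, is where the mass-operation overhead comes from.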