 

Archives of the TeradataForum

Message Posted: Fri, 21 Feb 2003 @ 21:59:40 GMT


     


Subj:   Re: Can it be any faster
 
From:   Fred W Pluebell

Doug Drake wrote:

  3.) Dup row check - related to 1 above, if you have many duplicate values of the PI and the table is Multiset a dup row check will be performed. The higher degree of duplication will cause dup row checks to rise exponentially. You can disable this check by using a SET table but be aware that duplicates will be accepted  



A few corrections on this point:

a) SET tables require the dup row check; MULTISET tables allow duplicates, so no check is performed

b) FastLoad never loads duplicate rows (even if the table is MULTISET)

c) But FastLoad uses a local pre-sort to eliminate duplicate rows, so even with a skewed PI / RowHash distribution, the impact of completely duplicate rows is seldom a major factor.

d) On the other hand, logging of rows with duplicate UPI values (but not 100% duplicate content) to the error table can have a major impact.
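
To make points a) through d) concrete, here is a rough Python sketch. It is only a toy model and not Teradata code: the names `ToyTable` and `toy_fastload` are invented for illustration, rows are plain tuples of (PI value, payload), and the assumption that a SET insert compares the new row against every stored row with the same PI value is a simplification of what happens per RowHash on an AMP.

```python
from collections import defaultdict


class ToyTable:
    """Toy model (hypothetical): rows grouped by primary index (PI) value."""

    def __init__(self, kind):
        if kind not in ("SET", "MULTISET"):
            raise ValueError("kind must be 'SET' or 'MULTISET'")
        self.kind = kind
        self.rows_by_pi = defaultdict(list)  # PI value -> rows stored under it
        self.comparisons = 0                 # full-row comparisons performed

    def insert(self, pi_value, payload):
        """Insert one row; return True if stored, False if rejected as a duplicate."""
        row = (pi_value, payload)
        bucket = self.rows_by_pi[pi_value]
        if self.kind == "SET":
            # Dup row check: compare the new row with each row sharing its PI value.
            for existing in bucket:
                self.comparisons += 1
                if existing == row:
                    return False
        # MULTISET skips the check entirely and simply stores the row.
        bucket.append(row)
        return True


def toy_fastload(records, unique_pi):
    """Toy FastLoad (hypothetical): sort, drop full duplicates, log UPI violations."""
    loaded, error_rows, seen_pi = [], [], set()
    previous = None
    for row in sorted(records):
        if row == previous:
            continue  # 100% duplicate row: discarded by the sort step, not logged
        previous = row
        if unique_pi and row[0] in seen_pi:
            error_rows.append(row)  # same UPI value, different content: logged
            continue
        seen_pi.add(row[0])
        loaded.append(row)
    return loaded, error_rows


if __name__ == "__main__":
    set_table, multiset_table = ToyTable("SET"), ToyTable("MULTISET")
    for i in range(100):  # 100 distinct rows, all with the same PI value
        set_table.insert(1, i)
        multiset_table.insert(1, i)
    print(set_table.comparisons, multiset_table.comparisons)
```

In this model, 100 distinct rows sharing one PI value cost the SET table 0 + 1 + ... + 99 = 4950 row comparisons and the MULTISET table none, which is the direction of point a). The `toy_fastload` function mirrors b) through d): full duplicates disappear cheaply after the sort, while rows that collide only on the UPI value take the separate, more expensive error-logging path.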



     
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023