Archives of the TeradataForum

Message Posted: Wed, 24 Jan 2001 @ 17:17:34 GMT


     


Subj:   Re: FastLoad: Discarding Duplicate Rows
 
From:   Michael Larkins

Hi Robin:

At first glance your question makes sense. However, if you look a little deeper, FastLoad gets rid of duplicate rows as a way to allow the easiest possible restart.

When duplicate rows are encountered in the sort for Phase 2, FastLoad does not know whether there was a restart or not. So, if it sees duplicate rows, it "assumes" that they were sent there twice: once on the initial run and again on the restart run.

This is necessary because the checkpoint is taken at the end of a load block, and a failure hardly ever happens exactly at a checkpoint. So, there is an excellent opportunity for some number of rows (those sent after the last checkpoint but before the failure) to be sent to Teradata twice. Thus, no dups allowed.
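
To make the restart window concrete, here is a toy Python model of it. This is only an illustration of the idea, not how FastLoad is actually implemented; the function name and the parameters checkpoint_every and fail_after are made up for the example.

```python
def send_with_restart(rows, checkpoint_every, fail_after):
    """Toy model: send rows, fail once, then restart from the last checkpoint.

    Returns every row the target received, including rows sent twice.
    """
    received = []
    last_checkpoint = 0  # index of the first row NOT covered by a checkpoint
    for i, row in enumerate(rows):
        if i == fail_after:
            break  # simulated failure before row i is sent
        received.append(row)
        if (i + 1) % checkpoint_every == 0:
            last_checkpoint = i + 1  # checkpoint at the end of a block
    # Restart: resend everything after the last checkpoint.
    received.extend(rows[last_checkpoint:])
    return received


rows = ["r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7"]
received = send_with_restart(rows, checkpoint_every=3, fail_after=5)

# r3 and r4 were sent after the checkpoint at row 3 and again on restart.
# Dropping duplicates (keeping first occurrences) recovers the input.
deduped = list(dict.fromkeys(received))
```

In this model the only way to get back to the original data is to discard duplicates, which also means a row that was genuinely duplicated in the input could not be told apart from a resent one.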

If you want to load duplicates, then MultiLoad is your tool. It attaches a sequence number to every row. So, it can tell true dups from dups caused by a restart.
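
The sequence number idea can be sketched the same way. Again, this is a toy Python model under my own assumptions, not MultiLoad's real mechanism; load_with_sequence and the sample data are hypothetical.

```python
def load_with_sequence(received):
    """Keep one copy of each sequence number, whatever the row content is."""
    seen = set()
    kept = []
    for seq, row in received:
        if seq in seen:
            continue  # same sequence number: a resend caused by the restart
        seen.add(seq)
        kept.append(row)
    return kept


source = ["a", "b", "b", "c"]      # the two "b" rows are true duplicates
tagged = list(enumerate(source))   # attach a sequence number to every row

# Simulated failure after sequence numbers 0..2 were sent,
# then a restart that resends from sequence number 1.
received = tagged[:3] + tagged[1:]

kept = load_with_sequence(received)
```

Because the comparison is on the sequence number and not on the row content, both "b" rows survive while the resent copies are dropped.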

Hope this helps,



     
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023