Archives of the TeradataForum

Message Posted: Wed, 24 Jan 2001 @ 17:17:34 GMT



Subj:		Re: FastLoad: Discarding Duplicate Rows

From:		Michael Larkins

Hi Robin:

At first glance your question makes sense. However, if you look a little deeper, FastLoad gets rid of duplicate rows as a way to allow the easiest possible restart.

When duplicate rows are encountered in the sort for Phase 2, FastLoad does not know whether there was a restart or not. So, if it sees duplicate rows, it "assumes" that they were sent there twice: once on the initial run and again on the restart run.

This is necessary because the checkpoint is taken at the end of a load block, and the error hardly ever happens exactly at a checkpoint. So there is an excellent opportunity for some number of rows to be sent to Teradata twice. Thus, no dups allowed.
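
To make the restart scenario concrete, here is a toy simulation in Python. It is only a sketch of the idea, not FastLoad itself: the function names, the block size, and the failure point are all made up for illustration, and the set-based filter merely stands in for what the Phase 2 sort does.

```python
def send_with_restart(rows, block_size, fail_after):
    """Hypothetical client: sends rows, fails once, restarts from the
    last checkpoint. A checkpoint is taken after each full block."""
    received = []      # what the "database" has been sent so far
    checkpoint = 0     # index of the first row not covered by a checkpoint

    # Initial run: the job dies after `fail_after` rows have been sent.
    for i, row in enumerate(rows):
        if i == fail_after:
            break
        received.append(row)
        if (i + 1) % block_size == 0:
            checkpoint = i + 1

    # Restart run: resume from the last checkpoint, so any rows sent
    # after that checkpoint go out a second time.
    received.extend(rows[checkpoint:])
    return received


def discard_duplicates(received):
    """Stand-in for the Phase 2 behaviour: keep one copy of each row."""
    seen = set()
    unique = []
    for row in received:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


rows = [("A", 1), ("B", 2), ("C", 3), ("D", 4), ("E", 5)]
received = send_with_restart(rows, block_size=2, fail_after=3)
loaded = discard_duplicates(received)
```

In this run the failure comes one row past the checkpoint, so that row is sent twice and the duplicate filter quietly repairs it. The cost is that a genuine duplicate in the input would be dropped the same way, because nothing in the row says which kind of duplicate it is.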

If you want to load duplicates, then MultiLoad is your tool. It attaches a sequence number to every row, so it can tell true dups from dups caused by a restart.
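
A rough Python sketch of why a sequence number helps. This illustrates the principle only and is not a description of MultiLoad's internals: the function names, block size, and failure point are hypothetical.

```python
def send_numbered_with_restart(rows, block_size, fail_after):
    """Hypothetical client: tags each input row with a sequence number,
    fails once, then restarts from the last checkpoint."""
    numbered = list(enumerate(rows, start=1))   # (sequence number, row)
    received = []
    checkpoint = 0

    # Initial run: dies after `fail_after` rows have been sent.
    for i, item in enumerate(numbered):
        if i == fail_after:
            break
        received.append(item)
        if (i + 1) % block_size == 0:
            checkpoint = i + 1

    # Restart run: re-sent rows carry the same sequence number as before.
    received.extend(numbered[checkpoint:])
    return received


def drop_restart_duplicates(received):
    """Drop a row only if its sequence number was already seen."""
    seen_seq = set()
    kept = []
    for seq, row in received:
        if seq not in seen_seq:
            seen_seq.add(seq)
            kept.append(row)
    return kept


# The input contains a true duplicate: ("B", 2) appears twice.
rows = [("A", 1), ("B", 2), ("B", 2), ("C", 3)]
received = send_numbered_with_restart(rows, block_size=2, fail_after=3)
loaded = drop_restart_duplicates(received)
```

Here the third row is sent twice because of the restart, but both copies carry the same sequence number, so only one is kept. The two ("B", 2) rows from the input have different sequence numbers, so both survive.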

Hope this helps,



Last Modified: 15 Jun 2023