Archives of the TeradataForum

Message Posted: Wed, 19 Apr 2006 @ 14:47:56 GMT


     


Subj:   Fastload performance depends on size of
 
From:   Christian Schiefer, makeITdone

Hi,

I am currently facing an interesting situation:

1.) A conversion program prepares the data by converting it into fastload-readable records. It is written in C and takes a few seconds to convert one day's data.

2.) The C program writes each record (about 300 bytes) with an "fprintf".

3.) The output of this program is redirected into fastload via a pipe.

4.) Performance is very poor.


If I gzip the source data first and use an access module in fastload that handles gzip files, I get about a 400% performance improvement.

It sounds strange: I already have my data, but if I gzip it into a Unix pipe, it loads 4 times faster than without gzip.

I suspect that fastload wants the data in 1 MB parcels to be loaded in one go, which is what the access module provides, and so it performs nicely.

cat SOURCE | convert | fastload

vs.

cat SOURCE | convert | gzip | fastload with gzip-access module

The load times of the two are in a ratio of about 4:1.

What the hell is fastload doing if you deliver only a couple of bytes at a time via a pipe? To my mind it is wasting a lot of time...

Does the C program that does the conversion have to deliver data in 1 MB parcels instead of one record (300 bytes) after another?

Any ideas?
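
One way to test the parcel-size suspicion on the producer side is to give the converter's output stream a large, fully buffered stdio buffer, so that many fprintf calls accumulate and reach the pipe in big writes. The following is only a minimal sketch, not the actual conversion program: enable_block_output, write_records, the dummy record layout, and the 1 MiB size are assumptions made for illustration, and whether fastload really loads faster with larger writes is not confirmed here.

```c
#include <assert.h>
#include <stdio.h>

/* 1 MiB, mirroring the parcel size suspected in the post (an assumption). */
#define OUT_BUF_SIZE (1024 * 1024)

/* Static storage so the buffer outlives the stream, including the
 * implicit flush that happens at program exit. */
static char out_buf[OUT_BUF_SIZE];

/* Switch `out` to full buffering with the large buffer above.
 * Must be called before any other I/O on the stream.
 * Returns 0 on success, nonzero on failure (same as setvbuf). */
static int enable_block_output(FILE *out)
{
    return setvbuf(out, out_buf, _IOFBF, sizeof out_buf);
}

/* Hypothetical stand-in for the converter's output loop: writes `count`
 * dummy records of exactly `rec_len` bytes each (zero-padded counter plus
 * a newline), one fprintf per record as in the post.
 * Returns the number of records written without error. */
static size_t write_records(FILE *out, size_t count, size_t rec_len)
{
    size_t i;

    assert(rec_len >= 2 && rec_len <= 4096);
    for (i = 0; i < count; i++) {
        if (fprintf(out, "%0*zu\n", (int)(rec_len - 1), i) < 0)
            return i;
    }
    return count;
}
```

In a real converter, enable_block_output(stdout) would be the first thing done in main, before any record is written. Note that stdio normally already buffers stdout when it is a pipe rather than a terminal, only with a much smaller default buffer, so this changes the size of the writes and not the per-record fprintf logic.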


Feedback welcome

Christian

--------------------------------
makeITdone IT Services
D.I. Christian Schiefer


P.S.: I know how to fix this, but I would really like to know how fastload treats its input data.



     
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023