Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Tue, 09 Aug 2005 @ 09:39:50 GMT


     
  <Prev Next>   <<First <Prev
Next>
Last>>
 


Subj:   Re: Experiences with ETL Tools
 
From:   Victor Sokovin

  I'd like feedback from the people who actually had to implement the tool, that is, technical people. And if possible, let me refine my search to those that are experts using the Teradata Load Tools (i.e. fastload, multiload, tpump, fastexport, bteq).  


Good set of questions. Your preamb did seem to touch a few social layers which might be interesting to discuss one day because they seem to be as important on a DWH project as the technical issues. But let's stay techie today.


  A) Is it true that ETL tools actually meet all the ETL development needs.  


This is a matter of definitions. To qualify a candidate ETL tool should offer you a couple of ways of extracting data from a number of supported sources. Flat files and a few popular databases should be supported at the minimum. In the transformation department the tool should provide for filtering of data, basic aggregations and (de)normalizations. The loading part should support native bulk utilities. Usually they throw in the ODBC support as well. A user is not supposed to know much about the internal working of the tool and underlying databases to design simple ETL jobs.

If the tool has these features it may hit the market under the name of "ETL tool". Of course, what the market finds of the newcomer is another question. The market seems to be saturated, it is the buyers' market, so some picky customers might expect scheduling and monitoring modules as well as integrated data analysis functionality thrown in as well. Others might be interested in advanced performance management features.


  B) In case the answer to the previous question is NO, do they help with the difficult transformations ?  


What is your definition of a "difficult transformation"? If it involves a lot of procedural logic on the data, then none of the tools I am aware of will provide it on the "button clicking" level. Most of the tools will allow you to integrate such transformations if you write them in their scripting language (if they have it) or any external procedural language. Of course the level of "integration" may accordingly vary. You cannot expect the same level of debugging and job management comfort as with the "button level" features.

A typical "bulk" transformation can be split into a number of simple transformations I mentioned above (and perhaps a few I forget to mention). That is guaranteed in principle by the underlying mathematics. But while evaluating ETL tools you might ask yourself how much time you as user need to spend on implementing your average "complex transformation". The effort may vary between the tools.


  C) ETL tools actually use the parallel features of Teradata extensively?  


Most of the tools on the market were initially designed with other vendors in mind. They would at best be able to pass on your request to TD but the optimization is in your hands. Check whether the tool does not actually stand in the way of such optimization.

Others use the "thick tier" approach: they tend to thicken the tier where the tool resides by doing some operations which you would normally do on the database level. For example, if you need a GROUP BY kind of transformation, the tool might just suck the raw data in its own memory and perform the aggregation there. If the tool does that I for one would hesitate to stick a "uses the parallel features of Teradata extensively" label on it.


Regards,

Victor



     
  <Prev Next>   <<First <Prev
Next>
Last>>
 
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023