Archives of the TeradataForum
Message Posted: Tue, 09 Aug 2005 @ 09:39:50 GMT
Subj: Re: Experiences with ETL Tools
From: Victor Sokovin
| I'd like feedback from the people who actually had to implement the tool, that is, technical people. And if possible, let me refine my
search to those that are experts using the Teradata Load Tools (i.e. fastload, multiload, tpump, fastexport, bteq). | |
Good set of questions. Your preamble did touch on a few social layers which might be interesting to discuss one day, because they seem to be
as important on a DWH project as the technical issues. But let's stay techie today.
| A) Is it true that ETL tools actually meet all the ETL development needs?
This is a matter of definitions. To qualify, a candidate ETL tool should offer you a couple of ways of extracting data from a number of
supported sources; flat files and a few popular databases should be supported at a minimum. In the transformation department the tool should
provide for filtering of data, basic aggregations and (de)normalizations. The loading part should support the native bulk utilities. Usually
ODBC support is thrown in as well. A user is not supposed to need much knowledge of the internal workings of the tool and the underlying
databases to design simple ETL jobs.
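To make the baseline concrete, here is a minimal sketch of those capabilities in Python: extract from a flat file, transform (filter plus a basic aggregation), then hand the result to a loader. All file, column, and function names here are hypothetical illustrations, not any particular tool's API; a real job would end in a native bulk utility such as FastLoad rather than a Python function.

```python
# Minimal extract/transform/load sketch (all names hypothetical).
import csv
import io
from collections import defaultdict

def extract(flat_file):
    """Extract: read rows from a delimited flat file."""
    return list(csv.DictReader(flat_file))

def transform(rows):
    """Transform: filter out rows with no region, aggregate amount per region."""
    totals = defaultdict(float)
    for row in rows:
        if row["region"]:                      # basic filtering
            totals[row["region"]] += float(row["amount"])
    return totals

def load(totals):
    """Load: a real tool would invoke a native bulk utility here
    (e.g. FastLoad); this sketch just returns the rows to be loaded."""
    return sorted(totals.items())

raw = io.StringIO("region,amount\nEMEA,10.0\nEMEA,5.0\nAPAC,7.5\n,999\n")
print(load(transform(extract(raw))))   # [('APAC', 7.5), ('EMEA', 15.0)]
```

The point of the sketch is only to show how little a user needs to know about the underlying database for a simple job of this shape.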
If the tool has these features it may hit the market under the name of "ETL tool". Of course, what the market makes of the newcomer is another
question. The market seems saturated; it is a buyers' market, so some picky customers might expect scheduling and monitoring modules, or even
integrated data analysis functionality, to be thrown in as well. Others might be interested in advanced performance management features.
| B) In case the answer to the previous question is NO, do they help with the difficult transformations?
What is your definition of a "difficult transformation"? If it involves a lot of procedural logic on the data, then none of the tools I am
aware of will provide it at the "button clicking" level. Most of the tools will allow you to integrate such transformations if you write them in
their scripting language (if they have one) or in an external procedural language. Of course, the level of "integration" may vary accordingly:
you cannot expect the same level of debugging and job management comfort as with the "button level" features.
A typical "bulk" transformation can be split into a number of the simple transformations I mentioned above (and perhaps a few I forgot to
mention); that is guaranteed in principle by the underlying mathematics. But while evaluating ETL tools you might ask yourself how much time
you, as a user, need to spend on implementing your average "complex transformation". The effort may vary between the tools.
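The decomposition argument above can be sketched as code: a "complex" transformation expressed as a pipeline of simple primitives (a filter, a projection, an aggregation). The step names and sample data are hypothetical; the shape is what matters.

```python
# A "complex" transformation built by composing simple ones (names hypothetical).
from functools import reduce

def pipeline(*steps):
    """Compose simple transformations, applied left to right."""
    return lambda rows: reduce(lambda acc, step: step(acc), steps, rows)

# Three simple primitives: filter, projection, aggregation.
only_paid = lambda rows: [r for r in rows if r["status"] == "paid"]
project   = lambda rows: [{"cust": r["cust"], "amt": r["amt"]} for r in rows]

def total_per_cust(rows):
    out = {}
    for r in rows:
        out[r["cust"]] = out.get(r["cust"], 0) + r["amt"]
    return out

complex_transform = pipeline(only_paid, project, total_per_cust)

orders = [{"cust": "a", "amt": 3, "status": "paid"},
          {"cust": "a", "amt": 4, "status": "open"},
          {"cust": "b", "amt": 5, "status": "paid"}]
print(complex_transform(orders))   # {'a': 3, 'b': 5}
```

When you evaluate a tool, the question is how much clicking, scripting, or external coding each of those small steps costs you.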
| C) Do ETL tools actually use the parallel features of Teradata extensively?
Most of the tools on the market were initially designed with other database vendors in mind. At best they will pass your request on to Teradata,
but the optimization is in your hands. Check that the tool does not actually stand in the way of such optimization.
Others use a "thick tier" approach: they tend to thicken the tier where the tool resides by doing operations there which you would normally do
at the database level. For example, if you need a GROUP BY kind of transformation, the tool might just suck the raw data into its own memory and
perform the aggregation there. If the tool does that, I for one would hesitate to stick a "uses the parallel features of Teradata extensively"
label on it.
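The contrast is easy to sketch. Below, SQLite stands in for the database (table name and data are hypothetical): the "thick tier" variant drags every raw row into the tool's memory and aggregates there, while the pushed-down variant sends a GROUP BY to the database and gets back only the grouped result. Both return the same answer; only the pushed-down one lets the database, and hence its parallelism, do the work.

```python
# Thick-tier vs pushed-down aggregation (SQLite as a stand-in database).
import sqlite3
from collections import defaultdict

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (region TEXT, amount REAL)")
db.executemany("INSERT INTO sales VALUES (?, ?)",
               [("EMEA", 10.0), ("EMEA", 5.0), ("APAC", 7.5)])

def thick_tier(conn):
    """Tool-side aggregation: every raw row crosses the wire."""
    totals = defaultdict(float)
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] += amount
    return dict(totals)

def pushed_down(conn):
    """Database-side aggregation: only the grouped result comes back."""
    return dict(conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"))

print(thick_tier(db) == pushed_down(db))   # True: same answer, very
                                           # different use of the database
```

On a three-row toy table the difference is invisible; on a Teradata-sized table the thick-tier version turns the parallel database into a dumb row source.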
Regards,
Victor