Archives of the TeradataForum
Message Posted: Tue, 30 Mar 2004 @ 14:46:47 GMT
Subj: Re: Teradata Performance issues when loading data to Teradata using ETL tools like Informatica or Data Stage
From: Howard, Jack W
Three things strike me:
1) ETL tools do not absolve you from understanding the attributes of the load method. Each tool has a documented set of strengths
and limitations. I'd start the topic as "programmer proficiency using ETL tools" then move to Teradata or ETL tool performance.
2) We have generally experienced excellent throughput with Informatica 6.1 & Teradata V2R5 because we carefully consider the above
item. The performance is quite predictable; when it falls outside the predicted range, we know to look for something specific, such as adding
an index to support a lookup.
3) The Teradata Solaris ODBC driver reminds me of a circa-1996 ODBC driver. ODBC moved from an idea to a viable approach precisely
because the specification provided for an improved method of bulk loading. The principal evidence that the Solaris ODBC driver is inefficient is that we
average 10K rows/minute using ODBC and about 100K rows/minute using TPump. Since both ODBC and TPump are statement-based loading methods, we
should not see such a wide divergence in performance. A well-written ODBC interface should be within 90% of the performance of TPump.
ETL tools provide increased functionality that is completely dependent on a quality ODBC interface. In our environment we'd love to use
Informatica's constraint-based loading feature more often, but the pathetic ODBC driver makes it impossible. Constraint-based loading means the
tool resolves the loading sequence requirements imposed by referential integrity - handy, and exactly why you want to use an ETL tool.
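The idea behind constraint-based loading can be sketched as a topological sort over foreign-key dependencies: parent tables must be loaded before the children that reference them. A minimal illustration in Python (the table names and FK relationships below are hypothetical, not from the post):

```python
from graphlib import TopologicalSorter

# Hypothetical schema: each table maps to the set of parent tables it
# references via foreign keys. RI requires parents be loaded first.
fk_parents = {
    "customer": set(),
    "account": {"customer"},
    "transaction": {"account", "customer"},
}

# A topological sort yields parents before children, producing a load
# order that never violates a referential-integrity constraint.
load_order = list(TopologicalSorter(fk_parents).static_order())
print(load_order)  # customer before account before transaction
```

An ETL tool does the same resolution automatically from the target database's declared constraints, which is exactly the kind of bookkeeping you buy the tool to avoid doing by hand.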