Archives of the TeradataForum
Message Posted: Wed, 19 Jun 2002 @ 17:28:50 GMT
Subj: Re: Testing Methodology
From: Jose Lora
Hi Thomas.
Some of your reasons to move ETL away from the Teradata system are not really valid, for example:
| - batch windows get smaller as the value of the warehouse grows
It is always possible, and advisable, to design your data warehouse without batch windows as a requirement. In fact, NCR is moving
in that direction with the Active Data Warehouse concept and the improvements in the loading tools. Your design will also be affected by
the users' data availability requirements.
| - at the same time data volumes and business logic complexity grows
This reason affects both ETL strategies, and I would say that an out-of-Teradata strategy is affected more: more business
logic can mean more lookup transformations, and bigger data volumes will involve expensive non-parallel I/O to move the data required for
the transformation outside Teradata.
| - I purchased the database for the user and not for the ETL process
This is a good point, and the best solution is to make the users part of the ETL process and help them understand the benefits of this
approach. In my experience, users are very happy to have (temporary) access to the original data in the database to validate the
changing business rules.
| - a separate ETL server is cheaper to expand than a Teradata system
That's true; however, an increase in CPU power for your ETL process on Teradata will also increase the CPU power for the business process. On the
other hand, an ETL server that only works during batch hours is not a very good use of that box and does not improve user response
time.
| - there are high performance tools that transform extremely fast, moving more data than I have typically seen
Sure there are, and after testing some of them I couldn't find a match for the inherent parallelism of Teradata (for large data volumes).
However, there is always a good reason to do some transformations outside Teradata (number or date validation is not easy on Teradata), and
in this case I would prefer using INMOD routines, Access Modules, or tools like Genio (Hummingbird) that can stream data directly to
FastLoad (doing all sorts of simple row-based transformations on the fly), to avoid the expensive non-parallel I/O.
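As a rough illustration of the kind of simple row-based validation meant here, the following Python sketch checks a number and a date on each row as it streams through, so nothing has to be staged on disk before the load. It is not an actual INMOD routine or Access Module and uses no Teradata API. The three-field layout, the "|" delimiter, the date format, and the names clean_row and stream are all assumptions made for the example.

```python
import sys
from datetime import datetime
from decimal import Decimal, InvalidOperation

DELIMITER = "|"           # assumed field separator
DATE_FORMAT = "%Y-%m-%d"  # assumed input and output date format


def clean_row(line):
    """Return the normalized row, or None if number/date validation fails.

    Assumed (hypothetical) layout: customer_id | amount | order_date
    """
    fields = line.rstrip("\n").split(DELIMITER)
    if len(fields) != 3:
        return None
    cust_id, amount, order_date = (f.strip() for f in fields)
    try:
        amount_value = Decimal(amount)                           # number validation
        date_value = datetime.strptime(order_date, DATE_FORMAT)  # date validation
    except (InvalidOperation, ValueError):
        return None
    if not amount_value.is_finite():  # reject NaN and Infinity
        return None
    return DELIMITER.join(
        [cust_id, f"{amount_value:.2f}", date_value.strftime(DATE_FORMAT)]
    )


def stream(source, good, bad):
    """Copy valid rows to `good` and rejected rows, unchanged, to `bad`."""
    for line in source:
        row = clean_row(line)
        if row is None:
            bad.write(line)
        else:
            good.write(row + "\n")


if __name__ == "__main__":
    # Valid rows go to stdout, rejects to stderr, one row at a time.
    stream(sys.stdin, sys.stdout, sys.stderr)
```

How the cleaned stream is handed to the load utility (a pipe, an access module, or a tool that does this step itself) depends on the installation and is not shown here.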
Regards.
Jose Lora
Systems Architect
Meredith Corporation