Archives of the TeradataForum
Message Posted: Wed, 05 May 2004 @ 09:18:11 GMT
Subj: Re: ETL: Push vs Pull
From: Anomy Anom
<-- Anonymously Posted: Tuesday, May 04, 2004 22:24 -->
| One problem I've seen with the PULL strategy occurs when the source database is being changed frequently: since the source DB developers
don't "own" the extraction process, they may not think about the impact of their changes on it and may not communicate change information to the
DW side.
We have seen this too. Lots of hours have been spent running around analyzing incoming data and fixing code.
| An approach which is working well for us is a "PUSH" extraction which automatically generates control files describing the data files
(record count, field names and formats). The DW loading process can check these against expected values and immediately catch any unanticipated
format changes.
We recently implemented this with a 3rd party, and it works pretty well. We export data to them, with the metadata indicated above. They use
our metadata to check our feed.
Then, they turn around and send the data (with some changes) back to us, this time with their own metadata. Our loading process then runs a
check against their metadata.
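To make the control-file check concrete, here is a minimal sketch in Python. It assumes a comma-delimited data file with a header row; the function names (build_control, check_feed) and the dictionary layout of the control record are made up for illustration and are not what we or the 3rd party actually run.

```python
import csv
import io


def build_control(data_text, field_formats):
    """PUSH side: describe an extract (record count, field names, formats).

    data_text     -- the delimited extract as a string (header row assumed)
    field_formats -- hypothetical mapping of field name -> format string
    """
    reader = csv.reader(io.StringIO(data_text))
    header = next(reader)
    record_count = sum(1 for _ in reader)
    return {
        "record_count": record_count,
        "fields": [{"name": n, "format": field_formats[n]} for n in header],
    }


def check_feed(data_text, control, expected_fields):
    """Load side: compare the data and its control record to expectations.

    expected_fields -- list of (name, format) tuples the loader was built for
    Returns a list of problems; an empty list means the feed looks as expected.
    """
    problems = []
    reader = csv.reader(io.StringIO(data_text))
    header = next(reader)
    actual_count = sum(1 for _ in reader)

    # Record count in the data must match what the sender declared.
    if actual_count != control["record_count"]:
        problems.append(
            "record count: control says %d, data has %d"
            % (control["record_count"], actual_count)
        )

    declared = [(f["name"], f["format"]) for f in control["fields"]]

    # Header row must agree with the declared field names.
    if header != [name for name, _ in declared]:
        problems.append("header row does not match control file field names")

    # Declared layout must match what the loader expects.
    if declared != list(expected_fields):
        problems.append("field names/formats differ from expected layout")

    return problems
```

The sender would call build_control when writing the extract and ship the result alongside the data; the receiver calls check_feed before loading and stops the load if the returned list is not empty.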
One possible extension of this process is that the DW loading process could automatically detect any format changes via the source metadata
and ALTER the target model accordingly.
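As a rough sketch of that idea (not something we have built), the loader could diff the incoming field list against the current target layout and generate DDL for the simple cases. Everything here is an assumption: the function name plan_alters, the name -> format dictionaries, and the choice to auto-apply only new columns while sending type changes and dropped fields to a person for review. The generated ALTER TABLE text is illustrative and would need checking against the target database's own syntax and rules.

```python
def plan_alters(table, current, incoming):
    """Compare target layout with source metadata.

    table    -- target table name (hypothetical, e.g. "dw.customer")
    current  -- dict of column name -> format in the target today
    incoming -- dict of field name -> format from the source control file
    Returns (statements, needs_review):
      statements   -- ALTER TABLE strings for columns new in the source
      needs_review -- descriptions of changes not safe to apply blindly
    """
    statements = []
    needs_review = []

    for name, fmt in incoming.items():
        if name not in current:
            # New field in the source: add a matching column.
            statements.append("ALTER TABLE %s ADD %s %s;" % (table, name, fmt))
        elif current[name] != fmt:
            # Format changed: flag it rather than guess at a conversion.
            needs_review.append("%s: %s -> %s" % (name, current[name], fmt))

    for name in current:
        if name not in incoming:
            # Field disappeared from the source: never drop automatically.
            needs_review.append("%s: missing from source" % name)

    return statements, needs_review
```

Keeping the risky changes out of the automatic path means a surprise in the source still gets caught by a person, which was the point of the metadata check in the first place.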
-Anomy