Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Wed, 05 May 2004 @ 11:39:29 GMT


     
 


Subj:   Re: ETL: Push vs Pull
 
From:   Meade, Cornelius

With regard to pulling metadata along with the data and using it to dynamically modify or build the target model: it is possible, but that path still has its potholes. If the data being pulled feeds additional ETL processes, then the required changes may be quite complex and difficult to generate or predict "automatically". Just think what you would need to do to dynamically generate a conversion process for data already sourced from a particular table if that table suddenly changed in some material way. Likewise, imagine the confusion of the users of such an environment when objects suddenly change without prior notice. The point here is that you will still need to manage the way change is introduced from the source environment(s) into the target environment, regardless of the ETL strategy.

That said, for certain situations this approach can be workable. We happen to have just such a situation, in which the target database is essentially a direct clone (i.e. 1-1) of the one being sourced. In this case the source system is a large one, largely in 3rd normal form, with hundreds of discrete tables, not counting relationships, indexes, etc. We pull metadata about each table's structure/definition at the same time that its data is extracted, and this metadata is used to dynamically build the extract script, the target table DDL/DML, and the load script for the table in question. By avoiding the historical data question and completely rebuilding each table in full on each extract cycle, we avoid most of the potentially complex/daunting ETL scenarios, and the resulting process is manageable and handles most changes to the source system(s) with minimal intervention. In this case that means we don't have to spend a lot of time reworking ETL processes, but we still have to spend about the same amount of time communicating with the owners of the source systems and understanding what changes are taking place in their environments. Likewise, this doesn't significantly lessen the amount of time spent making sure our users are prepared for the changes that will appear in the target environment they use.
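
To make the idea concrete, here is a rough sketch of what "building the scripts from the metadata" could look like for a 1-1 clone that is rebuilt in full each cycle. It is not our actual tooling: the Column class, the function names, and the src/tgt table names are all made up for illustration, and a real setup would read the column list from the source system's catalog and emit scripts for whatever extract and load utilities are in use.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Column:
    """One column of a source table, as captured at extract time (hypothetical shape)."""
    name: str
    sql_type: str
    nullable: bool = True


def build_ddl(table: str, columns: List[Column]) -> str:
    """Generate the target CREATE TABLE statement from the pulled metadata."""
    lines = []
    for col in columns:
        null_clause = "" if col.nullable else " NOT NULL"
        lines.append(f"    {col.name} {col.sql_type}{null_clause}")
    return f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n);"


def build_extract(table: str, columns: List[Column]) -> str:
    """Generate the extract query; naming columns explicitly keeps their order fixed."""
    names = ", ".join(col.name for col in columns)
    return f"SELECT {names} FROM {table};"


def build_load(table: str, columns: List[Column]) -> str:
    """Generate a parameterized insert with one placeholder per column."""
    names = ", ".join(col.name for col in columns)
    marks = ", ".join("?" for _ in columns)
    return f"INSERT INTO {table} ({names}) VALUES ({marks});"


def build_refresh_plan(source: str, target: str, columns: List[Column]) -> Dict[str, str]:
    """Full rebuild each cycle: drop, recreate, extract, load. No history is kept."""
    if not columns:
        raise ValueError(f"no column metadata captured for {source}")
    return {
        "drop": f"DROP TABLE {target};",
        "create": build_ddl(target, columns),
        "extract": build_extract(source, columns),
        "load": build_load(target, columns),
    }


# Example metadata for one table (made-up values).
customer_columns = [
    Column("cust_id", "INTEGER", nullable=False),
    Column("cust_name", "VARCHAR(100)"),
]
plan = build_refresh_plan("src.customer", "tgt.customer", customer_columns)
```

Because every statement is derived from the same column list, a column added or retyped in the source shows up in the target on the next cycle without anyone editing a script. That is exactly why the communication with source owners and users still has to happen outside the ETL process.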



     
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023