Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Tue, 11 Nov 2014 @ 21:53:12 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: TABLE Function
 
From:   McCall, Glenn David

In short yes, you could use this to load data from one system to another. I have built such a Table UDF (I call it DbLinks) and that is exactly how we are using it. It is not commercially available right now but I've tested it against TD to TD transfers and Oracle to TD transfers and works quite well.

If there is enough interest I may put it on developer Exchange.

In answer to your other questions.

Q) If so then, assuming it from taking data from the Prod system down to the Test system, where would the resource be taken up - in the production machine or the test machine ? Or both possibly ?

Both. And to do it properly, you will need an intermediate server.

The architecture is as follows:

1) The table UDF runs on all AMP's

2) Each of the AMP's contact the external intermediate server

3) The external server runs the query against the remote system and retrieves the data.

4) Data retrieved from the remote system is "fanned out" to each of the AMP's (one row goes to amp 1, the next to amp 3, the next to amp 2 or whatever order they happen to get their turn).


With this architecture, the load on each of the AMP's is balanced. There is another approach where a single AMP runs the query on the remote system but this will result in severe skewing (one AMP is doing all the work) and would require installation of unsupported third party software (namely ODBC or similar drivers for the foreign DBMS) on each of the TPA nodes - which is not supported by customer services. In my architecture, no unsupported software is required on the third party nodes.

So to answer your question, some TD CPU time is consumed, some remote system time is consumed (to run the query that sources the data) and some time on an intermediate server is consumed.

A) Also, is it a BAD idea, even in principle ?

This will depend upon your viewpoint. Mine is that you have to run a query on a remote system to export the data, you have to land the data on a load server then you have to load that data from the load server to Teradata - so what's the real difference? Ah, you might say reading from the file is faster than reading from the LAN. Maybe, but you still have to transfer over the LAN to Teradata and export the data over the LAN in the first place.

Granted there will be some "dead time" for the AMP's while the remote query runs, but unless your system is close to being in flow control when you do this it probably won't be a big problem.

Having said that, I would limit the number of concurrent external data requests that you run to a small number e.g. <= 5.

For whatever it is worth, if you've heard of Query Grid, this is more or less what Query Grid is. But query grid is only available in TD 15 (or is it 15.10?)


Hope this helps

Glenn Mc



     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023