Archives of the TeradataForum

Message Posted: Fri, 30 Mar 2001 @ 17:17:48 GMT



Subj:		Multiload question

From:		Frank Martinez

Hi Pam (and the list friends to whom I am sending this as well),

As I understand it, the code that's generated by inTERAscript is basically doing a redefine on the input record. The record is processed on the mainframe, and the data parcel that's created with the reordered fields is sent to Teradata. I am not sure, but I think this is an in-memory operation, not something that's done with any file manipulation on the 'frame. But you said you saw the value of the field in question twice in the ET record, which makes sense because the ET shows the data parcel that Teradata tried to apply. The overhead should be the extra fields that are transmitted over the channel connect. I'm not quite sure how significant this would be, because I'm not entirely sure of the internals of Multiload processing.
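To make the "redefine" idea concrete, here is a minimal Python sketch of how two .FIELD entries at the same start position could put the same input bytes into the data parcel twice. This is only an illustrative model of the description above, not MultiLoad's actual internals: the build_parcel helper, the 6-byte sample record, and the start position of 3 are all hypothetical, and it assumes DECIMAL(3,0) arrives as a 2-byte packed decimal.

```python
def build_parcel(record: bytes, layout):
    """Concatenate each field's bytes in layout order (hypothetical model)."""
    return b"".join(
        record[start - 1 : start - 1 + length]  # .FIELD positions are 1-based
        for _name, start, length in layout
    )

# Hypothetical 6-byte input record; bytes 3-4 hold packed decimal +123 (0x12 0x3C).
record = b"AB" + bytes([0x12, 0x3C]) + b"YZ"

# Two fields that redefine the same bytes, as in the generated .LAYOUT.
layout = [
    ("IN_ClerkID_NULL", 3, 2),  # CHAR(2) view of bytes 3-4
    ("IN_ClerkID", 3, 2),       # DECIMAL(3,0) view of the same bytes
]

parcel = build_parcel(record, layout)
print(parcel.hex())
```

Under this model the parcel carries the two ClerkID bytes twice, which would match a value showing up twice in HostData.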

But I do know that records written to the ET and UV tables use normal SQL processing, not the special process that Multiload uses to write good records (applying 32K blocks, etc). I had this wondrous revelation when we had a Multiload job that went from running in ½ hour to 6+ hours because the client hadn't bothered to make sure they had new data to run (the UV table was enormous). I would assume that the impact here, other than increased network traffic from bigger data parcels, would only be noticeable if there were a large number of errors.

Anyway, Pam, I'm also sending this out to the Teradata discussion list so that other people can check my thinking. Hope that's of help.

Frank C. Martinez IV

Pam Braunschweig 03/29/01 03:46PM

When I use Multiload and a row is rejected (et_ table), there is a column in the et_ table called HostData. In trying to decipher this column, I found that the HostData is in the order the data was defined in the .LAYOUT section. It is not in the order of the input extract file.

Do you know if the data is reordered before or after it is sent from the mainframe to the Teradata?

The reason I ask is because inTERAscript generates the following .LAYOUT entries for decimal data.

     .FIELD IN_ClerkID_NULL                           32 CHAR( 2);
     .FIELD IN_ClerkID                                32 DECIMAL( 3, 0)
              NULLIF IN_ClerkID_NULL = 'FF'XB ;

We are not using the NULLIF feature and don't really need this type of testing. I left it in because it was too much trouble to remove. Looking at the HostData column in the et_ table, the above code puts the value in HostData twice. If the data is reordered before being transmitted from the host to the Teradata, then there is a lot of redundant data. It won't make much difference for small files. But for tables like ClaimHeader, with 2 million+ rows, it could make a difference.
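A rough way to size the redundancy is rows times the width of each duplicated overlay field. The Python sketch below is a back-of-the-envelope estimate only: the redundant_bytes helper is hypothetical, the number of decimal columns in ClaimHeader is not known here (one and ten are used purely as examples), and parcel headers and any other per-row overhead are ignored.

```python
def redundant_bytes(rows: int, overlay_widths) -> int:
    """Extra bytes sent if every overlay field is transmitted once per row."""
    return rows * sum(overlay_widths)

ROWS = 2_000_000  # "2 million+ rows" from the ClaimHeader example

# One CHAR(2) overlay, like IN_ClerkID_NULL above.
one_field = redundant_bytes(ROWS, [2])

# Hypothetical case: ten decimal columns, each with a CHAR(2) overlay.
ten_fields = redundant_bytes(ROWS, [2] * 10)

print(one_field, ten_fields)
```

So a single overlay adds about 4 MB over the whole load, and the total grows linearly with the number of overlaid fields.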

This is not a big point, don't spend a lot of time researching. I was just curious.

