Archives of the TeradataForum

Message Posted: Fri, 25 Jan 2003 @ 04:29:37 GMT



Subj:		Re: Help on Tpump usage

From:		rmhsmd

Comments embedded.

----- Original Message -----
From: "Sankar Arunachalam"
Sent: Friday, January 24, 2003 7:10 AM
Subject: Help on Tpump usage

We have recently installed TPump on our NT test system and are currently testing it for use in our production loads during the active window, instead of the nightly batch jobs. I would like to hear about your experiences: the dos and don'ts, good ways of deploying it, etc.

Some of my questions already on the lists are:

1. How do you find the critical cutoff volume for using TPump vs. MLoad?

The usual tradeoff is business latency requirements vs. resource consumption. TPump is near real time. MDL resource consumption falls as the block hit rate rises, which requires larger batches. Anita Richards' Partners presentation "A Cross-Comparison of Load Strategies" is an excellent reference for understanding the resource consumption.

2. How do you determine the best sql rate / buffer size?

Maximize the pack to minimize round trips and minimize Teradata, client, and network resource consumption. V2R5 increases columns per request from 507 to 2560. (Concrete steps may impose lower limits).
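To put rough numbers on the pack advice, here is a minimal Python sketch. It assumes the columns-per-request limit is the only constraint on the pack and that every statement carries all columns of the input row. The function names and the 20-column row are made up for illustration, and the real ceiling may be lower, as noted above.

```python
import math


def max_pack(column_limit: int, columns_per_row: int) -> int:
    """Largest pack that fits under a columns-per-request limit (assumed sole constraint)."""
    if columns_per_row < 1:
        raise ValueError("columns_per_row must be at least 1")
    return column_limit // columns_per_row


def round_trips(total_rows: int, pack: int) -> int:
    """Requests needed when `pack` statements travel in each request."""
    if pack < 1:
        raise ValueError("pack must be at least 1")
    return math.ceil(total_rows / pack)


if __name__ == "__main__":
    rows = 1_000_000          # hypothetical input volume
    cols = 20                 # hypothetical columns per input row
    for limit in (507, 2560):  # limits quoted in the reply above
        pack = max_pack(limit, cols)
        print(limit, pack, round_trips(rows, pack))
```

With these made-up inputs the larger limit allows a pack of 128 instead of 25, which cuts the request count by roughly a factor of five.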

3. Is there a way to 'Accept' a variable value from an NT environment variable without using 'ACCEPT var from FILE'? Does anyone know why the Teradata guys took this feature away in TPump while it is available in MLoad?

4. Is there a way to stop TPump from reporting duplicate rows on a unique primary index? (As you know, 'ignore duplicate rows' on insert/update doesn't help.)

Duplicate inserts to a UPI or non-multiset table will fail. In any multi-statement SQL app such as TPump, the entire request fails. The app must 'remove' and log the bad input and resubmit the rest.
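The remove, log and resubmit cycle described above can be sketched generically in Python. This is not TPump's internal code or any Teradata API. `RequestFailed`, `load_pack` and `FakeUpiTable` are hypothetical names, and the sketch assumes the database reports which statement in the pack failed.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Tuple[str, str]  # (primary index value, payload); hypothetical row shape


class RequestFailed(Exception):
    """Hypothetical error: one statement failed, so the whole request was rolled back."""

    def __init__(self, bad_index: int) -> None:
        super().__init__(f"statement {bad_index} failed; request rolled back")
        self.bad_index = bad_index


def load_pack(rows: Sequence[Row],
              submit: Callable[[List[Row]], None]) -> Tuple[List[Row], List[Row]]:
    """Submit a pack; on failure set the offending row aside and resubmit the rest."""
    pending = list(rows)
    rejected: List[Row] = []
    while pending:
        try:
            submit(pending)      # all-or-nothing request
            break
        except RequestFailed as err:
            # Log the bad input row, then retry with what remains.
            rejected.append(pending.pop(err.bad_index))
    return pending, rejected


class FakeUpiTable:
    """In-memory stand-in for a table with a unique primary index."""

    def __init__(self) -> None:
        self.rows: Dict[str, str] = {}

    def submit(self, pack: List[Row]) -> None:
        seen = set()
        for i, (key, _payload) in enumerate(pack):
            if key in self.rows or key in seen:
                raise RequestFailed(i)   # nothing from this pack is applied
            seen.add(key)
        self.rows.update(pack)
```

Each duplicate costs one extra round trip for the whole pack, so input with many duplicates erodes the benefit of a large pack.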

5. Though the documentation says that there is no limit on the number of concurrent TPump jobs that can be run, I experienced difficulties when testing 5 concurrent jobs loading data into the same table. Only 2 or 3 of the 5 jobs seem to run together while the rest wait for the other jobs to complete.

Within a job, TPump 'serialize' avoids contention and preserves order. What is the contention? Why run >1 job?
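As a rough illustration of why serializing on a key avoids contention and keeps order within one job, the Python sketch below sends every row with the same key to the same session queue. It shows the general idea only and is not TPump's actual hashing. `session_for` and `route` are made-up names, and CRC32 is simply a convenient deterministic hash.

```python
import zlib
from typing import List, Sequence, Tuple

Row = Tuple[str, str]  # (key value, payload); hypothetical row shape


def session_for(key: str, sessions: int) -> int:
    """Pick a session deterministically from the key (CRC32 chosen for illustration)."""
    if sessions < 1:
        raise ValueError("sessions must be at least 1")
    return zlib.crc32(key.encode("utf-8")) % sessions


def route(rows: Sequence[Row], sessions: int) -> List[List[Row]]:
    """Spread rows over per-session queues, keeping input order within each queue."""
    queues: List[List[Row]] = [[] for _ in range(sessions)]
    for key, payload in rows:
        queues[session_for(key, sessions)].append((key, payload))
    return queues
```

Rows for one key never sit in two sessions at once, so they cannot block each other and they apply in arrival order. Separate jobs do not share this routing, so two jobs can still contend for the same rows.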

