Re: TPUMP Blocking on multiple sessions [From LeBlanc, James: Tue, 04 Apr 2006 @ 20:57 GMT]

Message Posted: Tue, 04 Apr 2006 @ 20:57:59 GMT

Subj:

Re: TPUMP Blocking on multiple sessions

From:

LeBlanc, James

On separate thread and a related topic I received these excerpts from our documentation. I am offering these excerpts to help clarify how and why TPump can get into locking conditions based on database design. I would like to avoid taking responsibility for this content. Know that this material comes to me from best sources.

Chapters 9 and 10 of Database Design manual recommend paying attention to this during physical design and provide SQL to detect skew/synonyms. It does explicitly say that character and byte types tend to be poor choices. Chapter 13 of the Performance Management manual also discusses also listing approaches to detecting skew/synonyms.

Here are a few excerpts from the manual that appear to explain this issue, recommending paying attention to it, and providing tools and techniques to do so:

Chapters 9 and 10 of Database Design
Conversely, if you specify too few columns in the primary index definition, then each primary index value might correspond to a large number of rows, a situation that often not only causes data skew, but also degrades any data maintenance that must touch all rows in a row hash.

If you are analyzing the demographics of an existing table, whether production or prototype, you can use the following set of useful scripts written by the Teradata technical support team to check for data skew. You can adjust the details of these statement to suit the needs of your site.

This is a good practice to undertake when new applications are being loaded on the system. It is also good practice to run these queries regularly if there are many data changes. he following query identifies tables that are not evenly distributed.

The principal problem faced by hashing is hash collisions: situations in which the row hash value for different rows is identical, making it difficult for a system to discriminate among the hash synonyms when one unique row is requested for retrieval from a set of hash synonyms.

Database Design, Chapter 10.l section Locating a Row Using Its Primary Index has a graphic showing different primary index values with the same rowhash.

SQL Reference Functions and Operators describes the hash row functions and says this:

HASHROW is particularly useful for identifying the statistical properties of the current primary index, or to evaluate these properties for other columns to determine their suitability as a future primary index. You can also use these statistics to help minimize hash synonyms and enhance the uniformity of data distribution.

Attachments

Library

Quick Reference

Archives of the TeradataForum

Message Posted: Tue, 04 Apr 2006 @ 20:57:59 GMT