Archives of the TeradataForum

Message Posted: Thu, 05 Aug 2010 @ 10:15:00 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Re: Field update running slowly

From:		DWellman

Hi Paul,

ParEff = 4% !!! I 'suspect' that's your problem :-) (or at least a large part of it).

This is caused by the Primary Index of the table and the data. Basically, most of the rows in the table have the same data values for the PI column(s).

Assuming that the data is correct... you'll need to change the PI of the table.

Issue the following SQL command:

     SHOW TABLE database.table;

Near the end of the output will be (something like) the following:

     PRIMARY INDEX ( column1, column2 )

Run the following SQL command (adjust it for the names and columns involved):

     SELECT column1,column2
           ,count(*) as rowcount
     From database.table
     Group by column1,column2
     Order by rowcount desc;

I think you'll find that there are one (probably) or maybe two values/combinations of values with a very high number of rows (i.e. the first one or two rows of the result set will represent the bulk of your 180K rows). This really just proves where the problem lies.

To get the data more evenly distributed across the AMPs the column(s) used for the PI need to have reasonably evenly distibuted data values. Run the following SQL command one or more times using other column(s) from the table:

     SELECT column3
           ,count(*) as rowcount
     From database.table
     Group by column3
     Order by rowcount desc;

What you're looking for is a roughly even distribution of rows across all data values.

Also, (as per someone else's post), avoid changing the PI to the column that you're trying to update. That can slow things down a lot.

There are slightly more sophisticated ways of doing this, but let's keep it simple for now.

Changing a table's PI is ** technically ** very simple:

- create a new table with the required PI

- execute an INSERT/SELECT from the old table to the new one

- drop the old table

- rename the new one as required

However, when changing the PI you need to be aware of how this table is used in conjunction with other tables on the system.

Ideally, joins between two tables are done using equality joins between all of the columns in the PI's of both tables with both tables having the same PI. Obviously this is not always possible but that's the ideal.

If this table is just a 'staging' or work table used during etl processing and has a single, final join into a real table (i.e. one that's part of an EDW data model) then it's probably less important.

Do the above and see where you get to.

Cheers,

Dave

Ward Analytics Ltd - Information in motion ( www.ward-analytics.com )


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference