Archives of the TeradataForum

Message Posted: Wed, 03 Mar 2010 @ 13:05:20 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Re: Data Skew vs Single Amp

From:		Barry Hull

I have designed tables with up to 20,000 rows per hash value for the same reason. For the type of access and the number of users, it was the optimal way to go, and that was 8 years ago.

It will depend on your row size and your table size. If your row size is very large, then you could have an issue with how long it takes to read that many blocks of data for one PI value. I would design your blocksize around your average rows per hash value. So, if your average rows per hash value is closer to the 50 thousand, then you'd want to make the blocksize as large as possible. If the average is closer to 12, then you'll want to go with a smaller blocksize. Also, make it MULTISET if possible to avoid the duplicate row checking. Finally, if your table is huge (billions of rows) and you have severe skewing due to the PI, then it could steal a lot of space from your system. In my case, my table was not that big (< 100 million rows), so a little skewing was not a big deal.

Another option if you want to have a more balanced table but still have the PI access would be to create a join index that had the PI that you needed for single-AMP access. Then, you could include only those columns needed for the query.

Hope that helps.

Barry Hull
Orange Diamond Consulting, LLC


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference