|
|
Archives of the TeradataForum
Message Posted: Wed, 03 Mar 2010 @ 13:05:20 GMT
Subj: | | Re: Data Skew vs Single Amp |
|
From: | | Barry Hull |
I have designed tables with up to 20,000 rows per hash value for the same reason. For the type of access and the number of users, it was the
optimal way to go, and that was 8 years ago.
It will depend on your row size and your table size. If your row size is very large, then you could have an issue with how long it takes to
read that many blocks of data for one PI value. I would design your blocksize around your average rows per hash value. So, if your average rows
per hash value is closer to the 50 thousand, then you'd want to make the blocksize as large as possible. If the average is closer to 12, then
you'll want to go with a smaller blocksize. Also, make it MULTISET if possible to avoid the duplicate row checking. Finally, if your table is
huge (billions of rows) and you have severe skewing due to the PI, then it could steal a lot of space from your system. In my case, my table was
not that big (< 100 million rows), so a little skewing was not a big deal.
Another option if you want to have a more balanced table but still have the PI access would be to create a join index that had the PI that you
needed for single-AMP access. Then, you could include only those columns needed for the query.
Hope that helps.
Barry Hull
Orange Diamond Consulting, LLC
| |