|
|
Archives of the TeradataForum
Message Posted: Fri, 22 Aug 2008 @ 15:25:24 GMT
Subj: | | Skewed Table - Space usage v/s AMP Colocation |
|
From: | | Goli, Srikanth |
Hi -
If a table is skewed on just one AMP, but is well distributed otherwise, is it recommended to tweak the PI for space consideration? Consider
Order (PK-Order_Id; PI - Order_ID) , Order Line (PK - Order_ID, Line_ID; PI - Order_ID) and a couple of other Order Line child tables with the
same PI, Order ID. The primary index is chosen as Order_Id for Order Line and its child tables to get joining rows on the same AMP as the order
table. However if an order has unusually large number of lines, say 100, whereas the others have evenly 10 order lines, the AMP that has the 100
line order will be skewed, by a factor of 10 compared to the average. The impact will be magnified, since the child records of this Order Line
will also be on the same AMP, since they all are hashed on Order. So hypothetically, if these were the only table in the system, as soon as the
AMP that contains this Order is full, regardless of the space on the other AMPs the system is full. I am assuming the optimizer may still handle
this well, because it has the stats to account for the one row skew, but no way out of the storage penalty. Is this a fair statement?
I will probably make Order_ID, Order_Line_Id as the Primary Index for Order Line and its child tables, but will have lost ability to join with
the Order on the same AMP. This seems like the middle ground of the performance and space considerations. Are there any other means to address the
'few row' skewed tables, so that we do not give up so much on the performance side for the storage considerations?
I appreciate any suggestions, comments.
Thanks.
| |