Archives of the TeradataForum
Message Posted: Wed, 30 Mar 2000 @ 00:33:05 GMT
Hi Barbara, Another often overlooked performance enhancing feature is "COLLECT STATISTICS".
You should "collect statistics" on each of the indexes you define (including the PI). The frequency that you need to collect statistics will vary depending upon how often (and how much) the data in the table changes. You should do it at least once a month when you delete the old data. Statistics gives the optimizer some hints on how the data is organised so that it can produce better query plans.
If you are running V2R3, you can see the effect of statistics in the explain's (eg. "... the size of spool N is estimated to be x rows with XXXX confidence" where XXXX is no, low or high). Most of the time, if not always, if the optimiser can estimate with high confidence it will produce better execution plans resulting in better performance. High confidence estimates are obtained when statistics are collected.
P.S. Right on Rick and Mark. Reading between their lines there is a balance between Uniqueness of the PI and minimising distribution. You need to have a reasonably unique PI (ideally a Unique PI) to get good data distribution (and therefore maximise the opportunity for parallel processing), on the other hand if you can join two large tables which are defined using the same PI you won't need to redistribute data as much. A simple example might be an account table (UPI on account #) and a transaction table (NUPI on account number, PK account_Number&transaction_ID). On the surface it seems like you've hit on the right UPI, but as mentioned in some of the other posts, there are other factors to consider.
|Copyright 2016 - All Rights Reserved|
|Last Modified: 28 Jun 2020|