Archives of the TeradataForum
Message Posted: Wed, 29 Mar 2000 @ 23:56:34 GMT
The tranid sounds like a good choice for primary index because of it's uniqueness. It will distribute nicely, and will prevent the need for dupe row checking by the system during data loading (assuming tranid is the unique primary index). As mentioned by others, the tranid doesn't sound like something that users will be able to use for querying very often. Therefore your need for good secondary index selection. Keep in mind that secondary indexes hurt data maintenance performance, while possibly helping query performance.
I would ask several questions before choosing the secondary indexes. What is the uniqueness of the data contained in candidate columns? The old rule of thumb was that a secondary index would be ignored by the optimizer if it didn't qualify down to less than 10% of the table. Over the years that has changed. It must be more selective now than before because of enhancements in full table scan capabilities, such as larger block sizes, improved caching, etc. An additional criteria to consider is whether or not a user can/will specify all columns of a multi-column index. If all columns are not included in the where clause the index cannot be selected by the optimizer. Even if the columns are all specified, if one of them uses a BETWEEN the index won't be used. (example they specify region, store and date between x and y) Therefore, in your examples it may be better to have separate single column indexes for store, region, and date. If the user specifies more than one of them in the where clause the optimizer may choose to build a bit map index to take advantage of the greater selectivity of the combined indexes. I notice that you have specified the same columns in more than one index because of the possibility of different combinations. This is another example of when to use single column indexes, and allowing the optimizer to choose. Of course, it goes without saying that to take full advantage of indexes you need to collect statistics on every index, and on any other column which is frequently used as a constraint or join criteria.
If you have sufficient space, make a copy of the table (or a sufficient subset of it to be valid) into a work space, and experiment with different secondary indexes. Run explains with a set of sample queries to see if the optimizer uses the indexes. When you're satisified, you can make the modifications to the production table.
I hope that helps.
|Copyright 2016 - All Rights Reserved|
|Last Modified: 15 Jun 2023|