Archives of the TeradataForum
Message Posted: Mon, 12 May 2003 @ 21:03:16 GMT
Subj: | | Re: Indexes and multiload |
|
From: | | Terry Stover |
Teradata is a big change coming from other rdbms's, secondary indexes provide alot less value than you would see on Oracle or SQLServer.
It's more likely that poor performance is related to primary indices than a lack of secondaries.
Take a look at the table skew and the hash collisions, more likely you're putting all the data on one amp, or your running too many dupe
checks on your incoming rows. On the indexing front to index the heck out of the dimension / lookup tables and review your primary indexes.
Try collecting stats on the columns where you are considering secondaries. Indexes will only hurt multiload.
I recently ran an analysis of 6 months worth of query usage and found vast majority fact table nusi's were being used more 10% of the
time, and half were never used (assuming you can trust the explains). IIRC, the rule of thumb is that TD will only use the secondary if
there is < 1 row per block for the index value. Here's a heuristic I use to explain nusi's to non-TD folks: on SQLserver an index needs to
be 90% selective (return <10% of the rows) to be used by the optimizer. In TD it needs to be 90% selective at the amp level. If you have
20 amps on your system, the index value has to represent < 0.5% of the rows.
With the other rdbms's you can solve alot of performance problems with indexes, a decent rule of thumb is to index all the foreign key
columns (at least the frequently used ones). With Teradata most of the performance is in the primary indexes.
Secondaries will cost you 8+ bytes per row which can be substantial. Performance with multiload seems to see an even bigger hit. We
recently removed 2 of 5 nusi's from a 2 billion row table and our load rate went up 3x.
|