Archives of the TeradataForum

Message Posted: Wed, 23 Jan 2002 @ 13:14:06 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Re: Magic Wand Question

From:		Claybourne L. Barrineau

Due to the Teradata's parallel processing design, I feel Teradata's biggest problem is working with data which is very poorly distributed.

For instance, we have a problem involving 2 tables with the same NUPI. As it turns out, these 2 tables have poorly distributed data. When I join these 2 tables together on their respective NUPIs (the Optimizer choosing a merge join), the poor data distribution is amplified and some AMPs work ~400% harder than other AMPs.

I found a way of running the same query in half the time (with all AMPs working at the same rate) by choosing a new PI for the large table (with better data distribution) and forcing the Optimizer to duplicate the smaller table to all AMPs (I did this by creating a dummy NUPI with a default value of 1). With these new PIs, the Optimizer performed a hash join instead of a merge join.

What the Optimizer needs is a better way of detecting and avoid executing queries which will cause the AMPs to work at very uneven levels. Once these bad queries are detected, the Optimizer needs a way of executing a more efficient plan. In the scenario I described above, I'd like to be able to specify an alternate PI for these 2 tables (via stored as a Join Index or resolved (rehashed) at run time via some new feature) which the Optimizer would utilize to execute a more efficient plan.

My 2 Cents,

Claybourne Barrineau


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference