Archives of the TeradataForum
Message Posted: Thu, 18 Feb 2005 @ 00:35:09 GMT
The order makes no difference with respect to the result set, but the order *may* make a difference in performance. We wrote an incident (RECAC8YTR) in late 2001 that called this a bug, but according to NCR it's an ANSI feature. The ANSI SQL standard for NOT IN apparently *requires* that result filtering occur column by column, left to right.
This means that if you have columns of different uniqueness (say 10 unique values for 'a' and 1,000,000 for 'b'), it's very possible to have changes in performance even though the explains remain the same. Experimentation showed that listing the columns in decreasing order of uniqueness (1M, 10) produces the best results. I have a complex test case that runs in seconds if (a,b) and hours if (b,a) on V2R5.1.0.
Rather than fiddling with the list order (which is vulnerable to changing data demographics), I recommend that people replace the NOT IN with NOT EXISTS. Correlated subqueries do not have the same problem, though how you instruct a front end tool to make the change I don't know.
|Copyright 2016 - All Rights Reserved|
|Last Modified: 28 Jun 2020|