Archives of the TeradataForum

Message Posted: Fri, 18 Feb 2005 @ 09:25:50 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Re: Null values in a select

From:		Dieter Noeth

David Hough wrote:

The order makes no difference with respect to the result set, but the order *may* make a difference in performance. We wrote an incident (RECAC8YTR) in late 2001 that called this a bug, but according to NCR it's an ANSI feature. The ANSI SQL standard for NOT IN apparently *requires* that result filtering occur column by column, left to right.

Maybe you sould ask them to specify where it's exactly stated in the ISO document ;-)

When i read that SQL:1999 standard (but i have to admit that i hardly understand what they write), i can't see anything about an order of evaluation.

And even if there was, it still may be processed differently, as long as the result is the same.

But in the mentioned case it's even a difference in result set, and that's definitly not Standard SQL.

This means that if you have columns of different uniqueness (say 10 unique values for 'a' and 1,000,000 for 'b'), it's very possible to have changes in performance even though the explains remain the same. Experimentation showed that listing the columns in decreasing order of uniqueness (1M, 10) produces the best results. I have a complex test case that runs in seconds if (a,b) and hours if (b,a) on V2R5.1.0.

At least it's always the same order from left to right when you write joins, but within search conditions within WHERE it's quite strange: Most DBMSes (of course Oracle does it the other way) evaluate from left to right and they do it always the same way.

But Teradata uses both and i still don't know exactly when the optimizer chooses left to right vs. right to left.

     select * from dbc.logonoff
     where username = 'bla' and logdate = date '2005-02-18';

        with a condition of ("(DBC.EventLog.DateFld = DATE '2005-02-18')
        AND (DBC.EventLog.UserName = 'bla ')")


     select count(*) from dbc.logonoff
     where username = 'bla' and logdate = date '2005-02-18';

        with a condition of ("(DBC.EventLog.UserName = 'bla ')
        AND (DBC.EventLog.DateFld = DATE '2005-02-18')")


     select * from dbc.logonoff
     where username = 'bla' or logdate = date '2005-02-18';

        with a condition of ("(DBC.EventLog.UserName = 'bla ')
        OR (DBC.EventLog.DateFld = DATE '2005-02-18')")


     select count(*) from dbc.logonoff
     where username = 'bla' or logdate = date '2005-02-18';

        with a condition of "(DBC.EventLog.UserName = 'bla ')
        OR (DBC.EventLog.DateFld = DATE '2005-02-18')")

If there's a

     WHERE "mainly false expression" AND "mainly true expression"

     WHERE "simple expression" AND "complex expression"

it uses much more CPU if it's evaluated in the wrong order, i.e. from right to left.

It would be nice, if the optimizer would consider statistics to use the most efficient order.

Dieter


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference