|
|
Archives of the TeradataForum
Message Posted: Thu, 12 Jun 2008 @ 10:33:13 GMT
Subj: | | Re: Partial field comparison |
|
From: | | McLeod, Terry M |
This will work, with the exception that you shouldn't expect to match "dog" to "d_og" without doing some preliminary clean up as Martin
suggested in his email -- in which case "dog" can find "dog". As he suggests, this can be done on a separate table.
SELECT A.ANIMAL, A.COLOR, A.PETSTOREID, B.PETSTOREID, B.COMMENT1
FROM TABLE_A A
JOIN TABLE_B B
ON POSITION(TRIM(A.ANIMAL) IN B.COMMENT1) IS NOT NULL
AND POSITION(TRIM(A.ANIMAL) IN B.COMMENT1) > 0
AND POSITION(TRIM(A.COLOR) IN B.COMMENT1) IS NOT NULL
AND POSITION(TRIM(A.COLOR) IN B.COMMENT1) > 0
WHERE A.PETSTOREID <> B.PETSTOREID;
Performance-wise, I have no information for you, since I only worked with your sample data, which did not include full table information,
such as primary index information. You could ramp up some tests if you have space for sample data (i.e. pull over a couple 1000 recs from each
table and try that, then 10,000, etc.). I don't think you can get around this being all amp operation with big spools, since the two tables don't
have any columns in common on the rows that you want to see matched - hence there seems to be no way to make them amp local.
Hope this will be of some help.
By the way, although if the animal and color columns on table_a are defined as VARCHAR, you needn't trim in the query, you may want to anyway
since if someone entered an explict couple of leading spaces, for example, in one of those columns, then it wouldn't be found in a comment that
did not have those spaces. Also note that the "not null" is in case either of the two columns used by the function is null. If that's not
possible (due to NOT NULL on both columns on their respective tables) then you could drop the not null check.
Terry M. McLeod
TERADATA Professional Services, Retail - West
| |