![]() |
|
Archives of the TeradataForumMessage Posted: Tue, 17 Jun 2003 @ 18:22:27 GMT
I had an aha moment last night, and found I could eliminate one of the tables (p2) from my original solution and simply calculate the between range from the p1 values. It surprised me, though, because it runs about 20% slower than the original on my 125K rows of test data. I suspect this is because the data is sparse enough the the p1/p2 join is eliminating a lot of rows up front. If the data were very dense, the new query might run faster because of one less join. Anyway, here's the shortened version:
select distinct
p1.acct
from
cust as p1
inner join
cust as p3
on p3.phone between p1.phone and p1.phone+4 and
p3.acct = p1.acct
group by
p1.acct, p1.phone
having
count(*) = 5
;
I can't remember who first raised the issue of MDIFF, but I took a look at that and it works very well on V2R4.1.3. It's about three times faster that the join versions, though you have to live with the present non-ANSI syntax if you want to use it: select acct from (select p1.acct, mdiff(p1.phone,4,p1.acct,p1.phone)+1 from cust as p1 group by p1.acct) dt(acct, xdiff) where xdiff = 5 group by acct ; There have been a number of solutions proposed that call out every entry in the phone number space explicitly using conditions and/or joins with phone, phone+1, phone+2, ..., phone+4. I checked a couple of them, and they run about the same speed as the BETWEEN join varients, so I'd skip them because they're hard to expand if the requirements change. Dieter Nöth solved the changing requirements problem very neatly in his solution (my favorite), which returns the exact span sizes in the COUNT(*). If the limit changes from 5 to 10, only the constant in the HAVING clause needs to be changed.
select distinct
acct
from
(
select
acct
,phone
,diff
/** this is the trick:
calculate groups of rows, so consecutive rows with a diff of 0
will be part of the same datagroup **/
,sum(diff) over (partition by acct
order by phone
rows unbounded preceding) as datagroup
from
(
select
acct
,phone
/** moving difference - 1 between two numbers,
so consecutive rows have a diff of 0 **/
,coalesce(mdiff(phone, 1, phone)-1, 0) as diff
from cust
group by acct
) dt
) dt
group by acct, datagroup
having count(*) >=5
;
The continuing discussion on this problem has been interesting; my thanks to all who have contributed. /dave hough
| ||||||||||||||||||||||||||||||||||||||||||||||||
| | ||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||
|
| ||||||||||||||||||||||||||||||||||||||||||||||||
| Copyright 2016 - All Rights Reserved | ||||||||||||||||||||||||||||||||||||||||||||||||
| Last Modified: 15 Jun 2023 | ||||||||||||||||||||||||||||||||||||||||||||||||