|
Archives of the TeradataForumMessage Posted: Tue, 17 Jun 2003 @ 18:22:27 GMT
I had an aha moment last night, and found I could eliminate one of the tables (p2) from my original solution and simply calculate the between range from the p1 values. It surprised me, though, because it runs about 20% slower than the original on my 125K rows of test data. I suspect this is because the data is sparse enough the the p1/p2 join is eliminating a lot of rows up front. If the data were very dense, the new query might run faster because of one less join. Anyway, here's the shortened version: select distinct p1.acct from cust as p1 inner join cust as p3 on p3.phone between p1.phone and p1.phone+4 and p3.acct = p1.acct group by p1.acct, p1.phone having count(*) = 5 ; I can't remember who first raised the issue of MDIFF, but I took a look at that and it works very well on V2R4.1.3. It's about three times faster that the join versions, though you have to live with the present non-ANSI syntax if you want to use it: select acct from (select p1.acct, mdiff(p1.phone,4,p1.acct,p1.phone)+1 from cust as p1 group by p1.acct) dt(acct, xdiff) where xdiff = 5 group by acct ; There have been a number of solutions proposed that call out every entry in the phone number space explicitly using conditions and/or joins with phone, phone+1, phone+2, ..., phone+4. I checked a couple of them, and they run about the same speed as the BETWEEN join varients, so I'd skip them because they're hard to expand if the requirements change. Dieter N�th solved the changing requirements problem very neatly in his solution (my favorite), which returns the exact span sizes in the COUNT(*). If the limit changes from 5 to 10, only the constant in the HAVING clause needs to be changed. select distinct acct from ( select acct ,phone ,diff /** this is the trick: calculate groups of rows, so consecutive rows with a diff of 0 will be part of the same datagroup **/ ,sum(diff) over (partition by acct order by phone rows unbounded preceding) as datagroup from ( select acct ,phone /** moving difference - 1 between two numbers, so consecutive rows have a diff of 0 **/ ,coalesce(mdiff(phone, 1, phone)-1, 0) as diff from cust group by acct ) dt ) dt group by acct, datagroup having count(*) >=5 ; The continuing discussion on this problem has been interesting; my thanks to all who have contributed. /dave hough
| ||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||
Copyright 2016 - All Rights Reserved | ||||||||||||||||||||||||||||||||||||||||||||||||
Last Modified: 15 Jun 2023 | ||||||||||||||||||||||||||||||||||||||||||||||||