Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Mon, 18 Aug 2003 @ 21:14:00 GMT


     
  <Prev Next>   <<First <Prev
Next>
Last>>
 


Subj:   Re: Assigning destination tags to CDRs
 
From:   Dieter Noeth

ulrich arndt wrote:

  When I saw your mail it reminds me on the problem how to give a macro a �unlimited� number of parameter to check for in a table.  


A common solution is to use LIKE, but this is rather slow if the table is large and only a small part of it is returned. A better solution is splitting the comma delimited string into a derived table and then join to it.


  The underlying idea of my suggestion is to have two different prefix reference tables. Number one contains per prefix one row with the prefix and the tag you what to set and any other info. Number two contains only one row and as many columns as the lengths of the prefix can be. Write into each corresponding column all prefixes with the same lengths with a blank before and after the prefix. You can than join the CDR table with the prefix_join table via some like and not like conditions. The benefit of this approach is that you do not have to evaluate which prefix has the biggest lengths (see also SQL's below). What you you think? Is this a reasonable approach?  


Sounds similar to my huge CASE statement and can easily be automatically created. The only limitation is the maximum row size.


  I am not able to do a volume test with this approach but would be interested in case you try this approach with your data. As only one row for the prefix evaluation is needed I would expect that this row is redistributed to all AMPs and that joined to the CDR's which means a less I/O and more CPU operation.  


I tried different variations on a my small test table with ~73000 CDRS and 2000 prefixes, but the optimizer typically liked to product join prefix_ref and cdr and it was hard to convince him not to do so ;-) So the plans may be totally different on different systems and different table sizes...

The best plan was ~ 1/3 CPU compared to 2.3 and few IOs, but 5x CPU compared to 4. CPU was mainly due to the LIKE.

I modified your query a bit:

There's no need for NOT LIKE, because this has alredy been tested in the prior WHEN. And no need for the case in WHERE.

select
   c.msisdn,
   case
     when p.prefix_3 like '% ' !! (substring(c.msisdn from 1 for 8)) !!
' %'
       then (substring(c.msisdn from 1 for 8))
     when p.prefix_2 like '% ' !! (substring(c.msisdn from 1 for 6)) !!
' %'
       then (substring(c.msisdn from 1 for 6))
     when p.prefix_1 like '% ' !! (substring(c.msisdn from 1 for 4)) !!
' %'
       then (substring(c.msisdn from 1 for 4))
   end as pre,
   r.prefix_desc
from    cdr c,
         prefix_join p,
         prefix_ref r
where r.prefix = pre
;

To preserve non matching msisdn:

from    cdr c cross join
         prefix_join p left join
         prefix_ref r
on r.prefix = pre
;

And moving the cross join into a derived table helped the optimizer a lot:

select
   msisdn,
   pre,
   r.prefix_desc
from
(
select
   c.msisdn,
   case
     when p.prefix_3 like '% ' !! (substring(c.msisdn from 1 for 8)) !!
' %'
       then (substring(c.msisdn from 1 for 8))
     when p.prefix_2 like '% ' !! (substring(c.msisdn from 1 for 6)) !!
' %'
       then (substring(c.msisdn from 1 for 6))
     when p.prefix_1 like '% ' !! (substring(c.msisdn from 1 for 4)) !!
' %'
       then (substring(c.msisdn from 1 for 4))
   end as pre
from    cdr c cross join
         prefix_join p) dt
left join
         prefix_ref r
on r.prefix = pre
;

Thanks,

Dieter



     
  <Prev Next>   <<First <Prev
Next>
Last>>
 
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023