Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Tue, 30 Oct 2007 @ 12:02:16 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: Generate unique id
 
From:   Dieter Noeth

Soumya Smita wrote:

  I have a table with fields acct_id, fd1, fd2, fd3.  


  I want a unique id generated for each unique combination of fd1, fd2, fd3. Acct_id is not considered for generating the unique id. I have millions of rows in my table, hence when I was using sum() over ... , I was getting spool space error.(May be due to sorting on the huge no. of rows involved.)  


It's sorting plus 2 spools for 1 OLAP step, you can't get rid of that as long as using OLAP.


  Plz let me know the best approach to get a unique_id , need not be based on any sorting criterion, considering that there are millions of rows in the table.  


  Desired unique_id's somethng like this:  


          > acct_id      fd1   fd2    fd3  uniq_id
          > 100          a     b      c     1
          > 200          a     b      c     1
          > 300          a     b      b     2
          > 400          a     a      b     3

It's not what i call unique, it's a DENSE_RANK :-) So you can't use ROW_NUMBER.


  Just that uniq_id should not take acct_id in consideration... and query should avoid sort on the rows.(Neways there is no desired sequence for the unique_ids)  


No sequence without ordering in Teradata...

If uniq_id must be ascending without gaps, then IDENTITY can't be used, if not simply use RANK.

This is probably the approach you described on Sep. 27 in "Check for existing attributes in a table". You could try too reduce the intermediate size by using NOT EXISTS to filter already existing rows. If there are lots of rows with more than one row per uniq_id you could calculate a RANK on a distinct set and then join back to that table. If the PI of that stage table is on (fd1, fd2, fd3) this would be quite fast:

     select
        s.*,
        rnk
     from stage_table s
     join
       (select fd1, fd2, fd3, rank() over (order by ...) as rnk
        from stage_table group by 1,2,3) dt
     on s.fd1 = dt.fd and ...

Of course the easiest way is to check if there's a skewed spool because of a bad ordering and correct that, if not increase spool for that load user.


Dieter



     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023