Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Fri, 19 Jan 2001 @ 13:56:25 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: Unique Value Algorithm
 
From:   Claybourne Barrineau

Glenn and Frank,

I tried the CSUM approach to this issue, and it seems to work fine with a small amount of data.

However, I was planning to build about 46 million records, and the CSUM approach apparently redistributes all the data (even though my source and target file have the same primary index.)

     explain
     Insert into INFVOLTSTT.T_F_VOL_ATOMIC_BDL1
     (
     Uniq_Code,
     Bas_Artl_Nbr,
     Btlr_Org_ID,
     Bsns_Lctn_ID,
     Setl_Dt,
     Phys_Case_Qty,
     Grs_Rvnu_Amt,
     Artl_Org_ID
     )
     Select    csum(1, Setl_Dt, Bas_Artl_Nbr, Bsns_Lctn_ID),
          Bas_Artl_Nbr,
          Btlr_Org_ID,
          Bsns_Lctn_ID,
          Setl_Dt,
          Phys_Case_Qty,
          Grs_Rvnu_Amt,
          Artl_Org_ID
     From XEIWTMPT.TBDL700_Clay;

Explanation
--------------------------------------------------
 
  1)First, we lock a distinct INFVOLTSTT."pseudo table" for write on a RowHash to prevent global deadlock for INFVOLTSTT.T_F_VOL_ATOMIC_BDL1.  
  2)Next, we lock a distinct XEIWTMPT."pseudo table" for read on a RowHash to prevent global deadlock for XEIWTMPT.TBDL700_Clay.  
  3)We lock INFVOLTSTT.T_F_VOL_ATOMIC_BDL1 for write, and we lock XEIWTMPT.TBDL700_Clay for read.  
  4)We do an all-AMPs STAT FUNCTION step from XEIWTMPT.TBDL700_Clay by way of an all-rows scan with no residual conditions into Spool 5 (Last Use), which is assumed to be redistributed by value to all AMPs. The result rows are put into Spool 3, which is built locally on the AMPs.  
  5)We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool 1, which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 1 by row hash. The size of Spool 1 is estimated with no confidence to be 0 row. The estimated time for this step is 0.05 seconds.  
  6)We do a MERGE into INFVOLTSTT.T_F_VOL_ATOMIC_BDL1 from Spool 1 (Last Use).  
  7)Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.  
  ->No rows are returned to the user as the result of statement 1.  


I ended up using the HASHROW function that Frank Martinez recommend (Thank You!) This approach seems to be the best approach for a large amount of data. No data redistribution took place (if in fact that was the cause of the problem in the CSUM aproach).


Thank you all for your help and input.

Claybourne Barrineau



     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023