Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Fri, 22 Aug 2015 @ 01:07:14 GMT


     
  <Prev Next>   <<First <Prev
Next>
Last>>
 


Subj:   Re: Case logic optimization
 
From:   Bakthavachalam, Roopalini

Dieter - Let me explain this in a bit more detail. Here is the complete problem query. Any direction in optimizing this will be very helpful.

     INSERT INTO RBAKTHA.vwMEDSTG
     (
         Col1,col2,col3........col35
     )
     SELECT
        distinct
             col1,col2....col35
     FROM RBAKTHA.viewMEDFPOP AS MED
     INNER JOIN
     ( SELECT
     ID,
     HCI,
     Code,
     CASE WHEN CODE IN ('NV','CO') THEN '2006-10-01' ELSE '2006-01-01' END AS
     Servcdate1
     FROM RBAKTHA.viewIDMem
     ) AS  ELIG
     ON MED.HCI  = ELIG.HCI
     WHERE Servcdate >= Servcdate

The query that I sent in the first email is actually the viewMEDFPOP (mentioned below as well). I was trying to make it to a VOLATILE table instead to see if I can get better performance and then use it in the INSERT query. One more big issue is the INNER JOIN in the INSERT query. The viewIDMem has 59M records that is joined with viewMEDPOP which has 5.5 B records. There is sliding window merge join that is happening.There are 159M records out of 5.5 B that are duplicates( row level duplicates).

PI of viewMEDFPOP base table - HCI, Servdate

PI of viewIDMem base table - HCI

I was trying to sort the records in both the tables, thinking that will be faster, however it worsened the performance.

     REPLACE VIEW VIEWMEDFPOP AS
     SELECT
     distinct
         0 as ID,
         HCI,
         date,
         Type,
         optin,
         D1,
         D2,
         D3,
         D4,
         .
         .
         .
         .
         .
         CASE WHEN POP = 'A' THEN 1
             WHEN POP = 'B' THEN 2
             WHEN POP = 'C' THEN 3
             WHEN POP = 'D' THEN 5
         END AS Flag,
         1 As TFlag
     FROM viewMED

Below is the EXPLAIN on the VIEWMEDFPOP.

Explain on the VIEWMEDFPOP

Explanation
--------------------------------------------------
 
  1)First, we lock a distinct RBAKTHA."pseudo table" for read on a RowHash to prevent global deadlock for RBAKTHA.MEDCL_PRM.  
  2)Next, we lock RBAKTHA.MEDCL_PRM in view vwMEDICAL for read.  
  3)We do an all-AMPs SUM step to aggregate from RBAKTHA.MEDCL_PRM in view vwMEDICAL by way of an all-rows scan with no residual conditions, and the grouping identifier in field 1. Aggregate Intermediate Results are computed locally, then placed in Spool 1. The input table will not be cached in memory, but it is eligible for synchronized scanning. The aggregate spool file will not be cached in memory. The size of Spool 1 is estimated with no confidence to be 4,139,311,952 rows (2,197,974,646,512 bytes). The estimated time for this step is 2 hours and 17 minutes.  
  4)Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.  
  ->The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 2 hours and 17 minutes.

BEGIN RECOMMENDED STATS ->
 
  5)"COLLECT STATISTICS COLUMN (PRSN_ID ,ICD_VRSN_CD ,DIAG_1, DIAG_1_LN , DIAG_2 ,DIAG_2_LN ,DIAG_3 ,DIAG_3_LN ,DIAG_4, DIAG_4_LN ,DIAG_5 , DIAG_6 ,DIAG_7 ,DIAG_8 ,CLM_LINE_SRVC_STRT_DT, HLTH_SRVC_CD , PRNCPL_ICD_PROC_CD_1 ,ICD_PROC_2_CD ,ICD_PROC_3_CD, ICD_PROC_4_CD , ICD_PROC_5_CD ,ICD_PROC_6_CD ,ICD_PROC_7_CD, ICD_PROC_8_CD ,RNDRG_PROV_ID , CMS_SPCLTY_CD ,FNL_DRG_CD, FUNDG_TYPE_CD ,RSRCHBL_ID ,SRCFLAG , HCI_TOS) ON RBAKTHA.MEDCL_PRM". (LowConf)
<- END RECOMMENDED STATS
 



     
  <Prev Next>   <<First <Prev
Next>
Last>>
 
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 23 Jun 2019