Archives of the TeradataForum

Message Posted: Mon, 22 Jul 2013 @ 10:24:01 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Issues with MERGE query

From:		Malli Chandrasekaran, Saravanan

Hello All,

I'm experiencing some issues with Merge queries. I've two tables of which structures given below,

Table #1:

     C1 INT -- Primary Unique Index -- Account #
     C2 DECIMAL(16,0) -- Secondary Unique Index
     C3 INT - NULL -- Cell/Bucket
     C4 INT - ROW_NUMBR

Points: One-to-one relationship between C1 and C2. C3 will be NULL initially. I'm having issues with updating this field. I'll explain my issue in detail in a while.

Table #2:

     C1 INT -- Primary Non-Unique Index
     C2 DECIMAL(16,0)
     C5 INT -- Segment

Points: One-to-one relationship between C1 and C2. However, there is one-to-many relationship between C1/C2 and C5. We will not have any duplicate rows in this table.

Requirement: For the first segment, I will have to choose Accounts from Table #1 that have a set of Segments AND another set of segments AND another set of segments AND another set of segments AND NOT IN another set of segments in C5 of Table #2. For example,

     Cell/Bucket   Include_Set1     Include_Set2     Include_Set3          Include_Set3      Exclude_Set   HardCap

          1      1000,1001,1002   1010,1011,1012   1020,1021,1022   1030,1031,1033,1034   1040,1041,1042    111111

          2      2000,2001,2002   2010,2011,2012   2020,2021,2022   2030,2031,2033,2034   2040,2041,2042    300000

          3      3000,3001,3002   3010,3011,3012   3020,3021,3022   3030,3031,3033,3034   3040,3041,3042    400000

          4                      Take 50% of the population from Cell #3 and assign it to Cell #4 .

          .
          .
          .
          .

        100         ...........      ...........      ...........           ...........   ..............

The number of values in the "include sets" can vary upto 1000 values and We will have upto 100 Cells.

Scripts I'm using to accomplish this task are,

1. Drop tables 1 and 2. Re-populate Table #1 with data and Set C3 which is Cell# to NULL for all records. Repopulate Table #2 with data. Table #2 will be stable until this process is done. We will keep updating C3 of Table #1 throughout this process.

2. Running an auto-process to create sql queries and store them in a sql file

3. Call the sql file through a ksh script by using the command bteq < Qry.sql > Qry_1.log

Scripts in the sql file:

     .logon db-name/userid,pwd;
     .SET ERROROUT STDOUT;

     .IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;

     MERGE INTO 
     USING ( SELECT TOP 111111 C1_A -- Selecting only top 111111 accounts from the down-stream output
                FROM ( SELECT DISTINCT D.C1 AS C1   -- Selecting Accounts which satisfy all required segments as per the requirement
                          FROM ( ( ( SELECT A.C1
                                        FROM  A
                                             INNER JOIN  B
                                                   ON A.C1 = B.C1
                                                   AND A.C3 IS NULL
                                                   AND B.C5 IN (1000,1001,1002)
                                   )

                                   INTERSECT ( SELECT A.C1
                                                  FROM  A
                                                       INNER JOIN  B
                                                             ON A.C1 = B.C1
                                                             AND A.C3 IS NULL
                                                             AND B.C5 IN (1010,1011,1012)
                                             )

                                   INTERSECT ( SELECT A.C1
                                                  FROM  A
                                                       INNER JOIN  B
                                                             ON A.C1 = B.C1
                                                             AND A.C3 IS NULL
                                                             AND B.C5 IN (1020,1021,1022)
                                             )

                                   INTERSECT ( SELECT A.C1
                                                  FROM  A
                                                       INNER JOIN  B
                                                             ON A.C1 = B.C1
                                                             AND A.C3 IS NULL
                                                             AND B.C5 IN (1030,1031,1033,1034)
                                             )
                                 )

                                 EXCEPT ( ( SELECT A.C1
                                               FROM  A
                                                    INNER JOIN  B
                                                          ON A.C1 = B.C1
                                                          AND A.C3 IS NULL
                                                          AND B.C5 IN (1040,1041,1042)
                                          )
                                        )
                               ) D
                     ) E
           ) Z

           ON C1 = Z.C1_A
           AND C3 IS NULL
           WHEN MATCHED THEN UPDATE   SET C3 = 1
     ;

     .IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;

     COLLECT STATISTICS ON  COLUMN (C1);

     .IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;

     COLLECT STATISTICS ON  COLUMN (C3);

     .IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;

     MERGE INTO 
     USING ( SELECT TOP 300000 C1_A -- Selecting only top 300000 accounts from the down-stream output
                FROM ( SELECT DISTINCT D.C1 AS C1   -- Selecting Accounts which satisfy all required segments as per the requirement
                            FROM ( ( ( SELECT A.C1
                                          FROM  A
                                               INNER JOIN  B
                                                     ON A.C1 = B.C1
                                                     AND A.C3 IS NULL
                                                     AND B.C5 IN (2000,2001,2002)
                                     )

                                     INTERSECT ( SELECT A.C1
                                                    FROM  A
                                                         INNER JOIN  B
                                                               ON A.C1 = B.C1
                                                               AND A.C3 IS NULL
                                                               AND B.C5 IN (2010,2011,2012)
                                               )

                                     INTERSECT ( SELECT A.C1
                                                    FROM  A
                                                         INNER JOIN  B
                                                               ON A.C1 = B.C1
                                                               AND A.C3 IS NULL
                                                               AND B.C5 IN (2020,2021,2022)
                                               )

                                     INTERSECT ( SELECT A.C1
                                                    FROM  A
                                                         INNER JOIN  B
                                                               ON A.C1 = B.C1
                                                               AND A.C3 IS NULL
                                                               AND B.C5 IN (2030,2031,2033,2034)
                                               )
                                   )

                                   EXCEPT ( ( SELECT A.C1
                                                 FROM  A
                                                      INNER JOIN  B
                                                            ON A.C1 = B.C1
                                                            AND A.C3 IS NULL
                                                            AND B.C5 IN (2040,2041,2042)
                                            )
                                          )
                                 ) D
                     ) E
           ) Z

           ON C1 = Z.C1_A
           AND C3 IS NULL
           WHEN MATCHED THEN UPDATE   SET C3 = 2
     ;

     .IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;

     COLLECT STATISTICS ON  COLUMN (C1);

     .IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;

     COLLECT STATISTICS ON  COLUMN (C3);

     -- Cell #4 would NOT be calculated at this point.
     ...
     ...
     ...
     After updating all 100 Segments...


     -- Assigning row numbers which have fallen into any of the buckets/cells.

     MERGE INTO 
     USING ( SELECT DISTINCT C3 AS C3_A
                , C1 AS C1_A
                , ROW_NUMBER() OVER(ORDER BY C3) ROW_NUM
                FROM  WHERE C3 IS NOT NULL
           ) Z
     ON C3 = Z.C3_A
     AND C3 IS NOT NULL
     AND C1 = Z.C1_A
     WHEN MATCHED THEN UPDATE SET C4 = Z.ROW_NUM
     ;

     .IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;

     -- Splitting Cell #3 into 2

     MERGE INTO 
     USING ( SELECT DISTINCT C3 AS C3_A
                ,C1 AS C1_A
                ,ROW_NUMBER() OVER(PARTITION BY C3 ORDER BY C3) ROW_NUM
                FROM 
                WHERE C3 = 3
           ) Z
     ON C3 = Z.C3_A
     AND C3 = 3
     AND C1 = Z.C1_A
     WHEN MATCHED THEN UPDATE SET C3 = CASE WHEN Z.ROW_NUM MOD 2 = 1 THEN 3
                                            WHEN Z.ROW_NUM MOD 2 = 0 THEN 4
                                            END ;

     .IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;

     .LABEL ERROR_BOTTOM
     .logoff
     .quit

     -- End of the sql script...

Issue: This is a most business critical process. I have to run this process at least once a day. Most of the time, I get the correct counts. Sometimes, I get some weird counts even though the log file shows the correct count. For example, the log file showed the counts given below for Cell#1, last week.

     *** Merge completed. 111111 rows changed.
      *** Total elapsed time was 37 seconds.

However, when I checked the table, I saw 222222 rows for Cell #1. Whenever I run this process, TABLE#1 gets truncated and re-populated with data and C3 which is Cell# will be set to NULL. Then only, the KSH script gets called. I got the correct count when I just reran the process without making any changes in the SQL file.

This happens at least once a week! Sometimes, split does not happen correctly too. I just don't know what is happening here. I'm pretty sure that it has nothing to do with whatever data that I have in both tables. Otherwise, rerunning the process would NOT have given me the correct counts.

Details of Teradata I'm using :

     VERSION          13.10.07.03
     RELEASE         13.10.07.04
     LANGUAGE SUPPORT MODE   Standard

Has anyone come across this issue which is driving me crazy? Any suggestions please?

Thank you,

Saravanan


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference