Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Thu, 01 Nov 2007 @ 10:46:49 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: Problem with handling null values for Partition
 
From:   The TeradataForum

Good day,

Could anyone please provide some suggestions for the following mload-related problem? We would really be grateful for a solution to this problem which is presently a big headache for us.

We have encountered a problem with the mload utility inserting duplicate records for nullable columns where existing records contain nulls in any of the logical Primary Key (PK) columns. Obviously, to do a proper upsert via mload, one needs to put all the PK columns in the "where" portion of the update clause, irrespective of what the physical Primary Index (PI) might be.

The problem stems from the fact that update statements that, for example, contain:

     Where ColumnA = :columnA
     and columnB   = :ColumnB

cannot handle nulls because nulls can only be matched using "is null". The equality operator does not create a match.

In our case we have a MULTISET table that has a NUPI based on a single column (Account_Num, which is not nullable). The table has a PK consisting of multiple columns (some of which are nullable for very specific business reasons). The partition index for the table is on column (Account_Summary_Dt, which is nullable), and this column is part of the PK, so it needs to be part of the "where" clause of the update statement.

The solution we had in mind was to use the coalesce function to ensure that the "where" clause of the update portion of the mload can handle nulls for nullable columns, as follows:

     where Account_Num = :Account_Num(Dec(18, 0) , Format '999999999999999999.')
     and coalesce(Account_Modifier_Num, -32768) = coalesce(:Account_Modifier_Num, -32768)
     and coalesce(SB_Account_Open_Dt, date '9999-12-31') = coalesce(:SB_Account_Open_Dt, '9999-12-31')(Date ,Format 'YYYY-MM-DD')
     and coalesce(Account_Summary_Dt, date '9999-12-31') = coalesce(:Account_Summary_Dt, '9999-12-31')(Date ,Format 'YYYY-MM-DD')

(Account_Num is not nullable, and does not need a coalesce)

The coalesce function works for all the columns in the above example, except the partition column (Account_Summary_Dt). (I suspect any nullable columns that are part of the PI might also be affected but this does not apply to this specific example, because Account_Num is the only PI column).

We get the following error when using "coalesce" on the partition column:

3538: A MultiLoad UPDATE Statement is Invalid.


When we eliminate the coalesce only from the partition column, as follows from:

     and coalesce(Account_Summary_Dt, date '9999-12-31')
             = coalesce(:Account_Summary_Dt, '9999-12-31')(Date ,Format 'YYYY-MM-DD')

to:

     and Account_Summary_Dt = :Account_Summary_Dt (Date ,Format 'YYYY-MM-DD')

then the error dissappears, but we are not handling possible null values (on either side of the "=").

We would really appreciate any suggestions on how to solve this issue (without having to totally redesign our existing ETL methodology).

If there is no direct technical solution to this problem, we would have to rethink our approach to our ETL, LDM and PDM (we are using the FSLDM from Teradata), especially with regards to nullable columns. Unfortunately this will mean a lot of rework on our ETL, LDM, PDM and Datastage design.

We have unsuccesfully tried various other approaches such as:

A) OR ColumnA is null AND :ColumnA is null (and variations thereof, using cast, etc)

B) Case statements


We have to accommodate nulls in our physical tables because from the beginning we are fully rolling out the FSLDM to prevent continuous future rework. However, not all columns (including PK and FK columns) can be immediately populated because all our source systems are not ready with all the data yet. Therefore we need nullable columns. We do not want to use default values, because this could change the meaning of our data (for example: birthdate, if populated with default values, could impact on our cutomer age profiles).


Kind regards

Etienne Stieger
Analytics and Information Management (AIM/EIW) Standard Bank of South
Africa Ltd


PS: Partition column as defined on table

     PARTITION BY RANGE_N(Account_Summary_Dt  BETWEEN DATE '2003-01-01' AND
     DATE '2003-12-31' EACH INTERVAL '1' MONTH , DATE '2004-01-01' AND DATE
     '2004-12-31' EACH INTERVAL '1' MONTH , DATE '2005-01-01' AND DATE
     '2005-12-31' EACH INTERVAL '1' MONTH , DATE '2006-01-01' AND DATE
     '2006-12-31' EACH INTERVAL '1' MONTH , DATE '2007-01-01' AND DATE
     '2007-12-31' EACH INTERVAL '1' MONTH ,  NO RANGE OR UNKNOWN);


     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023