Archives of the TeradataForum
Message Posted: Tue, 19 Apr 2011 @ 10:20:59 GMT
Select CL.CL_ID, CL.CLAIM_NBR , CL.CLAIM_ID, CL.DIAG_CD , MB.MBR_NBR , MB.MBR_NAME FROM CLAIM CL INNER JOIN MEMBER MB ON CL.MB_ID = MB.MB_ID WHERE CL.SVC_DT BETWEEN
--700 Million Records
--History maintained for 10 Years
--CLAIM_NBR is the natural key from the Source systems and CL_ID is a surrogate KEY which is unique --CLAIM_NBR can repeat in multiple instances of the CLAIM and CL_ID will always be Unique.
--CLAIM table has lot of other attributes (Close to 100 columns)
2 Million Records
History maintained for last 10 Years
--MBR_NBR is the natural key from the Source systems and MB_ID is a surrogate KEY which is unique --MBR_NBR can repeat in multiple instances of the MEMBER and MB_ID will always be Unique.
I would need some advice from the experts on a hypothetical problem I am trying to understand. Above Query is a very frequenttly used query which joins two Tables CLAIMS and MEMBER. I have to design the table structures for CLAIMS and MEMBER in way that this query performs optimally. Following are the two kind of decisions which comes to my mind. Please suggest which is the better one and also tell me if there can be a better approach.
Set the MB_ID as PRIMARY INDEX in both CLAIM and MEMBER tables to avoid Data re-distribution.
Partition the Primary Index in PPI way with Data partitioned on SVC_DT on Daily basis. This way the table will have approx. 10 Yrs X 365 Days = 3650 Partitions in all. This will avoid the use of a Secondary index will help in avoiding FTS in range queries.
Choosing a MB_ID primary index on CLAIMS table may have a negative impact on the Data Storage across AMPS since not all Members might have submitted claims so might end up with SKEW.
Set the MB_ID as PRIMARY INDEX of MEMBER Set the CL_ID as PRIMARY INDEX of CLAIM Create a Join Index on CL.MB_ID = MB.MB_ID Partition the Primary Index in PPI way with Data partitioned on SVC_DT on Daily basis. This way the table will have approx. 10 Yrs X 365 Days = 3650 Partitions in all. This will avoid the use of a Secondary index will help in avoiding FTS in range queries.
Any suggestions would be appreciated.
|Copyright 2016 - All Rights Reserved|
|Last Modified: 28 Jun 2020|