Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Mon, 18 Sep 2006 @ 22:49:46 GMT


     
  <Prev Next>  
<<First
<Prev
Next> Last>>  


Subj:   Partition Benchmarking Issues
 
From:   Doshi, Roopali

Hi,

We are doing bench marking on Teradata Partition to determine its impact on query performance improvement. However, we are finding some strange behavior with partition of tables and query performance. There are 2 main issues found. 1) Partitioned not considered in join 2) Partition not considered when used CASE statement in SQL WHERE clause.

The detail explanation is written below. I am hoping that some of you may be able to provide some input.

Issue 1 - Partition not considered in join

While joining ods1_shipment_status_test partitioned table (which is having partition defined on ods1_status_type_id column) with ods1_shipment_stop and ods1_shipment_type tables. The query does not show any performance improvement with paritions or without partition. The execution time is almost same for both query 5.02mins.


Table definition

     dvd1_t.ods1_shipment_status_test

     CREATE MULTISET TABLE dvd1_t.ods1_shipment_status_test ,NO FALLBACK ,
          NO BEFORE JOURNAL,
          NO AFTER JOURNAL,
          CHECKSUM = DEFAULT
          (
           ods1_shipment_id DECIMAL(9,0) NOT NULL,
           ods1_status_type_id DECIMAL(4,0) NOT NULL,
           status_val_gid CHAR(60) CHARACTER SET LATIN NOT CASESPECIFIC NOT
     NULL COMPRESS ('VEH1.APPROVED
     ','VEH1.AWAITING_CALL_OFF_NO
     ','VEH1.AWAITING_CALL_OFF_YES
     ','VEH1.CLOCK NOT TICKING
     ','VEH1.CLOCK TICKING
     ','VEH1.COMPLETE
     ','VEH1.CONTROLLED DELIVERY OFF
     ','VEH1.CONTROLLED DELIVERY ON
     ','VEH1.EVENT MISSING
     ','VEH1.EVENT SEQUENCE ERROR
     ','VEH1.INCOMPLETE MISSING ACTUAL
     ','VEH1.NO MSE STATUS
     ','VEH1.NOT RECEIVED
     ','VEH1.NOT SHIPPED
     ','VEH1.OSV HOLD NO STATUS
     ','VEH1.OSV HOLD OFF
     ','VEH1.OSV HOLD ON
     ','VEH1.RECEIVED
     ','VEH1.RECEIVED MANUAL
     ','VEH1.SHIPMENT HOLD OFF
     ','VEH1.SHIPMENT HOLD ON
     ','VEH1.SHIPMENT NOT MONITORED
     ','VEH1.SHIPMENT ON TIME
     ','VEH1.SHIPMENT ONE DAY LATE
     ','VEH1.SHIPMENT TWO DAYS LATE
     ','VEH1.SHIPPED
     ','VEH1.SHIPPED MANUAL
     ','VEH1.SHP TRAF HOLD OFF
     ','VEH1.SHP TRAF HOLD ON
     ','VEH1.STATUS OK                                              '),
           last_mod_user CHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC NOT
     NULL COMPRESS ('VEH1.EAI            ','VEH1.RXGOEHRICK     '),
           updt_dt TIMESTAMP(6) COMPRESS ,
           insert_dt TIMESTAMP(6) NOT NULL,
           last_mod_load_instance_id DECIMAL(9,0) NOT NULL,
           dw_delete_dt DATE FORMAT 'yyyy-mm-dd' COMPRESS )
     PRIMARY INDEX ods1_shpmt_stat_nupi ( ods1_shipment_id )
     PARTITION BY CASE_N(
     ods1_status_type_id =  63 ,
     ods1_status_type_id =  65 ,
     ods1_status_type_id =  66 ,
     ods1_status_type_id =  67 ,
     ods1_status_type_id =  70 ,
     ods1_status_type_id =  71 ,
     ods1_status_type_id =  72 ,
     ods1_status_type_id =  73 ,
     ods1_status_type_id =  77 ,
     ods1_status_type_id =  78 ,
     ods1_status_type_id =  342 ,
     NO CASE OR UNKNOWN)
     UNIQUE INDEX ods1_shipment_status_pk ( ods1_shipment_id
     ,ods1_status_type_id );

Total number of records in dvd1_t.ods1_shipment_status_test :- 78,080,999


     dvd1_t.ods1_shipment_stop

     CREATE MULTISET TABLE dvd1_t.ods1_shipment_stop ,NO FALLBACK ,
          NO BEFORE JOURNAL,
          NO AFTER JOURNAL,
          CHECKSUM = DEFAULT
          (
           ods1_shipment_id DECIMAL(9,0) NOT NULL,
           stop_nbr DECIMAL(4,0) NOT NULL,
           domain_nm CHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL
     COMPRESS ('GMAT      ','MAT1      ','VEH1      '),
           load_instance_id DECIMAL(9,0) NOT NULL,
           dw_delete_dt DATE FORMAT 'yyyy-mm-dd' COMPRESS ,
     CONSTRAINT ods1_shpmt_stp_shpmt_fk FOREIGN KEY ( ods1_shipment_id )
     REFERENCES WITH NO CHECK OPTION DVD1_T.ODS1_SHIPMENT ( ods1_shipment_id
     ))
     PRIMARY INDEX ods1_shpmt_stp_nupi ( ods1_shipment_id )
     UNIQUE INDEX ods1_shipment_stop_pk ( ods1_shipment_id ,stop_nbr );
     Total number of records in dvd1_t.ods1_shipment_stop :- 47380141
     dvd1_t.ods1_status_type
     CREATE MULTISET TABLE dvd1_t.ods1_status_type ,NO FALLBACK ,
          NO BEFORE JOURNAL,
          NO AFTER JOURNAL,
          CHECKSUM = DEFAULT
          (
           ods1_status_type_id DECIMAL(9,0) NOT NULL,
           status_type_gid VARCHAR(101) CHARACTER SET LATIN NOT CASESPECIFIC,

           object_type VARCHAR(80) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
           descr VARCHAR(500) CHARACTER SET LATIN NOT CASESPECIFIC,
           domain_nm CHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL
     COMPRESS ('GMAT      ','MAT1      ','VEH1      '),
           sequence DECIMAL(2,0) NOT NULL,
           is_external_ind CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
           xid VARCHAR(50) CHARACTER SET LATIN NOT CASESPECIFIC,
           updt_user VARCHAR(128) CHARACTER SET LATIN NOT CASESPECIFIC,
           updt_dt TIMESTAMP(6) COMPRESS ,
           insert_user VARCHAR(128) CHARACTER SET LATIN NOT CASESPECIFIC,
           insert_dt TIMESTAMP(6),
           load_instance_id DECIMAL(9,0) NOT NULL,
           update_load_instance_id DECIMAL(9,0) COMPRESS ,
           dw_delete_dt DATE FORMAT 'yyyy-mm-dd' COMPRESS )
     UNIQUE PRIMARY INDEX ods1_status_type_pk ( ods1_status_type_id );
     Total number of records in dvd1_t.ods1_status_type :- 343
     Query with partition table
     locking row for access
     select
             shp_stat.*
     from
             dvd1_t.ods1_shipment_stop shp_stop,
             dvd1_t.ods1_shipment_status_test shp_stat,
             dvd1_t.ods1_status_type stat_typ
     where
             shp_stop.ods1_shipment_id = shp_stat.ods1_shipment_id
             and shp_stat.ods1_status_type_id =  stat_typ.ods1_status_type_id
             and shp_stop.stop_nbr=2
             and stat_typ.status_type_gid='VEH1.SHP COMPLETE'

Execution Time :- 00:05:02

Output record count :- 8672050


EXPLAIN PLAN

Explanation
--------------------------------------------------
 
  1)First, we lock dvd1_t.shp_stat for access, we lock dvd1_t.shp_stop for access, and we lock dvd1_t.stat_typ for access.  
  2)Next, we do an all-AMPs RETRIEVE step from dvd1_t.shp_stop by way of an all-rows scan with a condition of ("dvd1_t.shp_stop.stop_nbr = 2.") into Spool 2 (all_amps), which is built locally on the AMPs. The size of Spool 2 is estimated with no confidence to be 4,740,640 rows. The estimated time for this step is 19.91 seconds.  
  3)We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of a RowHash match scan, which is joined to dvd1_t.shp_stat by way of a RowHash match scan with no residual conditions. Spool 2 and dvd1_t.shp_stat are joined using a merge join, with a join condition of ("ods1_shipment_id = dvd1_t.shp_stat.ods1_shipment_id"). The input table dvd1_t.shp_stat will not be cached in memory. The result goes into Spool 3 (all_amps), which is redistributed by hash code to all AMPs. The result spool file will not be cached in memory. The size of Spool 3 is estimated with no confidence to be 5,211,485 rows. The estimated time for this step is 1 minute and 40 seconds.  
  4)We do an all-AMPs JOIN step from dvd1_t.stat_typ by way of an all-rows scan with a condition of ("dvd1_t.stat_typ.status_type_gid = 'VEH1.SHP COMPLETE'"), which is joined to Spool 3 (Last Use) by way of an all-rows scan. dvd1_t.stat_typ and Spool 3 are joined using a product join, with a join condition of ("ods1_status_type_id = dvd1_t.stat_typ.ods1_status_type_id"). The result goes into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 529,739 rows. The estimated time for this step is 7.00 seconds.  
  5)Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.  
  ->The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 2 minutes and 6 seconds.  


Query without partition table

     locking row for access
     select
             shp_stat.*
     from
             dvd1_t.ods1_shipment_stop shp_stop,
             dvd1_t.ods1_shipment_status shp_stat,
             dvd1_t.ods1_status_type stat_typ
     where
             shp_stop.ods1_shipment_id = shp_stat.ods1_shipment_id
             and shp_stat.ods1_status_type_id =  stat_typ.ods1_status_type_id

             and shp_stop.stop_nbr=2
             and stat_typ.status_type_gid='VEH1.SHP COMPLETE'

Execution Time :- 00:05:02

Output record count :- 8672050


EXPLAIN PLAN

Explanation
--------------------------------------------------
 
  1)First, we lock dvd1_t.shp_stat for access, we lock dvd1_t.shp_stop for access, and we lock dvd1_t.stat_typ for access.  
  2)Next, we do an all-AMPs JOIN step from dvd1_t.shp_stop by way of a RowHash match scan with a condition of ("dvd1_t.shp_stop.stop_nbr = 2."), which is joined to dvd1_t.shp_stat by way of a RowHash match scan with no residual conditions. dvd1_t.shp_stop and dvd1_t.shp_stat are joined using a merge join, with a join condition of ("dvd1_t.shp_stop.ods1_shipment_id = dvd1_t.shp_stat.ods1_shipment_id"). The input tables dvd1_t.shp_stop and dvd1_t.shp_stat will not be cached in memory, but dvd1_t.shp_stop is eligible for synchronized scanning. The result goes into Spool 2 (all_amps), which is redistributed by hash code to all AMPs. The result spool file will not be cached in memory. The size of Spool 2 is estimated with no confidence to be 30,780,788 rows. The estimated time for this step is 3 minutes and 2 seconds.  
  3)We do an all-AMPs JOIN step from dvd1_t.stat_typ by way of an all-rows scan with a condition of ("dvd1_t.stat_typ.status_type_gid = 'VEH1.SHP COMPLETE'"), which is joined to Spool 2 (Last Use) by way of an all-rows scan. dvd1_t.stat_typ and Spool 2 are joined using a product join, with a join condition of ("ods1_status_type_id = dvd1_t.stat_typ.ods1_status_type_id"). The result goes into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 3,128,817 rows. The estimated time for this step is 41.26 seconds.  
  4)Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.  
  ->The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 3 minutes and 43 seconds.  


Issue 2 - Partition not considered if we use case statement in where clause

Created partition on ods1_ob_status_history_test table on date field event_dt. When we hard code the date in the where clause then it's using partition and query response is fast.

However, if we use 'CASE' statement in where clause then it's not using partition and response is slower than the query without partition. Here are the details

Table Definition

     CREATE MULTISET TABLE dvd1_t.ods1_ob_status_history_test ,NO FALLBACK ,
          NO BEFORE JOURNAL,
          NO AFTER JOURNAL,
          CHECKSUM = DEFAULT
          (
           ods1_order_basis_id DECIMAL(9,0) NOT NULL,
           i_transaction_no DECIMAL(10,0) NOT NULL,
           reporting_gluser VARCHAR(128) CHARACTER SET LATIN NOT CASESPECIFIC,
           bs_event_group_gid VARCHAR(101) CHARACTER SET LATIN NOT CASESPECIFIC,
           bs_contact_nm VARCHAR(120) CHARACTER SET LATIN NOT CASESPECIFIC,
           bs_contact_function_cd VARCHAR(101) CHARACTER SET LATIN NOT CASESPECIFIC,
           event_dt TIMESTAMP(6),
           bs_status_cd_gid VARCHAR(101) CHARACTER SET LATIN NOT CASESPECIFIC,
           servprov_alias_val VARCHAR(101) CHARACTER SET LATIN NOT CASESPECIFIC,
           tm_zone_gid VARCHAR(101) CHARACTER SET LATIN NOT CASESPECIFIC,
           status_level VARCHAR(50) CHARACTER SET LATIN NOT CASESPECIFIC,
           ob_refnum_gid VARCHAR(101) CHARACTER SET LATIN NOT CASESPECIFIC,
           bs_status_reason_cd_gid VARCHAR(101) CHARACTER SET LATIN NOT CASESPECIFIC,
           domain_nm CHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL
     COMPRESS ('GMAT      ','MAT1      ','VEH1      '),
           insert_user VARCHAR(128) CHARACTER SET LATIN NOT CASESPECIFIC,
           insert_dt TIMESTAMP(6),
           updt_user VARCHAR(128) CHARACTER SET LATIN NOT CASESPECIFIC,
           updt_dt TIMESTAMP(6),
           load_instance_id DECIMAL(9,0) NOT NULL,
           dw_delete_dt DATE FORMAT 'yyyy-mm-dd' COMPRESS ,
           ods1_svc_dvd1r_alias_qlfr_id DECIMAL(4,0),
           ods1_order_refnum_qual_id DECIMAL(9,0),
           updt_load_instance_id DECIMAL(9,0) COMPRESS )
     PRIMARY INDEX ordr_bss_stat_hist_nupi ( ods1_order_basis_id )
     PARTITION BY RANGE_N(CAST((event_dt ) AS DATE) BETWEEN DATE '1992-01-01'
     AND DATE '2003-12-31',
     DATE '2004-01-01' AND DATE '2004-12-31' EACH INTERVAL '1' YEAR ,
     DATE '2005-01-01' AND DATE '2005-12-31' EACH INTERVAL '1' YEAR ,
     DATE '2006-01-01' AND DATE '2006-12-31' EACH INTERVAL '1' YEAR ,
     DATE '2007-01-01' AND DATE '2007-12-31' EACH INTERVAL '1' YEAR ,
     DATE '2008-01-01' AND DATE '2008-12-31' EACH INTERVAL '1' YEAR ,
     DATE '2009-01-01' AND DATE '2009-12-31' EACH INTERVAL '1' YEAR ,
     NO RANGE OR UNKNOWN)
     UNIQUE INDEX ob_status_history_pk ( ods1_order_basis_id
     ,i_transaction_no );

Total number of records in dvd1_t.ods1_ob_status_history_test :- 78080999

Query with hard coded date

     select *
     from dvd1_t.ods1_ob_status_history_test
     where event_dt >'2006-01-01 00:00:00'

Execution Time :- 00:00:57

Output record count :- 25831065


EXPLAIN PLAN

Explanation
--------------------------------------------------
 
  1)First, we lock a distinct dvd1_t."pseudo table" for read on a RowHash to prevent global deadlock for dvd1_t.ods1_ob_status_history_test.  
  2)Next, we lock dvd1_t.ods1_ob_status_history_test for read.  
  3)We do an all-AMPs RETRIEVE step from 5 partitions of dvd1_t.ods1_ob_status_history_test with a condition of ("dvd1_t.ods1_ob_status_history_test.event_dt > TIMESTAMP '2006-01-01 00:00:00.000000'") into Spool 1 (group_amps), which is built locally on the AMPs. The input table will not be cached in memory, but it is eligible for synchronized scanning. The result spool file will not be cached in memory. The size of Spool 1 is estimated with no confidence to be 26,368,902 rows. The estimated time for this step is 8 minutes.  
  4)Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.  
  ->The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 8 minutes.  


Query with case statement

     select *
     from dvd1_t.ods1_ob_status_history_test
     where     event_dt >= CASE
             WHEN    extract(month FROM      CURRENT_DATE) <4
                    THEN CAST((TRIM(extract(year FROM        CURRENT_DATE)
     -1) || '-10-01 00:00:00') AS TIMESTAMP(6))
             ELSE CAST((TRIM(extract(year FROM       CURRENT_DATE) ) ||
     '-01-01 00:00:00') AS TIMESTAMP(6))
             END

Execution Time :- 00:02:06

Output record count :- 25831065


EXPLAIN PLAN

Explanation
--------------------------------------------------
 
  1)First, we lock a distinct dvd1_t."pseudo table" for read on a RowHash to prevent global deadlock for dvd1_t.ods1_ob_status_history_test.  
  2)Next, we lock dvd1_t.ods1_ob_status_history_test for read.  
  3)We do an all-AMPs RETRIEVE step from all partitions of dvd1_t.ods1_ob_status_history_test with a condition of ("dvd1_t.ods1_ob_status_history_test.event_dt >= (( CASE WHEN ((EXTRACT(MONTH FROM (DATE )))< 4) THEN ((TRANSLATE(((TRIM(BOTH FROM (EXTRACT(YEAR FROM (DATE )))- 1 (VARCHAR(11), CHARACTER SET LATIN, NOT CASESPECIFIC, FORMAT '-(10)9')))||'-10-01 00:00:00')USING LATIN_TO_UNICODE)) AS TIMESTAMP(6))) ELSE ((TRANSLATE(((TRIM(BOTH FROM EXTRACT(YEAR FROM (DATE))(VARCHAR(11), CHARACTER SET LATIN, NOT CASESPECIFIC, FORMAT '- (10)9')))||'-01-01 00:00:00')USING LATIN_TO_UNICODE)) AS TIMESTAMP(6))) END ))") into Spool 1 (group_amps), which is built locally on the AMPs. The input table will not be cached in memory, but it is eligible for synchronized scanning. The result spool file will not be cached in memory. The size of Spool 1 is estimated with no confidence to be 26,368,902 rows. The estimated time for this step is 5 minutes and 40 seconds.  
  4)Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.  
  ->The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 5 minutes and 40 seconds.  


Regards,

Roopali Doshi
Enterprise Datawarehouse Group



     
  <Prev Next>  
<<First
<Prev
Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023