Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Wed, 29 Sep 2004 @ 14:40:07 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: Extra secondary index on large table
 
From:   Pandey, Vivek

Sizing USIs
-----------

The following formula provides you with a means of calculating the amount of space required for a Unique Secondary Index.

Since there is one USI row for every base table row, the Row Count is the same as the number of rows in the base table. The Index Value Size is the size of the column(s) that the index is on and varies depending on the USI. The 28 bytes in the formula include components listed on the facing page.

The following formula can be used to estimate size of a USI subtable.

p = Number of data rows in the base table - same as USI subtable

k = Length (in bytes) of a fixed length USI value (or the average length of a variable USI value)

po = Presence bit overhead (ceiling): ((1 + number of nullable USI fields) / 8); if none, 0

vo = Variable length field overhead: (number of variable length USI fields + 1) * 2.


For e.g.

     Row length              2
     USI RowID               8
     Spare0                  1
     Presence                1
     Offsets                 6
     Base Table RowID        8=09
     Reference Array 2
                          -------
                            28

     USI subtable (no fallback) size = p * ( po +  vo + k + 28)
     USI subtable (fallback) size    = 2p * ( po +  vo + k + 28)

     The offsets are used as following:

             Offset[0] is the offset of the first byte for the USI value
             Offset[1] is the offset where the base table RowID is found
             Offset[2] is the offset after the base table RowID.

Sizing NUSIs
-------------

The following information is needed to calculate the size of a NUSI subtable.

* (Row Count * 8) is derived from the 8 bytes of Row ID which the subtable stores for each row in the base table. This gives us the total number of bytes devoted to base table Row IDs.

* (# distinct values) is an estimate of the number of NUSI subtable rows since a NUSI subtable contains at least one index row per AMP for each distinct index value in the base table on that AMP.

* The 20 bytes of overhead per subtable row comes from the same place that the 28 bytes did for USIs. The offsets are used as following:

Offset[0] is the offset of the first byte for the NUSI value.

Offset[1] is the offset where the first base table RowID is found.

Offset[2] is the offset after the last base table RowID is found.


* MIN(NumAMPs, RowsPerValue) is the minimum of the two (see Case 1 and Case 2 below).


Case 1: NumAMPs < RowsPerValue

If there are fewer AMPs than RowsPerValue, at least one row from each NUSI value will probably be distributed to each AMP. This means the subtable on each AMP will contain a row for every NUSI value.


Case 2: NumAMPs > RowsPerValue

If there are more AMPs than RowsPerValue, at least some AMPs will be missing some NUSI values. This means that some AMPs will not have a subtable row for every NUSI value.



     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023