Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Tue, 02 Dec 2003 @ 13:39:00 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: Space occupied by indexes
 
From:   Vivek Pandey

Rachana Agarwal wrote :

  Can anyone tell me how can we know the total amount of space occupied by an index(primary and secondary) on a database table.  



I write :

1. In Teradata Primary Index do not need any extra space.

2.1 For USI

The following formula can be used to estimate size of a USI subtable.

p = Number of data rows in the base table - same as USI subtable

k = Length (in bytes) of a fixed length USI value (or the average length of a variable USI value)

po = Presence bit overhead (ceiling): ((1 + number of nullable USI fields) / 8); if none, 0

vo = Variable length field overhead: (number of variable length USI fields + 1) * 2.


For example

Row length              2
USI RowID               8
Spare0                  1
Presence                1
Offsets                 6
Base Table RowID        8
Reference Array         2
                     ----
                       28

USI subtable (no fallback) size = p * ( po + vo + k + 28)
USI subtable (fallback) size = 2p * ( po + vo + k + 28)

The offsets are used as following:

Offset[0] is the offset of the first byte for the USI value
Offset[1] is the offset where the base table RowID is found
Offset[2] is the offset after the base table RowID.


2.2 For NUSI...

The following information is needed to calculate the size of a NUSI subtable.

* (Row Count * 8) is derived from the 8 bytes of Row ID which the subtable stores for each row in the base table. This gives us the total number of bytes devoted to base table Row IDs.

* (# distinct values) is an estimate of the number of NUSI subtable rows since a NUSI subtable contains at least one index row per AMP for each distinct index value in the base table on that AMP.

* The 20 bytes of overhead per subtable row comes from the same place that the 28 bytes did for USIs. The offsets are used as following:

Offset[0] is the offset of the first byte for the NUSI value.
Offset[1] is the offset where the first base table RowID is found.
Offset[2] is the offset after the last base table RowID is found.

* MIN(NumAMPs, RowsPerValue) is the minimum of the two (see Case 1 and Case 2 below).


Case 1: NumAMPs < RowsPerValue
If there are fewer AMPs than RowsPerValue, at least one row from each NUSI value will probably be distributed to each AMP. This means the subtable on each AMP will contain a row for every NUSI value.

Case 2: NumAMPs > RowsPerValue
If there are more AMPs than RowsPerValue, at least some AMPs will be missing some NUSI values. This means that some AMPs will not have a subtable row for every NUSI value.


Thanks, Vivek.



     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023