Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Tue, 30 Jun 2003 @ 23:30:45 GMT


     
  <Prev Next>   <<First <Prev Next> Last>>  


Subj:   Re: BMSMS
 
From:   Matthew Winter

Hi,

Sorry about that, I mixed my terminology up.

What I was attempting to say (based on my limited understanding), is that when BMSMS is performed, basically the filters specified are used to calculate the ROW HASH values, which are then used to retrieve the ROWID values from each of the NUSI's. The values returned are then put through the bitmapping.

Now the reason for doing this, is to turn two possible low selective indexes into a single high selective index.

Now this still holds true when a small table is joined to one of the indexes. As shown in the example, the CALENDAR table will return a single record based on the filter. The ROW HASH value will then be calculated from the join column, and used to retrieve the ROWID values from the NUSI index.


  From what I can see, there is no reason why if a large table existed with secondary indexes place on each of the foreign keys. When a query is then executed, joining the large table to a series of small tables, via the foreign keys, which indexes have been created on. Why the RDBMS could not perform the small table join to each of the secondary index tables, creating a spool file for each, which would in effect contain the ROWID and any selected columns from small table. Then each of these spool tables could in effect be joined together into one via the ROWID after some form of bitmapping had occured.  


Now this would only work for small table large table join scenerios, which have been joined using INNER JOIN's. But to me it would seem a very good optimisation for anyone doing star schema type designs.

Another approach to this, would be to use JOIN INDEX's, where for the example I gave prior to this, we would create single table join indexes on the following select statements:

SELECT (C), (A) FROM TEST_TABLE;

SELECT (D), (A) FROM TEST_TABLE;


In the above I have used he primary key, rather than ROWID, since it is an INTEGER value, it would take up less physical space, and therefore scan quicker.

If each of the join indexes was distributed on the leading value, e.g. (C) and (D), then the small tables would be AMP-located to the join index and therefore join quickly. Once joined the optimiser would automatically know that the (A) within the join index is the Primary Key, or alternatively using the ROWID. Redistribute the data accordingly, join the two small tables together via the (A) value, before performing an AMP-Local join to the large table.

In theory if the main table, was large enough, then this process could be done for many joining tables, reducing the requirement to redistribute the main table onto what could be a highly skewed distribution.

Now I am sure that all of this has been thought of in the past, but I would be interested to know if it is possible.

I appologise now for my ranting, but when provided a soap box....


Regards

Matthew Winter



     
  <Prev Next>   <<First <Prev Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023