Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Thu, 12 Dec 2002 @ 20:31:16 GMT


     
  <Prev Next>   <<First <Prev
Next>
Last>>
 


Subj:   Re: SkewFactor in WIN DDI
 
From:   Frank C. Martinez IV

It's a reflection of the way the table is distributed on the amps. I use it as a quick guide to tables I should look at with my own tools to see "skewedness". My own tools for determining skewedness start with the following queries:

SELECT COUNT(*)
     FROM STG_ORDER_LINE;

/*****
 * NAME: ATM Statistics
 * PURPOSE: This calculates the maximum and average number of records
 * per value and the number of distinct values, for use in index
 * selection analysis.
 *****/
SELECT MAX(TheCount)   AS MaxRecs,
       AVG(TheCount)   AS AvgRecs,
       COUNT(*)        AS DistinctVals

       FROM (SELECT COUNT(*) AS TheCount
                    FROM STG_ORDER_LINE
                    GROUP BY sbu_id, eff_dt) AS Temp;

/*****
 * NAME:  Distribution - Mode
 * PURPOSE: This determines the "mode" of the distribution of a
 * specific index.
 * NOTE:  The mode is the score in the population that occurs most
 * frequently.
 * (Internet Glossary of Statistical Terms:
 *       http://www.animatedsoftware.com/statglos/sgmode.htm)
 *****/
SELECT NumberOfRecords, Occurs
     FROM (SELECT NumberOfRecords, COUNT(*) AS Occurs
                FROM (SELECT COUNT(*) AS NumberOfRecords
                           FROM STG_ORDER_LINE
                           GROUP BY sbu_id, eff_dt) AS Temp
                GROUP BY 1) AS Temp2
     QUALIFY RANK(Occurs) = 1;

/*****
 * NAME:  Distribution - Record Count by AMP
 * PURPOSE: This calculates where records would be placed on the
 * AMPs, based on a specific index.
 *****/
SELECT HASHAMP(HASHBUCKET(HASHROW(sbu_id, eff_dt))) AS AmpNo,
       COUNT (*)
     FROM STG_ORDER_LINE
     GROUP BY 1
    ORDER BY 2 DESC;

/*****
 * NAME:  Distribution - Exceeds 200
 * PURPOSE: This shows the distribution of records with 200 or more
 * records per specific index.
 *****/
SELECT NumberOfRecords, COUNT(*) AS Occurs
     FROM (SELECT COUNT(*) AS NumberOfRecords
                FROM STG_ORDER_LINE
                GROUP BY sbu_id, eff_dt) AS Temp
     WHERE NumberOfRecords > 199
     GROUP BY 1
     ORDER BY 1 DESC;

But I do use the WINDDI skewed number, along with the amount of data in the table and how much the table is used to give me a "feel" for whether as a dba I'd better watch that one for causing hot amps or similar headaches.

iv



     
  <Prev Next>   <<First <Prev
Next>
Last>>
 
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023