Archives of the TeradataForum
Message Posted: Wed, 09 May 2001 @ 20:55:03 GMT
Here's a couple more SQL's for the same purpose:
/***** * NAME: Distribution - Record Count * PURPOSE: This calculates where records would be placed on the AMPs, based on a specific key. *****/ SELECT HASHAMP(HASHBUCKET(HASHROW(OrgCode, AgencyCode, FiscalYear, ObjectCode))) AS AmpNo, COUNT (*) FROM budget GROUP BY 1 ORDER BY 1; /***** * NAME: Distribution - Mode * PURPOSE: This determines the "mode" of the distribution of a candidate index. * NOTE: The mode is the score in the population that occurs most frequently. * (Internet Glossary of Statistical Terms: http://www.animatedsoftware.com/statglos/statglos.htm) *****/ SELECT NumberOfRecords, Occurs FROM (SELECT NumberOfRecords, COUNT(*) AS Occurs FROM (SELECT COUNT(*) AS NumberOfRecords FROM budget GROUP BY OrgCode, AgencyCode, FiscalYear, ObjectCode) AS Temp GROUP BY 1) AS Temp2 QUALIFY RANK(Occurs) = 1; /***** * NAME: ATM Statistics * PURPOSE: *****/ SELECT MAX(TheCount) AS MaxRecs, AVG(TheCount) AS AvgRecs, COUNT(*) AS DistinctVals FROM (SELECT COUNT(*) AS TheCount FROM budget GROUP BY OrgCode, AgencyCode, FiscalYear, ObjectCode) AS Temp;
The mode, max and number of distinct values are used in (what used to be called) the ATM process (gosh, I'm old). The max and mode should be (current rule of thumb) less than 100, and the number of distinct values high. Oh well, somebody else do the lecture on how Teradata works, I'm tired.
Frank C. Martinez IV
|Copyright 2016 - All Rights Reserved|
|Last Modified: 28 Jun 2020|