Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Fri, 27 Feb 2004 @ 11:09:08 GMT


     
  <Prev Next>  
<<First
<Prev
Next> Last>>  


Subj:   Standard Deviation formula
 
From:   TONGS

Hello,

I am confused by the Teradata functions of Standard Deviations. I even create sample table and data for verification, and results show that the Teradata version of Standard Deviation does not match up. However, I reserve my opinion on this due to the fact that I may need to convert the values of the data sample to "Float" or some other types. I attach the DDL statements at the end of my question, section 2. There was a similiar post before, but the author used simple data samples and so it looked correct. This time, I try to make it more complex.

There are three pictures in the Attachment area:

SD.jpg SD.jpg

STDDEV_POP.jpg STDDEV_POP.jpg

STDDEV_SAMP.jpg STDDEV_SAMP.jpg


SD.jpg explains the generic equation of Standard Deviation, while STDDEV_POP.jpg and STDDEV_SAMP.jpg are the equations Teradata use to compute for standard deviation. My question is, which equation in (STDDEV_POP.jpg, STDDEV_SAMP) is equivalent to the SD.jpg ? They looked different. Is Teradata making any assumptions in deriving its own standard deviation formula ? In SD.jpg, the variable "x" with a bar above is the MEAN of "x".

There are 2 sections below. One is the descriptions of what I learned about SD, and another is the DDL I used to create a sample table with values.


Section 1)

Here is what I learn about Standard Deviation (SD)

Terms to know

X = one value in your set of data

Avg (X) = the mean (average) of all values X in the set of data

N = the number of values X in the set of data

Y = Sum of ( Xi - Avg(X) )**2


SD: For each value X, subtract the overall Avg (X) from X, then multiply that result by itself (or the square of that value). Sum up all those squared values, called this Y. Then divide that result by N, and then Square Root of this result.

SD = Square Root ( Y/N )


Section 2)

Here are 6 data samples: (1, 2, 3, 4, 4.5, 2.8) . My SD formlua gives SQRT( 8.208334 / 6 ) = 1.1696. Both Teradata functions give value of "1". Surprise?

CREATE SET TABLE TONGTEST ,NO FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL
     (
      SAMPLE_DATA DECIMAL(18,2) NOT NULL,
      STATUS CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC
      )
PRIMARY INDEX ( STATUS );

insert into TONGTEST
VALUE (1, 'Y')

insert into TONGTEST
VALUE (2, 'Y')

insert into TONGTEST
VALUE (3, 'Y')

insert into TONGTEST
VALUE (4, 'Y')

insert into TONGTEST
VALUE (4.5, 'Y')

insert into PCDW_EBI_UAT.TONGTEST
VALUES (2.8, 'Y')

Finally,

SELECT STDDEV_SAMP(cast(SAMPLE_DATA as float))
FROM        PCDW_EBI_UAT.TONGTEST


     
  <Prev Next>  
<<First
<Prev
Next> Last>>  
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023