Archives of the TeradataForum

Message Posted: Wed, 27 Nov 2002 @ 20:58:14 GMT


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Subj:		Re: MEDIAN - using SQL

From:		Dieter N�th

Is there anyway we can get the median for a set of values using Teradata SQL? If so how?

The tyical stuff (look at Joe Celko's SQL For Smarties) proposed for medians is using scalar subqueries or cross joins, so it's impossible (scalar subqueries) or rather slow (cross joins) especially with larger data sets.

A modified version of Ulrich's query with easier syntax:

select
  dt1.med_group
  ,avg(med_value)
from
  (select
     med_group
     ,med_value
     ,sum(1) over (partition by med_group
                   order by med_value
                   rows unbounded preceding) as row_num
   from test) as dt1
join
   (select
      med_group
      ,count(*) as row_count
    from test
    group by med_group
   ) dt2
on dt1.med_group = dt2.med_group
where
  row_num = (row_count + 1) / 2
or
  row_num = (row_count / 2) + 1
group by dt1.med_group
order by dt1.med_group;

This is for financial median, for statistical median use only one of the where-conditions:

lower value: row_num = (row_count + 1) / 2
higher value: row_num = (row_count / 2) + 1

And another version, this may be faster, especially when test is a complex view instead of a simple table.

select
  dt1.med_group
  ,avg(med_value)
from
  (select
     med_group
     ,med_value
     ,sum(1) over (partition by med_group
                   order by med_value
                   rows unbounded preceding) as row_num
     ,count(*) over (partition by med_group
                     rows between unbounded preceding
                          and unbounded following) as row_count
   from test) as dt1
where
  row_num = (row_count + 1) / 2
or
  row_num = (row_count / 2) + 1
group by med_group
order by med_group;

Dieter


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference