Archives of the TeradataForum
Message Posted: Wed, 27 Nov 2002 @ 20:58:14 GMT
Subj: | | Re: MEDIAN - using SQL |
|
From: | | Dieter N�th |
| Is there anyway we can get the median for a set of values using Teradata SQL? If so how? | |
The tyical stuff (look at Joe Celko's SQL For Smarties) proposed for medians is using scalar subqueries or cross joins, so it's
impossible (scalar subqueries) or rather slow (cross joins) especially with larger data sets.
A modified version of Ulrich's query with easier syntax:
select
dt1.med_group
,avg(med_value)
from
(select
med_group
,med_value
,sum(1) over (partition by med_group
order by med_value
rows unbounded preceding) as row_num
from test) as dt1
join
(select
med_group
,count(*) as row_count
from test
group by med_group
) dt2
on dt1.med_group = dt2.med_group
where
row_num = (row_count + 1) / 2
or
row_num = (row_count / 2) + 1
group by dt1.med_group
order by dt1.med_group;
This is for financial median, for statistical median use only one of the where-conditions:
lower value: row_num = (row_count + 1) / 2
higher value: row_num = (row_count / 2) + 1
And another version, this may be faster, especially when test is a complex view instead of a simple table.
select
dt1.med_group
,avg(med_value)
from
(select
med_group
,med_value
,sum(1) over (partition by med_group
order by med_value
rows unbounded preceding) as row_num
,count(*) over (partition by med_group
rows between unbounded preceding
and unbounded following) as row_count
from test) as dt1
where
row_num = (row_count + 1) / 2
or
row_num = (row_count / 2) + 1
group by med_group
order by med_group;
Dieter
|