September 30, 2007 at 7:28 am
Comments posted to this topic are about the item relatively simple mode, median, mean
January 26, 2009 at 3:48 pm
Good post, I needed some mode and median calculations in a short time frame, and your SQL was invaluable. Thanks.
April 3, 2009 at 12:03 pm
There's only one problem with the calculation of the median: if the row count of the sample data is even, then the median is not unique (see http://en.wikipedia.org/wiki/Median). Here's a version of the calc that accounts for even row counts:
;WITH
TopHalf AS (
SELECT TOP 50 PERCENT DataValue
FROM DataSet
ORDER BY DataValue ASC
)
,BottomHalf AS (
SELECT TOP 50 PERCENT DataValue
FROM DataSet
ORDER BY DataValue DESC
)
,BottomOfTopHalf AS (
SELECT TOP 1 DataValue
FROM TopHalf
ORDER BY DataValue DESC
)
,TopOfBottomHalf AS (
SELECT TOP 1 DataValue
FROM BottomHalf
ORDER BY DataValue ASC
)
SELECT
Median = (BottomOfTopHalf.DataValue + TopOfBottomHalf.DataValue) / 2.0
FROM BottomOfTopHalf, TopOfBottomHalf
September 9, 2009 at 10:04 am
I need something similar, but I need the daily median of a data set, effectively doing a group by on the Date.
Only way I've found to do this is with a cursor was wondering if any clever chaps out there have a better idea?
September 10, 2009 at 2:26 pm
leea (9/9/2009)
I need something similar, but I need the daily median of a data set, effectively doing a group by on the Date.Only way I've found to do this is with a cursor was wondering if any clever chaps out there have a better idea?
I'm guessing that you're running this against data that is historically static, correct? If so, how about creating a table to hold the results of your daily median calculation? That way you don't have to rerun it for historic data every time a user needs it.
June 10, 2010 at 1:12 pm
Very good catch Jesse!
Jesse McLain (4/3/2009)
There's only one problem with the calculation of the median: if the row count of the sample data is even, then the median is not unique (see http://en.wikipedia.org/wiki/Median). Here's a version of the calc that accounts for even row counts:
;WITH
TopHalf AS (
SELECT TOP 50 PERCENT DataValue
FROM DataSet
ORDER BY DataValue ASC
)
,BottomHalf AS (
SELECT TOP 50 PERCENT DataValue
FROM DataSet
ORDER BY DataValue DESC
)
,BottomOfTopHalf AS (
SELECT TOP 1 DataValue
FROM TopHalf
ORDER BY DataValue DESC
)
,TopOfBottomHalf AS (
SELECT TOP 1 DataValue
FROM BottomHalf
ORDER BY DataValue ASC
)
SELECT
Median = (BottomOfTopHalf.DataValue + TopOfBottomHalf.DataValue) / 2.0
FROM BottomOfTopHalf, TopOfBottomHalf
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply