September 30, 2007 at 7:28 am
Comments posted to this topic are about the item relatively simple mode, median, mean
January 26, 2009 at 3:48 pm
Good post, I needed some mode and median calculations in a short time frame, and your SQL was invaluable. Thanks.
April 3, 2009 at 12:03 pm
There's only one problem with the calculation of the median: if the row count of the sample data is even, then the median is not unique (see http://en.wikipedia.org/wiki/Median). Here's a version of the calc that accounts for even row counts:
;WITH
TopHalf AS (
SELECT TOP 50 PERCENT DataValue
FROM DataSet
ORDER BY DataValue ASC
)
,BottomHalf AS (
SELECT TOP 50 PERCENT DataValue
FROM DataSet
ORDER BY DataValue DESC
)
,BottomOfTopHalf AS (
SELECT TOP 1 DataValue
FROM TopHalf
ORDER BY DataValue DESC
)
,TopOfBottomHalf AS (
SELECT TOP 1 DataValue
FROM BottomHalf
ORDER BY DataValue ASC
)
SELECT
Median = (BottomOfTopHalf.DataValue + TopOfBottomHalf.DataValue) / 2.0
FROM BottomOfTopHalf, TopOfBottomHalf
September 9, 2009 at 10:04 am
I need something similar, but I need the daily median of a data set, effectively doing a group by on the Date.
Only way I've found to do this is with a cursor was wondering if any clever chaps out there have a better idea?
September 10, 2009 at 2:26 pm
leea (9/9/2009)
I need something similar, but I need the daily median of a data set, effectively doing a group by on the Date.Only way I've found to do this is with a cursor was wondering if any clever chaps out there have a better idea?
I'm guessing that you're running this against data that is historically static, correct? If so, how about creating a table to hold the results of your daily median calculation? That way you don't have to rerun it for historic data every time a user needs it.
June 10, 2010 at 1:12 pm
Very good catch Jesse!
Jesse McLain (4/3/2009)
There's only one problem with the calculation of the median: if the row count of the sample data is even, then the median is not unique (see http://en.wikipedia.org/wiki/Median). Here's a version of the calc that accounts for even row counts:
;WITH
TopHalf AS (
SELECT TOP 50 PERCENT DataValue
FROM DataSet
ORDER BY DataValue ASC
)
,BottomHalf AS (
SELECT TOP 50 PERCENT DataValue
FROM DataSet
ORDER BY DataValue DESC
)
,BottomOfTopHalf AS (
SELECT TOP 1 DataValue
FROM TopHalf
ORDER BY DataValue DESC
)
,TopOfBottomHalf AS (
SELECT TOP 1 DataValue
FROM BottomHalf
ORDER BY DataValue ASC
)
SELECT
Median = (BottomOfTopHalf.DataValue + TopOfBottomHalf.DataValue) / 2.0
FROM BottomOfTopHalf, TopOfBottomHalf
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy