December 13, 2018 at 12:24 pm
Question: is PERCENTILE_DISC for median calculation works for both odd and even number of records?
In SQL Server 2008 to calculate median we used this code based on modulo (@modulo was calculated before this)
case @modulo
when 1 then
(SELECT MAX([TimeInDepartmentMinutes]) FROM
(SELECT TOP 50 PERCENT [TimeInDepartmentMinutes] FROM dbo.vw_ED_Time_Calc
where ArrivalDateTime >= @StartDate and ArrivalDateTime <= @EndDate and Psyc ='yes' and HospitalName = @Hospital
ORDER BY [TimeInDepartmentMinutes]) AS Median)
else -- even
(SELECT
( (SELECT MAX([TimeInDepartmentMinutes]) FROM
(SELECT TOP 50 PERCENT [TimeInDepartmentMinutes]
FROM dbo.vw_ED_Time_Calc where ArrivalDateTime >= @StartDate and ArrivalDateTime <= @EndDate and Psyc ='yes' and HospitalName = @Hospital
ORDER BY 1--==>
) AS m1)
+ (
SELECT MIN([TimeInDepartmentMinutes]) FROM
(SELECT TOP 50 PERCENT [TimeInDepartmentMinutes] FROM dbo.vw_ED_Time_Calc
where ArrivalDateTime >= @StartDate and ArrivalDateTime <= @EndDate and Psyc ='yes' and HospitalName = @Hospital
ORDER BY 1 DESC --==>
) AS m2)
) / 2 AS Me)
end AS Median_LOS_All_Psyc_Patients
December 17, 2018 at 2:36 pm
That depends on your desired value. The median value computed by PERCENTILE_DISC has to actually come from the data, so the number of values is irrelevant in that regard only. After all, if you only have two values, which one is the median? Thus there is the potential to have a meaningless result unless you also count the number of rows from which that value has been selected as the median. PERCENTILE_CONT will compute the actual median value, which will not necessarily be represented in the data. Thus the meaning of _DISC is "discrete", and of _CONT is "continuous". Does that help answer the question?
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
December 17, 2018 at 5:20 pm
The OP actually made the mistake of posting their question in two separate posts. This question has been beat to death in the other post, which is located at the following URL...
https://www.sqlservercentral.com/Forums/2013502/PERCENTILEDISC-for-median-calculation
--Jeff Moden
Change is inevitable... Change for the better is not.
December 18, 2018 at 8:41 am
sgmunson - Monday, December 17, 2018 2:36 PMThat depends on your desired value. The median value computed by PERCENTILE_DISC has to actually come from the data, so the number of values is irrelevant in that regard only. After all, if you only have two values, which one is the median? Thus there is the potential to have a meaningless result unless you also count the number of rows from which that value has been selected as the median. PERCENTILE_CONT will compute the actual median value, which will not necessarily be represented in the data. Thus the meaning of _DISC is "discrete", and of _CONT is "continuous". Does that help answer the question?
Thank you very much.. So we should use PERCENTILE_CONT for median.
We discussed it with Jeff. I posted the question to this forum first. After a couple of days with no response I posted it on 2008.
Should I remove my post?
December 18, 2018 at 10:19 am
valeryk2000 - Tuesday, December 18, 2018 8:41 AMsgmunson - Monday, December 17, 2018 2:36 PMThat depends on your desired value. The median value computed by PERCENTILE_DISC has to actually come from the data, so the number of values is irrelevant in that regard only. After all, if you only have two values, which one is the median? Thus there is the potential to have a meaningless result unless you also count the number of rows from which that value has been selected as the median. PERCENTILE_CONT will compute the actual median value, which will not necessarily be represented in the data. Thus the meaning of _DISC is "discrete", and of _CONT is "continuous". Does that help answer the question?Thank you very much.. So we should use PERCENTILE_CONT for median.
We discussed it with Jeff. I posted the question to this forum first. After a couple of days with no response I posted it on 2008.
Should I remove my post?
No. You're alright. Guess we can't blame someone when they don't get a response after a couple of days and some people go nuts when someone bumps their post. I also made the mistake of not verifying the original posting date of the two posts before I called them out as a "duplicate".
--Jeff Moden
Change is inevitable... Change for the better is not.
December 18, 2018 at 10:23 am
For anyone interested in the conversation on the other post, here's the link...
https://www.sqlservercentral.com/Forums/2013502/PERCENTILEDISC-for-median-calculation
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply