May 30, 2007 at 3:44 pm
We've got some student coders working with a contractor to mine our data, in order to generate reports. One thing they do a lot of is calculate the median value of a column within a table or a SELECT statement. The way they do that is to sort the data with a SELECT and then go through the sorted data looking for median, or calculating it if the number of items is even. I review their work to make certain that they're using the correct data, etc. Their practice is to use cursors in order to iterate through the resultset to calculate the median value for the column.
However, one of their last stored procedures that I am review, is really huge. If I were to print it out, it would be 133 pages long. I've counted the number of cursors that they are opening up and found there to be about 100 cursors in the SP.
I know that cursors tend to be resource hogs, and I try to avoid them whenever possible. I have to wonder how much of a resource hog this SP, with about 100 cursor operations is going to be. Which makes me wonder, is there any better way to calculate the median value of a table or SELECT statement, without having to resort to cursors?
Kindest Regards, Rod Connect with me on LinkedIn.
May 30, 2007 at 5:27 pm
I am assuming (bad habit) that the data in this case is numeric. This actually sounds like a good use for using CLR integration and write some procedures/functions in VB.NET or C# to compute median values.
May 30, 2007 at 5:33 pm
I'm not sure where I got this (or how helpful it will be) but the following sql computes a median without cursors. The application was, if I remebmer correctly, calculating the median processing delay between transactions on multiple registers in a POS application.
select median = avg(cast(processing_delay as float)), [#rowsIn Group] = counter
from (select cnt1 = counter/2 +1, cnt2 = CASE counter%2 when 0 then counter/2 else 0 end,
resister_no counter from (select counter = count(*), register_no from pos_delays group by register_no) t1) t2
inner join pos_delays t3 on t3.register_no = t2.register_no
where (select count(*) from pos_delays t4 where ((t3.processing_delay > t4.processing_delay)
or (t3.processing_delay = t4.processing_delay
and t3.transcation_no >= t4.transaction_no))
and t3.register_no = t4.register_no) in (cnt1,cnt2)
group by t2.register_no, counter
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply