February 14, 2008 at 5:48 am
Hi,
I have some data which includes a percentage column, where the sum of the percentages adds up to 100.
I want to identify the records with the largest percentages, in such a way that when sorted by percentage descending, the total percent equals (or goes over) 80%.
So I have a dataset that looks like this:
RecIDPCIsTop80
rec16.75
rec24.25
rec31.27
rec49.31
rec57.02
rec64.48
rec78.96
rec83.25
rec97.21
rec107.56
rec118.46
rec123.95
rec133.86
rec146.38
rec152.18
rec161.22
rec174.42
rec189.47
And the results I want look like this:
RecIDPCIsTop80
rec189.471
rec49.311
rec78.961
rec118.461
rec107.561
rec97.211
rec57.021
rec16.751
rec146.381
rec64.481
rec174.421
rec24.250
rec123.950
rec133.860
rec83.250
rec152.180
rec31.270
rec161.220
So when it gets to rec17, the total percentage is 80.02, at which point I want to stop flagging them.
Any ideas how I might do this without a cursor? The real datasets have thousands of rows, and there are thousands of datasets to do this on.
Thanks in advance do your help!
February 14, 2008 at 8:10 am
To get the hits, try
select a.RecID
from sums a, sums b
where b.PC >=a.PC
group by a.RecID
having sum(b.PC) >= 80.0
February 14, 2008 at 8:30 am
oops, sorry . upside down
February 14, 2008 at 8:59 am
Great, seems to do the trick.
Thanks for your help!
February 14, 2008 at 9:28 am
select a.RecID , a.PC ,
case when 100-sum(case when b.PC >a.PC then b.PC end) <20.0 then 0 else 1 end
from sums a, sums b
group by a.RecID, a.PC
rec16 1.22 0
rec3 1.27 0
rec15 2.18 0
rec8 3.25 0
rec13 3.86 0
rec12 3.95 0
rec2 4.25 0
rec17 4.42 1
rec6 4.48 1
rec14 6.38 1
rec1 6.75 1
rec5 7.02 1
rec9 7.21 1
rec10 7.56 1
rec11 8.46 1
rec7 8.96 1
rec4 9.31 1
rec18 9.47 1
February 14, 2008 at 11:48 am
I'll try that one out also, the first method is very slow on my large dataset.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply