May 5, 2009 at 7:20 am
Anyone know a more efficient way of doing this. Obviously my code scans the table three times. I imagine there is a more elegant way of doing this.
----create test data
SELECT TOP 1000000
IDENTITY(INT,1,1) AS N
INTO #numbers
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
--- gets 75% and 25% values using parameterised top clauses
declare @max-2 int , @min-2 int
set @max-2 = 25
set @min-2 = 25
select min(n.n), max(n.n)
from #numbers n
left join
(
select n
from (
select top (@max) percent n from #numbers order by n desc
)x
union
select *
from (
select top (@min) percent n from #numbers order by n asc
)x
)y
on n.n = y.n
where y.n is null
drop table #numbers
thanks for all help
May 5, 2009 at 8:58 am
I just tried this:
set nocount on;
if object_id(N'tempdb..#T') is not null
drop table #T;
if object_id(N'tempdb..#T2') is not null
drop table #T2;
create table #T (
ID int identity primary key,
Val int);
insert into #T (Val)
select abs(checksum(newid()))
from dbo.Numbers;
declare @max-2 int, @min-2 int;
select @max-2 = 25, @min-2 = 25;
set statistics time on;
declare @Row1 int, @Row2 int, @Rows float;
select @Rows = count(*)
from #T;
select
@Row1 = @Rows / (100.00/@Max) + 1,
@Row2 = @Rows - @Rows / (100.00/@Min);
select row_number() over (order by Val) as Row, Val
into #T2
from #T;
select T21.Val as BottomVal, T22.Val as TopVal
from #T2 T21, #T2 T22
where T21.Row = @Row1
and T22.Row = @Row2;
set statistics time off;
print '======================================='
set statistics time on;
select min(n.val), max(n.val)
from #T n
left join
(
select n
from (
select top (@max) percent val as n from #T order by val desc
)x
union
select *
from (
select top (@min) percent val as n from #T order by val asc
)x
)y
on n.val = y.n
where y.n is null;
set statistics time off;
The first solution typically took about half the time of the second one on my machine.
Also tried it with IO stats turned on. First solution takes 2 scans, 48 reads on the main table, and 2 scans 54 reads on the secondary table. Second solution takes 3 scans, 72 reads total.
Expanded it out to 1-million rows (from 10-thousand). At that size, the first solution took 2672 milliseconds CPU time, 2335 total. Second solution took 12829 milliseconds CPU, 19636 total. First one took 8 scans, 9416 reads, second one took 9 scans, 6327 reads.
So, if you have a lot of rows, the first solution is probably going to be significantly faster, but the second one has slightly better IO stats (1 more scan, but only 2/3rds of the reads).
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply