Percentages performance

  • Anyone know a more efficient way of doing this. Obviously my code scans the table three times. I imagine there is a more elegant way of doing this.

    ----create test data

    SELECT TOP 1000000

    IDENTITY(INT,1,1) AS N

    INTO #numbers

    FROM Master.dbo.SysColumns sc1,

    Master.dbo.SysColumns sc2

    --- gets 75% and 25% values using parameterised top clauses

    declare @max-2 int , @min-2 int

    set @max-2 = 25

    set @min-2 = 25

    select min(n.n), max(n.n)

    from #numbers n

    left join

    (

    select n

    from (

    select top (@max) percent n from #numbers order by n desc

    )x

    union

    select *

    from (

    select top (@min) percent n from #numbers order by n asc

    )x

    )y

    on n.n = y.n

    where y.n is null

    drop table #numbers

    thanks for all help

    www.sql-library.com[/url]

  • I just tried this:

    set nocount on;

    if object_id(N'tempdb..#T') is not null

    drop table #T;

    if object_id(N'tempdb..#T2') is not null

    drop table #T2;

    create table #T (

    ID int identity primary key,

    Val int);

    insert into #T (Val)

    select abs(checksum(newid()))

    from dbo.Numbers;

    declare @max-2 int, @min-2 int;

    select @max-2 = 25, @min-2 = 25;

    set statistics time on;

    declare @Row1 int, @Row2 int, @Rows float;

    select @Rows = count(*)

    from #T;

    select

    @Row1 = @Rows / (100.00/@Max) + 1,

    @Row2 = @Rows - @Rows / (100.00/@Min);

    select row_number() over (order by Val) as Row, Val

    into #T2

    from #T;

    select T21.Val as BottomVal, T22.Val as TopVal

    from #T2 T21, #T2 T22

    where T21.Row = @Row1

    and T22.Row = @Row2;

    set statistics time off;

    print '======================================='

    set statistics time on;

    select min(n.val), max(n.val)

    from #T n

    left join

    (

    select n

    from (

    select top (@max) percent val as n from #T order by val desc

    )x

    union

    select *

    from (

    select top (@min) percent val as n from #T order by val asc

    )x

    )y

    on n.val = y.n

    where y.n is null;

    set statistics time off;

    The first solution typically took about half the time of the second one on my machine.

    Also tried it with IO stats turned on. First solution takes 2 scans, 48 reads on the main table, and 2 scans 54 reads on the secondary table. Second solution takes 3 scans, 72 reads total.

    Expanded it out to 1-million rows (from 10-thousand). At that size, the first solution took 2672 milliseconds CPU time, 2335 total. Second solution took 12829 milliseconds CPU, 19636 total. First one took 8 scans, 9416 reads, second one took 9 scans, 6327 reads.

    So, if you have a lot of rows, the first solution is probably going to be significantly faster, but the second one has slightly better IO stats (1 more scan, but only 2/3rds of the reads).

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply