need better way than row_number()

  • I need to order 2 columns on a 10m+ row table with no indexes, and the query I have runs about 8 mins, how can I speed it up. thanks.

    create table tmpT

    (

    model char(2),

    date1 datetime,

    date2 datetime,

    date3 datetime

    )

    insert into tmpT

    select 'm1', getdate() - 7, getdate() - 5, getdate() -4 union all

    select 'm1', getdate() - 6, getdate() - 5 , getdate() -4 union all

    select 'm1',getdate() - 5, getdate() - 5 , getdate() -4 union all

    select 'm1',getdate() - 4, getdate() - 5 , getdate() -4 union all

    select 'm1',getdate() - 3, getdate() - 5 , getdate() -4 union all

    select 'm1',getdate() - 2, getdate() - 5 , getdate() -4 union all

    select 'm1',getdate() - 1, getdate() - 5 , getdate()-4 union all

    select 'm2', getdate() - 7, getdate() - 5, getdate() -4 union all

    select 'm2', getdate() - 6, getdate() - 5 , getdate() -4 union all

    select 'm2',getdate() - 5, getdate() - 5 , getdate() -4 union all

    select 'm2',getdate() - 4, getdate() - 5 , getdate() -4 union all

    select 'm2',getdate() - 3, getdate() - 5 , getdate() -4

    select * from tmpT

    select

    t.model,

    t.date1,

    t.date2,

    t.date3,

    case when t.date1 < t.date2

    then 0

    when t.date1 = t.date2

    then 1

    else row_1

    end as c1,

    case when t.date1 < t.date3

    then 0

    when t.date1 = t.date3

    then 1

    else row_2

    end as c2

    from tmpT t left join (

    select row_number() over (partition by model order by date1) + 1 as row_1, * from tmpT where date1 > date2

    ) t1

    on t.model = t1.model and t.date1 = t1.date1

    left join (

    select row_number() over (partition by model order by date1) + 1 as row_2, * from tmpT where date1 > date3

    ) t2

    on t.model = t2.model and t.date1 = t2.date1

  • Create a clustered index on (model, date1) ?

    That alone would likely have a massive performance impact.

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

  • Perhaps this isn't the actual SQL you're using? Your 't1' and 't2' sub-queries don't contain a 'Model' column - and neither alias is used in the rest of the query. As it stands, it appears the two left-joins can be removed. Additionally, the 'row_1' and 'row_2' columns referred to in the CASE statements are not defined anywhere in the query. Or, am I missing something?

  • You're missing the else's in the case statements that point to the rownumbers in the left joins. You can actually copy/paste/run that in SSMS to see what it does(although the 'why' is very much still a mystery to me as well).

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

  • shanghaigirl-1125377 (11/25/2009)


    I need to order 2 columns on a 10m+ row table with no indexes, and the query I have runs about 8 mins, how can I speed it up. thanks.

    If it needs an index, either add one yourself or get the DBA to do it. Just make sure the query is "pat" before you do.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • 1) you have no where clause - do you REALLY need to return all 10+M rows?

    2) can you please provide expected output? I can't for the life of me figure out why you have multiple left joins here for a simple order by need...

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • thanks for all your replies, and sorry for my late response, I was away for a few days.

    the left join query gave me the exact result that I wanted. It needs to compare 2 dates partitioned by each model, if date1 is earlier than date2 then assign 0, otherwise start ordering 1,2,3... same as date3. yes, real query has to order 10m records. I have added an index model + date1, it worked much faster, now is 1 min 20sec.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply