Performace issue using paging feature with ROW_NUMBER function

  • Hello:

    I am finding an issue with paging using ROW_NUMBER function.

    I have the following query

    SELECT Row

    ,Id

    ,BN

    ,createTimestamp

    FROM (SELECT ROW_NUMBER() OVER (order by table1.createTimestamp DESC)

    AS Row

    ,table1.Id

    ,XML.value('(//BillNumber)[1]', 'varchar(10)')

    as BN

    ,table1.createTimestamp

    FROM table1 with(nolock)

    Where XML.value('(//BillNumber)[1]', 'varchar(10)') like

    '%123%') AS

    table1WithRowNumbers

    WHERE Row >= 1

    AND Row <= 500

    order by Row;

    The inner SELECT is much complicated, so I simplified for understandig purpose. Above results in 500 rows starting 1 to 500. The inner select results in some where 10000 rows and I just want to select first 500. This takes around 3 seconds.

    Now I want to go to the last page containing the last 500.

    SELECT Row

    ,Id

    ,BN

    ,createTimestamp

    FROM (SELECT ROW_NUMBER() OVER (order by table1.createTimestamp DESC)

    AS Row

    ,table1.Id

    ,XML.value('(//BillNumber)[1]', 'varchar(10)')

    as BN

    ,table1.createTimestamp

    FROM table1 with(nolock)

    Where XML.value('(//BillNumber)[1]', 'varchar(10)') like

    '%123%') AS

    table1WithRowNumbers

    WHERE Row >= 9501

    AND Row <= 10000

    order by Row;

    when I try the above query it is taing almost 18 seconds ( x6 times the first query)

    My understanding is that the pagination should work the same way and should take same time whether I choose first 500 records or last 500 records. Is there something that I am missing that would help me achieve the same.

    I have simplified my scenario as much as possible. Any suggestions are appreciated.

  • I won't try to dissect the paging stuff. There are smarter guys than I am around here who can do that. But, I will point this out:

    Where XML.value('(//BillNumber)[1]', 'varchar(10)') like

    '%123%')That's going to kill performance. You have no choice but to do scans against data with this in the WHERE clause. It'd be worth your time to pull that data out and put it into another column as you load it. That way you'll have something you can index. But, even if you do that, '%123%' will still result in a scan. If you have to use a like clause, it needs to be one that can take advantage of indexes, '123%.' If your data doesn't support that, I'd work on it because otherwise, the performance will be bad and get worse as you add more and more data to the table.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (9/30/2010)


    I won't try to dissect the paging stuff. There are smarter guys than I am around here who can do that. But, I will point this out:

    Where XML.value('(//BillNumber)[1]', 'varchar(10)') like

    '%123%')That's going to kill performance. You have no choice but to do scans against data with this in the WHERE clause. It'd be worth your time to pull that data out and put it into another column as you load it. That way you'll have something you can index. But, even if you do that, '%123%' will still result in a scan. If you have to use a like clause, it needs to be one that can take advantage of indexes, '123%.' If your data doesn't support that, I'd work on it because otherwise, the performance will be bad and get worse as you add more and more data to the table.

    Thanks Grant Fritchey:

    I understand the implications of XML.value. as you pointed out we are working on adding the new columns to populate the data. But my question still remains, having done all the scanning, why my query is behaving response time wise differently for fetching the first set of 500 rows against fetching the last 500 rows.

  • As Grant would tell you, it's hard to know for sure without looking at an execution plan. Given an index on [createTimestamp ], the optimizer may be smart enough to just get the first 500 rows that satisfy your WHERE clause testing the BillNumber. If so, the query time would get longer as you paged upwards, because more rows would have to be scanned to get to the appropriate row numbers.

    You might try this to pay the price of scanning the entire table once, and then make paging much faster.

    1) Create a #temp table to hold your output. Make sure the primary key (clustered index) is the row number you are creating.

    2) Populate it from your query above, without the WHERE clause.

    3) Select * from #temp where row between 9501 and 10000 (or 1 and 500)

    You should get an index seek of 500 rows every time. If a #temp table doesn't work for you, you can always make it a permanent "paging" table, and add a "session ID" column to make a compound key with the row

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply