SELECT certain number of rows

  • If I have a select statement and I want the first 10 records, I can do

    SELECT TOP 10 a, b, c
    FROM tablea
    ORDER BY a

    But if I want the 21st to 30th records, how do I do it? I can think of two ways: the first uses ROW_NUMBER() OVER, and the other inserts into a table with an identity column.

    CREATE TABLE #temp(tempid INT IDENTITY(1,1),
                       a VARCHAR(100),  -- give VARCHAR a length; bare VARCHAR defaults to 1 character
                       b VARCHAR(100),
                       c VARCHAR(100))

    INSERT INTO #temp (a, b, c)
    SELECT a, b, c
    FROM tablea
    ORDER BY a

    Then

    SELECT *
    FROM #temp
    WHERE tempid BETWEEN 21 AND 30

    Is there any other way?

    Thanks

  • It sounds like a paging problem.

    One, of many, solutions that Itzik Ben-Gan offers in TSQL Querying:

    -- sets up data this way:
    SELECT ROW_NUMBER() OVER(ORDER BY qty, empid) AS rownum,
           empid, mgrid, qty
    INTO #SalesRN
    FROM dbo.Table;

    CREATE UNIQUE CLUSTERED INDEX idx_rn ON #SalesRN(rownum);

    DECLARE @pagesize INT, @pagenum INT;
    SET @pagesize = 5;
    SET @pagenum = 2;

    SELECT rownum, empid, mgrid, qty
    FROM #SalesRN
    WHERE rownum BETWEEN @pagesize * (@pagenum - 1) + 1
                     AND @pagesize * @pagenum
    ORDER BY rownum;

    He has a bunch of other examples; you'd have to read through the book to see them all. Do a search in the scripts here to see what kinds of paging solutions are out there. Something will work for you, I'm pretty sure.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I would go with the Row_number option.

  • Grant, that solution you posted is almost identical to the one I use :). I use a different formula to calculate the page range, but the same principles apply.

    DECLARE @pgNbr INT, @pgSize INT, @min_Rec INT, @max_Rec INT -- declarations added for completeness

    SET @pgNbr = 2
    SET @pgSize = 50
    SET @min_Rec = (@pgSize * @pgNbr) - (@pgSize - 1)
    SET @max_Rec = @pgSize * @pgNbr

    SELECT *
    FROM (
        SELECT ROW_NUMBER() OVER(ORDER BY id) AS [Row_ID], *
        FROM MyTable
    ) AS a
    WHERE a.Row_ID >= @min_Rec AND a.Row_ID <= @max_Rec

  • Grant Fritchey (3/12/2008)


    It sounds like a paging problem.

    One, of many, solutions that Itzik Ben-Gan offers in TSQL Querying:

    I wish great authors would warn people about the code they create. If 1,000 people are all hitting the server at the same time with that paging code, and the result set produced by the first query is 300 million rows, what do you suppose that's going to do to TempDB and the server?

    The method shown is alright, I suppose, but they never warn of the dangers of using it at face value.

    Besides, you don't need a bloody temp table to do this and I'm a bit surprised that such a great author would have the nads to publish such code. You sure it wasn't changed somewhere along the line?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • It was just one of the solutions he offered. It was only the second solution and it ran faster than his initial solution. He did spend a lot of time in the book talking about performance trade-offs, etc. His final set of solutions involved a bunch of code that I just didn't feel like typing out (sue me). Here's his "next page" query from late in the book, no temp tables, as you correctly point out:

    CREATE PROC dbo.usp_nextpage
        @anchor AS INT, -- key of last row in prev page
        @n AS INT = 10  -- number of rows per page
    AS
    SELECT TOP(@n) O.OrderId, O.OrderDate, O.CustomerId, O.EmployeeId
    FROM dbo.Orders AS O
    JOIN dbo.Orders AS A
      ON A.OrderId = @anchor
     AND (O.OrderDate >= A.OrderDate
          AND (O.OrderDate > A.OrderDate
               OR O.OrderId > A.OrderId))
    ORDER BY O.OrderDate, O.OrderId

    This code depends on an index on OrderId & OrderDate that includes CustomerId & EmployeeId to make it covering. I don't think that's always going to be possible. Also, you can't assume in all circumstances that a column like OrderId is available to give you ordered data and therefore a repeatable anchor for paging.
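    The covering index described above might be sketched like this; the index name is made up here, and the INCLUDE syntax assumes SQL Server 2005 or later:

    ```sql
    -- Hypothetical index for the anchor-paging query above: keyed on the
    -- two sort columns, with the remaining SELECT columns included so the
    -- query never has to touch the base table.
    CREATE INDEX IX_Orders_DateId
        ON dbo.Orders (OrderDate, OrderId)
        INCLUDE (CustomerId, EmployeeId);
    ```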


  • Heh... LOL. I guess using TOP to overcome a double triangular join will do. 😀 I'll add it to my list of paging methods to test for performance. Thanks, Grant.

    And, no, please understand that I wasn't taking a pot shot at you...

    --Jeff Moden



  • Course not. I wasn't worried about that. But I was being lazy & you caught me on that.


  • So Jeff, which solution do you think is the better one, and do you have a better one yourself?

    You are right; the problem with using ROW_NUMBER is that it slows everything down.

  • So Jeff, which solution do you think is the better one, and do you have a better one yourself?

    You are right; the problem with using ROW_NUMBER is that it slows everything down.

    This is not as absolute as you may think. I was able to process a request against a table of 1,000,000 rows in 1 second or LESS, once the execution plan was cached and the proper indexes were in place.

    Test table (compliments of Jeff 🙂 )

    --===== Create and populate test table.
         -- Column "RowNum" contains unique row numbers
         -- Column "SomeID" contains non-unique numbers
         --        (should be 1/400th of the row count to allow about 400 concatenated items per SomeID)
         -- Column "SomeCode" has a range of "AA" to "ZZ" non-unique 2 character strings
     SELECT TOP 1000000 --<<<LOOK! CHANGE THIS NUMBER TO CHANGE THE NUMBER OF ROWS!
            RowNum   = IDENTITY(INT,1,1),
            SomeID   = ABS(CHECKSUM(NEWID()))%2500+1, --<<<LOOK! CHANGE THIS NUMBER TO 1/400th THE ROW COUNT
            SomeCode = CHAR(ABS(CHECKSUM(NEWID()))%26+65)
                     + CHAR(ABS(CHECKSUM(NEWID()))%26+65)
       INTO dbo.TestData
       FROM Master.dbo.SysColumns t1,
            Master.dbo.SysColumns t2

    --===== A table is not properly formed unless a Primary Key has been assigned
     ALTER TABLE dbo.TestData
       ADD PRIMARY KEY CLUSTERED (RowNum)

    --===== Add the index to support both types of queries
     CREATE INDEX IXC_TestData_Cover1 ON dbo.TestData(SomeID,SomeCode)

    Now the paging logic

    DECLARE @pgNbr   INT,
            @min_Rec INT,
            @max_Rec INT,
            @pgSize  INT

    SET @pgNbr  = 10
    SET @pgSize = 1000
    SET @min_Rec = (@pgSize * @pgNbr) - (@pgSize - 1)
    SET @max_Rec = @pgSize * @pgNbr

    SELECT *
    FROM (
        SELECT ROW_NUMBER() OVER(ORDER BY SomeID) AS [Row_ID],
               SomeID,
               SomeCode
        FROM dbo.TestData
    ) AS a
    WHERE a.Row_ID >= @min_Rec AND a.Row_ID <= @max_Rec

    --================================================
    -- OUTPUT OF STATS
    --================================================
    /*
    (50 row(s) affected)

    Table 'TestData'. Scan count 1, logical reads 10, physical reads 0,
    read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (1 row(s) affected)

    SQL Server Execution Times:
       CPU time = 16 ms, elapsed time = 385 ms.

    SQL Server Execution Times:
       CPU time = 0 ms, elapsed time = 1 ms.
    */

    You will note there is a direct relationship between the page number you specify and the I/O: the higher the page number, the more logical reads occur. I am not saying this is the best solution all the time; I am saying it is a very efficient solution if implemented correctly.

  • Every time I hear "paging", I think "range scan"....which leads me to clustered index.

    If you do have to be doing that amount of paging, that might be the time to start thinking about whether the clustered index should be based on whatever is the "paging function".

    Also - assuming you only use one page size, and that the order is fixed somehow - you could just make the page number some kind of derived/computed column. It might not be "perfect" at all times (i.e. you might have a few extra rows or a few missing from the "standard page count" on a given page by the end of the day), but it does beat the tar out of dynamically generating row numbers or huge temp tables for 300 separate sessions....
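    A minimal sketch of that derived-column idea, assuming a fixed page size of 50 and an existing sequential key; the table, column, and index names here are hypothetical:

    ```sql
    -- Hypothetical: persist the page number as a computed column
    -- based on a sequential key, then index it for range seeks.
    ALTER TABLE dbo.MyTable
        ADD PageNum AS ((RowNum - 1) / 50) PERSISTED;  -- fixed page size of 50

    CREATE INDEX IX_MyTable_PageNum ON dbo.MyTable (PageNum);

    -- Fetching page 10 then becomes a simple index seek:
    -- SELECT * FROM dbo.MyTable WHERE PageNum = 9;
    ```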

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt,

    My query doesn't involve just one table; it joins 4 or 5 tables, so how am I going to put a clustered index on that?

  • Loner (3/13/2008)


    Matt,

    My query doesn't involve just one table; it joins 4 or 5 tables, so how am I going to put a clustered index on that?

    Well - it was a cute idea while it lasted....Didn't catch that part....ugh. That does make it quite a bit harder.

    Perhaps generate a "paging table" that "remembers the paging for the day" (meaning, with just the PKs from the other tables). That's starting to stretch it a little (it might be too much work to be worth it). The point is to make it NOT a temp table, because if you're paging something as big as was mentioned, you don't want 400 copies of very large temp tables.
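    One hedged sketch of such a paging table, with made-up table and key names; the idea is to precompute the row order once (say, nightly) and store only the keys:

    ```sql
    -- Hypothetical persistent paging table: an ordering column plus
    -- the PKs needed to join back to the real tables at request time.
    CREATE TABLE dbo.PagingCache
    (
        rownum     INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
        OrderId    INT NOT NULL,   -- PK of one joined table
        CustomerId INT NOT NULL    -- PK of another, and so on
    );

    INSERT INTO dbo.PagingCache (OrderId, CustomerId)
    SELECT O.OrderId, C.CustomerId
    FROM dbo.Orders AS O
    JOIN dbo.Customers AS C
      ON C.CustomerId = O.CustomerId
    ORDER BY O.OrderDate, O.OrderId;

    -- At request time, seek the page range and join back for the full rows:
    -- SELECT ... FROM dbo.PagingCache AS P
    -- JOIN dbo.Orders AS O ON O.OrderId = P.OrderId
    -- WHERE P.rownum BETWEEN 21 AND 30;
    ```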


  • How about an indexed view? Possibly a sorted indexed view. Yeah, I know, it might require a TOP with an ORDER BY in a view, which is normally a Bozo-No-No... but it might make for some nasty fast paging if you stuck a ROW_NUMBER in there somewhere...

    --Jeff Moden



    The indexed view might work... as long as it doesn't need to be updatable (since we're now talking about a multi-table view and all). Especially if you manage to "precalculate" the paging (and make that the leading edge of the index) like I mentioned earlier.


