August 9, 2007 at 12:39 pm
Comments posted here are about the content posted at http://www.sqlservercentral.com/columnists/jSebastian/3181.asp
.
August 24, 2007 at 10:15 pm
Hi all,
again, the source code listing does not show up well in IE. The lines are wrapped at places where they are not supposed to be.
The code shows up correctly in Mozilla.
If you are using IE, i would suggst to click on the link 'code' and read the source code from there.
.
August 28, 2007 at 11:24 pm
Jacob,
I've read seemingly a bazillion SQL Server pagination articles/tutorials, and I must say, I think that yours is head and shoulders above the rest. Very clearly worded, and more specifically, extremely well commented code. Thank you!
August 29, 2007 at 12:37 am
Good article, although every server based pagination sample I read seems to skip the fact that in almost every situation the "user" wants to know how many records there are in total, including any specified filters. This allows any client side paging mechanism to say "you're on page 3 of 15" etc.
Is there any way to do this without using a temp table/variable as this seems to be a much less elegant solution than the ROW_NUMBER paging shown here ?
August 29, 2007 at 12:48 am
Hi noggin,
I had found 3 points that I must add to the above article after playing with a few real-life assignments.
I have a follow-up article coming up, which shows some sample code which may be helpful to clear those 3 points.
thanks
Jacob
.
August 29, 2007 at 2:30 am
Jacob,
I read your article what I found very useful.
Now I have a question related to joins. Please consider this two examples:
WITH EmployeesOrders AS (
SELECT
Employees.EmployeeID,
Employees.FirstName,
Employees.LastName,
Orders.OrderDate,
Orders.ShippedDate,
ROW_NUMBER()OVER (ORDER BY LastName) AS RecNum
FROM Orders INNER JOIN Employees ON Orders.EmployeeID = Employees.EmployeeID)
SELECT * FROM EmployeesOrders
WHERE RecNum BETWEEN ((@PageNumber - 1) * @PageSize) + 1 AND @PageNumber * @PageSize
ORDER BY RecNum
and
WITH EmployeesOrders AS (
SELECT
Employees.EmployeeID,
ROW_NUMBER()OVER (ORDER BY LastName) AS RecNum
FROM Employees)
SELECT
Employees.EmployeeID,
Employees.FirstName,
Employees.LastName,
Orders.OrderDate,
Orders.ShippedDate
FROM EmployeesOrders
INNER JOIN Employees ON EmployeesOrders.EmployeeID=Employees.EmployeeID
INNER JOIN Orders ON Orders.EmployeeID = Employees.EmployeeID
WHERE RecNum BETWEEN ((@PageNumber - 1) * @PageSize) + 1 AND @PageNumber * @PageSize
ORDER BY RecNum
In the first query all the join and select stuff is done in the CTE (joining maybe 100000 records from a foreign table)
but in the second query the CTE returns only the ID needed to perform the selecting and the joining.
In this case the joins will have to deal only with the number of records given from the @Pagesize (maybe 15 or 20) and I think that this would be much more efficient in a 100000 records scenario.
Having to implement this in my current project which option should I choose ?
Many thanks
Franco Pizzinini
August 29, 2007 at 2:53 am
I found this a really useful article since I hadn't played much with CTEs or ROW_NUMBER() before. Both of those concepts have suddenly 'clicked' and I really see the benefits of your approach. Just a few comments though, and apologies for being a bit picky:
Firstly, when you're applying the LastName, Title and City filters your LIKE clause starts with a wildcard - that's probably fine for small tables, but in large tables that's going to cause a table scan which will kill performance. If that's the application requirement then so be it, but for an example like yours I think you'd be better off removing the leading % so as not to encourage this practice. The whole idea of paging through result sets is to enable efficient IO and network bandwidth usage, particularly over large tables.
Secondly, several paging solutions I have seen do not scale at all well as the number or records in a table increases. Do you have any idea how well SQL Server optimizes the ROW_NUMBER function and how well your solution scales over big tables? I'll do a little investigation of my own when I get a chance, I've got some tables with several million records in which I'd be interested in doing some tests on.
August 29, 2007 at 4:11 am
Hi
I quite like the exposition of the rownumber and over stuff. Interesting.
What I would ask is whether the performance issues relating to the and(x is null or x = y) that were experienced in SQL2000 (ie knocking it down to a scan) have been resolved on 2005?
Cheers
August 29, 2007 at 4:42 am
Cheers Jacob, I look forward to the follow-up.
August 29, 2007 at 4:46 am
I think that you can limit the initial rows you get in the CTE if you order the query and use the TOP clause
;WITH emp AS (SELECT TOP (@PageNumber * @PageSize)
/* retrieve only a limited result set
if you need records 31 -> 60, you retrieve in the first step only 60 rows
*/
CASE
WHEN @SortOrder = 'Title' THEN ROW_NUMBER()OVER (ORDER BY Title)
WHEN @SortOrder = 'HireDate' THEN ROW_NUMBER()OVER (ORDER BY HireDate)
WHEN @SortOrder = 'City' THEN ROW_NUMBER()OVER (ORDER BY City)
-- In all other cases, assume that @SortOrder = 'LastName'
ELSE ROW_NUMBER()OVER (ORDER BY LastName)
END AS RecID,
LastName,
FirstName,
Title,
HireDate,
City,
Country,
PostalCode
FROM employees
WHERE
(@LastName IS NULL OR LastName LIKE '%' + @LastName + '%')
AND
(@Title IS NULL OR Title LIKE '%' + @Title + '%')
AND
(@City IS NULL OR City LIKE '%' + @City + '%')
/*specify order */
ORDER BY
CASE
WHEN @SortOrder = 'Title' THEN Title
WHEN @SortOrder = 'HireDate' THEN HireDate
WHEN @SortOrder = 'City' THEN City
ELSE LastName
END
)
SELECT
RecID, -- This column is only for debugging/testing purpose.
LastName,
Title,
HireDate,
City
FROM emp
/*
you need only the last @PageSize rows from your result set
*/
WHERE RecID > (@PageNumber - 1) * @PageSize)
ORDER BY RecID
August 29, 2007 at 5:26 am
The leading wild card could be expensive as noted above. I'd also worry about users actually paging through the data causing a possibly expensive query to be run over and over. I've seen solutions that optimized for the first page only as it's obviously always used, and solutions that pushed primary keys, rownumber, and session id into a permanent table to make second and later pages faster to retrieve. All tradeoffs of course.
August 29, 2007 at 6:26 am
I thought this was an excellent article with a simple solution to a complex problem. There are definitely possibilities for performance imporvement, but the solution supplied to the problem was easy to understand and I could implement it immediately if needed. Many times articles are not as clear and concise as this one.
As far as performance enhancements, well, if there weren't any suggestions made by folks visiting this site, the site wouldn't be as good a resource as it is.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
August 29, 2007 at 6:53 am
Franco,
I was just looking at a similar trade-off. I found that for my needs, the answer depends (big surprise ).
It depends on what columns you are using to execute your row_number() over and what indexes you have.
If you are doing row_number() over the column(s) that your table is clustered on, then you should retrieve all of the necessary columns from that table within the definition of the CTE. Otherwise you are grabbing the IDs in one pass and then going back to get the associated columns from those rows. At least this is the plan I saw on my machine.
However if you have a covering nonclustered index for row_number() and you're going to have to do bookmark lookups anyway, I believe the best approach is to just return the columns needed for the lookup from within the CTE and then join against it from outside the CTE.
It doesn't hurt to double check the plan on your machine, but that is the conclusion I came to.
Hope that helps!
-Adrian
August 29, 2007 at 7:01 am
I know it is just a matter of personal preference, but instead of parameters like @PageSize and @PageNumber I use @FirstItem and @NumItems. All of the array operations in c# use a beginIndex/length style so I prefer to stay similar to that in my data access layer, and leave the notion of "Pages" to the presentation layer.
August 29, 2007 at 7:07 am
Nice article. I particularly liked the showcasing of the usefullness of the new ROW_NUMBER() function in SQL Server 2005. I typically stick to developing on the SQL Server side and don't do much on the web side, but I have seen some really narly solutions for accomplishing the sort order with pagination on SQL Server 2000 which involved a ton of dynamic SQL. It would be really interesting to see what it would take to duplicate this elegant stored procedure without the use of the ROW_NUMBER() over (order by x) function.
Anyway, thanks for the article.
Dan
Viewing 15 posts - 1 through 15 (of 61 total)
You must be logged in to reply to this topic. Login to reply