In SQL server 2000, when a stored procedure needs to return
a number of records based on client’s request, we usually use SET ROWCOUNT @RequestedRecordNumber.
Though TOP operator can also limit the rows to be returned, we have to use dynamic
TSQL because it does not support expressions. SET ROWCOUNT 0 will return all
qualified rows, e.g.:
USE northwind GO DECLARE @RequestedRecordNumber int SET @RequestedRecordNumber=2 SET ROWCOUNT @RequestedRecordNumber SELECT * FROM dbo.[Order Details] WHERE OrderID=10248 ORDER BY ProductID SET ROWCOUNT 0 GO
However, when SET ROWCOUNT is applied on a TVF, it may give unexpected results. Let’s first define a table and populate some test data.
USE tempdb GO IF NOT EXISTS(SELECT 1 FROM sysobjects WHERE ID=OBJECT_ID('dbo.t_Emps')) BEGIN CREATE TABLE dbo.t_Emps (EmpID int NOT NULL, EmpName nvarchar(50) NOT NULL, ManagerID int NULL CONSTRAINT PK_t_Emps PRIMARY KEY CLUSTERED (EmpID)) END GO SET NOCOUNT ON GO INSERT dbo.t_Emps (EmpID,EmpName,ManagerID) VALUES (1,'E1', NULL) INSERT dbo.t_Emps (EmpID,EmpName,ManagerID) VALUES (2,'E2', 1) INSERT dbo.t_Emps (EmpID,EmpName,ManagerID) VALUES (3,'E3', 2) INSERT dbo.t_Emps (EmpID,EmpName,ManagerID) VALUES (4,'E4', 3) INSERT dbo.t_Emps (EmpID,EmpName,ManagerID) VALUES (5,'E5', 4) INSERT dbo.t_Emps (EmpID,EmpName,ManagerID) VALUES (6,'E6', 4) INSERT dbo.t_Emps (EmpID,EmpName,ManagerID) VALUES (7,'E7', 4) GO
The following TVF is to return all directly managed employees by a manager.
IF EXISTS(SELECT 1 FROM sysobjects WHERE ID = OBJECT_ID('dbo.fn_GetChildren')) DROP FUNCTION dbo.fn_GetChildren GO CREATE FUNCTION dbo.fn_GetChildren (@EmpID int) RETURNS @Children TABLE (EmpID int NOT NULL) AS BEGIN INSERT @Children (EmpID) SELECT EmpID FROM dbo.t_Emps WHERE @EmpID IS NOT NULL AND ManagerID=@EmpID OR @EmpID IS NULL RETURN END GO
The following query returns all the employee managed by EmpID=4:
SELECT * FROM dbo.fn_GetChildren(4) A
As expected, it returns 5,6,and 7.
Query 1: use TOP to query first two employees managed by
EmpID=4
SET ROWCOUNT 0 SELECT top 2 * FROM dbo.fn_GetChildren(4) ORDER BY EmpID DESC GO
It returns 2 rows with EmpID=7 and 6.
Query 2: use SET ROWCOUNT to query first two employees managed
by EmpID=4
SET ROWCOUNT 2 SELECT * FROM dbo. fn_GetChildren(4) ORDER BY EmpID DESC SET ROWCOUNT 0 GO
It returns 2 rows with EmpID=6 and 5.
Check the execution plan, the plan for query 1 shows that all 3 rows are returned from the TVF. However the TVF in query 2 only returns 2 rows.
Execution pan for query 1
Execution pan for query 2
We can see that though SET ROWCOUNT is not allowed in a UDF,
the UDF can inherit the SET ROWCOUNT setting from its caller and affect the
result of SELECT statements in the UDF. This causes query 2 returning
unexpected results. To verify the conclusion, let’s create another TVF, which
gets all the managers for an employee.
IF EXISTS(SELECT 1 FROM sysobjects WHERE ID=OBJECT_ID('dbo.fn_GetAllManagers')) DROP FUNCTION dbo.fn_GetAllManagers GO CREATE FUNCTION dbo.fn_GetAllManagers (@EmpID int) RETURNS @Ancestors TABLE (EmpID int NOT NULL) AS BEGIN IF @EmpID IS NULL BEGIN RETURN END DECLARE @ManagerID int SELECT @ManagerID=ManagerID FROM dbo.t_Emps WHERE EmpID=@EmpID WHILE @ManagerID IS NOT NULL BEGIN INSERT @Ancestors (EmpID) VALUES (@ManagerID) SELECT @ManagerID=ManagerID FROM dbo.t_Emps WHERE EmpID=@ManagerID END RETURN END GO
Query 3: verifications
SET ROWCOUNT 0 SELECT * FROM dbo.fn_GetAllManagers (5) B -- select with TOP SELECT TOP 2 * FROM dbo.fn_GetAllManagers (5) B ORDER BY EmpID DESC SELECT TOP 2 * FROM dbo.fn_GetAllManagers (5) B ORDER BY EmpID ASC SET ROWCOUNT 2 -- select with SET ROWCOUNT SELECT * FROM dbo.fn_GetAllManagers (5) B ORDER BY EmpID DESC SELECT * FROM dbo.fn_GetAllManagers (5) B ORDER BY EmpID ASC SET ROWCOUNT 0 GO
This time both the SELECT statements using TOP and SET
ROWCOUNT 2 return the same results. The execution plan shows that the TVFs in
the two SELECT statements all return 4 rows. This is because in the TVF fn_GetAllManagers,
all the SELECT statements return ONE row each time, which is under the limit
of SET ROWCOUNT 2. If we change the TVF as follows:
ALTER FUNCTION dbo.fn_GetAllManagers (@EmpID int) RETURNS @Ancestors TABLE (EmpID int NOT NULL) AS BEGIN IF @EmpID IS NULL BEGIN RETURN END DECLARE @ManagerID int DECLARE @Ancestors1 TABLE (LevelID int identity not null, EmpID int NOT NULL) SELECT @ManagerID=ManagerID FROM dbo.t_Emps WHERE EmpID=@EmpID WHILE @ManagerID IS NOT NULL BEGIN INSERT @Ancestors1 (EmpID) VALUES (@ManagerID) SELECT @ManagerID=ManagerID FROM dbo.t_Emps WHERE EmpID=@ManagerID END INSERT @Ancestors (EmpID) SELECT EmpID FROM @Ancestors1 ORDER BY LevelID DESC RETURN END GO
Re-run the above queries, the results of the SELECT
statements with TOP and SET ROWCOUNT will not match again.
Query 4: use sub-query
When SET ROWCOUNT ON is used on a SELECT statement with
sub-queries, correct results are always guaranteed regardless of the requested
sorting order. This is because SQL server optimizer handles ordering when
generating the execution plan. The SELECT statement is terminated when the
specified rows have been selected from a set of values that has been sorted
according to specified ORDER BY classification. E.g.:
SET ROWCOUNT 2 SELECT B.* FROM (SELECT EmpID FROM dbo.t_Emps WHERE ManagerID=4) A INNER JOIN dbo.t_Emps B ON A.EmpID=B.EMpID ORDER BY B.EmpID DESC SELECT B.* FROM (SELECT EmpID FROM dbo.t_Emps WHERE ManagerID=4) A INNER JOIN dbo.t_Emps B ON A.EmpID=B.EMpID ORDER BY B.EmpID ASC SET ROWCOUNT 0 GO
In the SELECT with descending order, SQL server scans the
clustered index (column EmpID) in reverse order (ORDERED FORWARD) and seeks the
clustered index in the same table for the same EmpID with ManagerID=4. After 2
rows are scanned (EmpID 7 and 6), 2 rows are found by the “clustered index seek”
operation and the SELECT statement terminates. The “Row count” of the two
operations in the execution plan is both 2.
Execution plan of Query 4 (ORDER BY EmpID DESC)
In the SELECT with ascending order, SQL server scans the
clustered index (column EmpID) in ascending order (ORDERED FORWARD) and seeks
the clustered index in the same table for the same EmpID with ManagerID=4. For
the first 4 (EmpID=1,2,3, and 4) rows scanned, it cannot find any records when
seeking the table with condition ManagerID=4. After 2 more rows are scanned
(EmpID 5 and 6), 2 rows are found by the “clustered index seek” and the SELECT
statements terminates. So the “Row count” for the “clustered index scan”
operation is 6 and it’s 2 for the “clustered index seek” operation.
Execution plan of Query 4 (ORDER BY EmpID ASC)
Conclusion
- Though SET ROWCOUNT n cannot be used in a UDF, the
current ROWCOUNT limit setting of its caller will be applied to the SELECT
statements in the UDF. This is not mentioned by SQL server BOL. The
behavior may lead to unexpected results when querying data from TVFs with
ORDER BY clause.
- When SET ROWCOUNT n applies to a SELECT statement
with sub-queries, the results and ordering are always guaranteed.
- To avoid confusion and unexpected logical errors, it’s
better to turn SET ROWCOUNT n on just before the final SELECT
statement that returns the records. If TVFs are used in the final SELECT
statement, call the TVFs and save the results in temporary tables before
turning the limit on. Use the temporary tables in the final query
statement instead.
- In SQL server 2005, SET ROWCOUNT n has the same behavior
as SQL server 2000.
- In SQL server 2005, TOP operator supports expressions. It’s
recommended to use TOP (n) instead of SET ROWCOUNT n.