June 26, 2023 at 6:23 am
Do we get any performance benefit by moving the TOP clause up into the Tally cte.
My gut says it is better to reduce the actual number of items as soon as possible.
However, a quick test seems to disagree
SET STATISTICS TIME, IO ON;
SELECT * FROM dbo.GENERATE_SERIES(3, 17000000, 5) AS gs
SET STATISTICS TIME, IO OFF;
LATE Filtering ( as written)
(3400000 rows affected)
SQL Server Execution Times:
CPU time = 809 ms, elapsed time = 7776 ms.
EARLY Filtering (move TOP up into Tally cte)
(3400000 rows affected)
SQL Server Execution Times:
CPU time = 1186 ms, elapsed time = 8001 ms.
June 26, 2023 at 6:29 am
However, doing a COUNT(*) so that we remove the time to push to display, seems to prefer the early filtering for CPU usage. Elapsed time still seems to prefer the late filtering.
SET STATISTICS TIME, IO ON;
SELECT COUNT(*) FROM dbo.GENERATE_SERIES(3, 17000000, 5) AS gs
SET STATISTICS TIME, IO OFF;
LATE Filtering ( as written)
SQL Server Execution Times:
CPU time = 282 ms, elapsed time = 72 ms.
EARLY Filtering (move TOP up into Tally cte)
SQL Server Execution Times:
CPU time = 249 ms, elapsed time = 85 ms.
June 26, 2023 at 12:49 pm
I have developed two additional versions of the function and incorporated Microsoft's GENERATE_SEQUENCE in a test script.
The code for dbo.GENERATE_SEQUENCE
is provided in the script article. It is written as follows:
, Tally AS (
SELECT N = ROW_NUMBER() OVER (ORDER BY N)
FROM H8
)
, StepTally AS (
SELECT value = @Start + (@Step * (N - 1))
FROM Tally
)
SELECT TOP(ABS(@Stop - @Start) / ABS(@Step) + 1) value
FROM StepTally
WHERE @Step <> 0
The code for dbo.GENERATE_SEQUENCE2
is as follows:
, Tally AS (
SELECT TOP(ABS(@Stop - @Start) / ABS(@Step) + 1) N = ROW_NUMBER() OVER (ORDER BY N)
FROM H8
)
, StepTally AS (
SELECT value = @Start + (@Step * (N - 1))
FROM Tally
)
SELECT value
FROM StepTally
WHERE @Step <> 0
The code for dbo.GENERATE_SEQUENCE3
is presented below:
SELECT TOP(ABS(@Stop - @Start) / ABS(@Step) + 1)
@Start + (@Step * (ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1)) value
FROM H8
WHERE @Step <> 0
Here is the test script:
SET NOCOUNT ON;
SET STATISTICS TIME, IO OFF;
GO
DROP TABLE IF EXISTS #Results
CREATE TABLE #Results(ID int identity(1,1), Method varchar(30), TimeMS int)
GO
DECLARE @X int
DECLARE @StartTime datetime2(7) = SYSDATETIME();
DECLARE @Max int = 1700000
SELECT @X=value
FROM dbo.GENERATE_SERIES(3, @Max, 5)
DECLARE @TimeMS int = DATEDIFF_BIG(mcs, @StartTime, SYSDATETIME())
INSERT INTO #Results (Method, TimeMS) VALUES ('dbo.GENERATE_SERIES:', @TimeMS)
PRINT CONCAT('GENERATE_SERIES:', @TimeMS);
SET @StartTime = SYSDATETIME();
SELECT @X=value
FROM dbo.GENERATE_SERIES2(3, @Max, 5)
SET @TimeMS = DATEDIFF_BIG(mcs, @StartTime, SYSDATETIME())
INSERT INTO #Results (Method, TimeMS) VALUES ('dbo.GENERATE_SERIES2:', @TimeMS)
PRINT CONCAT('GENERATE_SERIES2:', @TimeMS);
SET @StartTime = SYSDATETIME();
SELECT @X=value
FROM dbo.GENERATE_SERIES3(3, @Max, 5)
SET @TimeMS = DATEDIFF_BIG(mcs, @StartTime, SYSDATETIME())
INSERT INTO #Results (Method, TimeMS) VALUES ('dbo.GENERATE_SERIES3:', @TimeMS)
PRINT CONCAT('GENERATE_SERIES3:', @TimeMS);
SET @StartTime = SYSDATETIME();
SELECT @X=value
FROM GENERATE_SERIES(3, @Max, 5)
SET @TimeMS = DATEDIFF_BIG(mcs, @StartTime, SYSDATETIME())
INSERT INTO #Results(Method, TimeMS) VALUES ('(MS) GENERATE_SERIES:', @TimeMS)
PRINT CONCAT('MSGENERATE_SERIES:', @TimeMS);
GO 100
SELECT Method, AVG(TimeMS) [Average mcs]
FROM #Results
GROUP BY Method
ORDER BY 1;
The test script, which I have provided, executes each Table-valued Function (TVF) 100 times and logs each result to #Results. Subsequently, it presents the average execution time.
The results of two different sets are provided below. All results show comparable performance, within the expected variation for running SQL statements.
June 26, 2023 at 6:25 pm
Comments posted to this topic are about the item User-Created GENERATE_SERIES() Function for Older SQL Server Versions
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply