INSERT INTO with a tally table is slower than INSERT INTO within a WHILE loop

  • My colleague wrote the following function to generate a table with 10,000,000 records, and it works absolutely fine:

    DROP FUNCTION IF EXISTS [dbo].[get_generate_sequence_fn];
    GO
    CREATE FUNCTION [dbo].[get_generate_sequence_fn]
    (
        @sequence_start INT,
        @sequence_end INT,
        @incremental_value INT
    )
    RETURNS @t_genarated_values TABLE
    (
        value INT
    )
    AS
    BEGIN
        WHILE (@sequence_start <= @sequence_end)
        BEGIN
            INSERT INTO @t_genarated_values
            SELECT @sequence_start;

            SET @sequence_start += @incremental_value;
        END;

        RETURN;
    END;
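
    For reference, the function is called like this to produce the full 10,000,000-row sequence (the same call is used in the timing tests later in the thread):

    SELECT * FROM [dbo].[get_generate_sequence_fn](1, 10000000, 1);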

    I suggested changing it to use a tally table, assuming it would be significantly faster, but to my surprise the execution plan for 10M records shows it as actually slower. Can anyone explain why, please? Apologies, I cannot paste screenshots of the execution plans for the two versions.

    DROP FUNCTION IF EXISTS [dbo].[get_generate_sequence_tally_fn];
    GO
    CREATE FUNCTION [dbo].[get_generate_sequence_tally_fn]
    (
        @i_sequence_start INT,
        @i_sequence_end INT,
        @i_inc_val INT
    )
    RETURNS @t_genarated_values TABLE
    (
        value INT
    )
    AS
    BEGIN
        WITH cte_tally (n) AS
        (
            SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
            FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a(n)        -- 20 rows
            CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b(n)  -- 400 rows
            CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c(n)  -- 8,000 rows
            CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d(n)  -- 160,000 rows
            CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e(n)  -- 3,200,000 rows
            CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) f(n)  -- 64,000,000 rows
        )
        INSERT INTO @t_genarated_values
        SELECT @i_sequence_start + @i_inc_val * (n - 1)
        FROM cte_tally
        WHERE n <= CEILING((@i_sequence_end - @i_sequence_start + 1) * 1.0 / @i_inc_val); -- * 1.0 so CEILING sees a decimal rather than an already-truncated integer division

        RETURN;
    END;
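
    For sizing reference, each VALUES list supplies 20 rows, so the six cross joins can produce up to 20^6 rows - comfortably more than the 10,000,000 needed. A minimal check of the arithmetic:

    SELECT 20 * 20 * 20 * 20 * 20 * 20 AS max_tally_rows; -- 64,000,000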


  • Out of interest, can you try this version?

    INSERT ...
    SELECT TOP ((@i_sequence_end - @i_sequence_start + 1) / @i_inc_val)
           value = (ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1) * @i_inc_val + @i_sequence_start -- (n - 1) * increment + start, so the sequence begins at @i_sequence_start for any increment
    FROM sys.all_columns ac1
    CROSS JOIN sys.all_columns ac2;
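
    As a rough sanity check (a sketch - sys.all_columns typically holds several thousand rows even in an empty database, so its self cross join supplies tens of millions of rows):

    SELECT COUNT_BIG(*) * COUNT_BIG(*) AS cross_join_capacity
    FROM sys.all_columns;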

    Note also that you have misspelled 'generated'.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

    Thank you, Phil - the execution plan is the same. And, same as with my tally table, 94% of it goes on the INSERT INTO the @t table variable.

  • BOR15K wrote:

    Thank you, Phil - the execution plan is the same. And, same as with my tally table, 94% of it goes on the INSERT INTO the @t table variable.

    You don't need to insert into a table variable:

    DROP FUNCTION [dbo].[get_generate_sequence_tally_fn];
    GO
    CREATE FUNCTION [dbo].[get_generate_sequence_tally_fn]
    (
        @i_sequence_start INT,
        @i_sequence_end INT,
        @i_inc_val INT
    )
    RETURNS TABLE
    AS
    RETURN
    WITH cte_tally (n) AS
    (
        SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
        FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a(n)        -- 20 rows
        CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b(n)  -- 400 rows
        CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c(n)  -- 8,000 rows
        CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d(n)  -- 160,000 rows
        CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e(n)  -- 3,200,000 rows
        CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) f(n)  -- 64,000,000 rows
    )
    SELECT TOP (CEILING((@i_sequence_end - @i_sequence_start + 1) * 1.0 / @i_inc_val)) -- * 1.0 avoids integer division before CEILING
           @i_sequence_start + @i_inc_val * (n - 1) AS value
    FROM cte_tally;
    GO
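
    Because it is an inline table-valued function, you query it just like a table:

    SELECT value
    FROM [dbo].[get_generate_sequence_tally_fn](1, 10000000, 1);
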
  • Thank you, Jonathan. Almost every step in the execution plan is now marked as "No Join Predicate", and the whole function now costs 100%, with my original tally one costing 0%.

  • BOR15K wrote:

    Thank you, Jonathan. Almost every step in the execution plan is now marked as "No Join Predicate", and the whole function now costs 100%, with my original tally one costing 0%.

    So what?  How does it perform?

    The reason your original took 0% is that execution plans for a WHILE loop are based only on the first iteration of the loop. Stop using % of batch as a performance indicator; it's based on estimates, not reality.


    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    BOR15K wrote:

    Thank you, Jonathan. Almost every step in the execution plan is now marked as "No Join Predicate", and the whole function now costs 100%, with my original tally one costing 0%.

    So what?  How does it perform?

    The reason your original took 0% is that execution plans for a WHILE loop are based only on the first iteration of the loop. Stop using % of batch as a performance indicator; it's based on estimates, not reality.

    The original WHILE loop is appalling compared to a tally with no insert into a table variable. Here is a test over 10 million rows: the WHILE loop vs. the tally with no inserts:

    SET STATISTICS IO, TIME ON;
    GO
    SELECT *
    INTO #t1
    FROM [dbo].[get_generate_sequence_fn](1, 10000000, 1);
    GO
    SELECT *
    INTO #t2
    FROM [dbo].[get_generate_sequence_tally_fn](1, 10000000, 1);
    SQL Server parse and compile time: 
    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server Execution Times:
    CPU time = 0 ms, elapsed time = 0 ms.
    SQL Server parse and compile time:
    CPU time = 0 ms, elapsed time = 0 ms.
    Table '#B21ED994'. Scan count 1, logical reads 16892, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    SQL Server Execution Times:
    CPU time = 406485 ms, elapsed time = 848383 ms.

    (10000000 rows affected)
    SQL Server parse and compile time:
    CPU time = 0 ms, elapsed time = 5 ms.

    SQL Server Execution Times:
    CPU time = 3997 ms, elapsed time = 1671 ms.

    (10000000 rows affected)

    So it is over 500 times faster using a tally table with no inserts: 1.7 seconds compared to 14 minutes.

  • Times on an i7 6-core laptop.

    According to the execution plan, the query costs (relative to the batch) are 0%, 0% and 100% respectively - so plan cost percentages are DEFINITELY NOT something to rely on.

    -- with WHILE loop - [get_generate_sequence_fn]

    Table '#BF78D196'. Scan count 1, logical reads 16892, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 1215937 ms, elapsed time = 1273212 ms.

    (10000000 rows affected)

    -- with @table insert - [get_generate_sequence_tally_fn]

    Table '#BC9C64EB'. Scan count 1, logical reads 16078, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 34281 ms, elapsed time = 34503 ms.

    (10000000 rows affected)

    -- ITVF - [get_generate_sequence_tally_fn_itvf]

    SQL Server Execution Times:

    CPU time = 10671 ms, elapsed time = 8476 ms.

    (10000000 rows affected)

  • Thank you all for the valuable replies, but I think this discussion has diverged from my original question: why did my tally table function have a worse execution plan than my colleague's simple WHILE loop? Jonathan above has already advised that this is because the execution plan is based on only one iteration of the WHILE loop. Is this documented somewhere, so that I can share it with our team, please?

    And yes, my tally works MUCH faster than the loop.


  • BOR15K wrote:

    Thank you all for the valuable replies, but I think this discussion has diverged from my original question: why did my tally table function have a worse execution plan than my colleague's simple WHILE loop? Jonathan above has already advised that this is because the execution plan is based on only one iteration of the WHILE loop. Is this documented somewhere, so that I can share it with our team, please?

    And yes, my tally works MUCH faster than the loop.

    Just to make a minor correction, I was the one who talked about the single iteration. Although I've not done an extensive search for it, I don't believe there's any Microsoft documentation on that subject.

    @Grant Fritchey has written several books on the subject.  He might know where MS documented it (if it is documented) but it doesn't seem like something that MS would document because I don't believe they actually know either.

    As a bit of a sidebar, does anyone know where the documentation for "Query Cost (relative to the batch)" in execution plans is? I've done a couple of quick searches and can't find even that.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    BOR15K wrote:

    Thank you all for the valuable replies, but I think this discussion has diverged from my original question: why did my tally table function have a worse execution plan than my colleague's simple WHILE loop? Jonathan above has already advised that this is because the execution plan is based on only one iteration of the WHILE loop. Is this documented somewhere, so that I can share it with our team, please?

    And yes, my tally works MUCH faster than the loop.

    Just to make a minor correction, I was the one who talked about the single iteration. Although I've not done an extensive search for it, I don't believe there's any Microsoft documentation on that subject.

    @Grant Fritchey has written several books on the subject.  He might know where MS documented it (if it is documented) but it doesn't seem like something that MS would document because I don't believe they actually know either.

    As a bit of a sidebar, does anyone know where the documentation for "Query Cost (relative to the batch)" in execution plans is? I've done a couple of quick searches and can't find even that.


    Thank you, Jeff.


    I could only find his reply to you 🙂

    https://www.sqlservercentral.com/forums/topic/query-cost-relative-to-the-batch-vs-statistics-io-vs-extended-events
