the use of CTEs

Question

the use of CTEs

Viewing 15 posts - 91 through 105 (of 161 total)

You must be logged in to reply to this topic. Login to reply

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 1

Oh, one more note. In between all the getting carried away with details and terminology and everyone being right and stuff, I forgot to add my two cents about the original question, which (if I remember correctly) was about over-use of CTEs.

My take on that is that any feature in T-SQL, or in fact any feature in any language, should be used where useful and not used where not. Overuse of any feature is always bad.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Sergiy SSC Guru Points: 110208 More actions · Answer 2

Luis Cazares (1/6/2016)
I'll just want some help from those who prefer subqueries over CTEs to transform one of our favorites functions.

Here you go:

SET ANSI_NULLS ON

GO

SET QUOTED_IDENTIFIER ON

GO

CREATE FUNCTION [dbo].[DelimitedSplit8K_NoCTE]

--===== Define I/O parameters

(@pString VARCHAR(8000), @pDelimiter CHAR(1))

--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!

RETURNS TABLE AS

RETURN

--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...

-- enough to cover VARCHAR(8000)

SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY cteLen.N1),

Item = SUBSTRING(@pString, cteLen.N1, cteLen.L1)

FROM(--==== Return start and length (for use in substring)

SELECT cteStart.N1,

ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,cteStart.N1),0)-cteStart.N1,8000)

FROM (--==== This returns N+1 (starting position of each "element" just once for each delimiter)

SELECT 1 UNION ALL

SELECT t.N+1 FROM [dbo].[TallyTable] t

WHERE t.N <= DATALENGTH(@pString)

AND SUBSTRING(@pString,t.N,1) = @pDelimiter

) cteStart(N1)

) cteLen(N1,L1)

GO

I kept all the names for the subqueries to make it easier to identify the parts.

_____________
Code for TallyGenerator

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 3

Sergiy (1/6/2016)
Luis Cazares (1/6/2016)
I'll just want some help from those who prefer subqueries over CTEs to transform one of our favorites functions.
Here you go:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[DelimitedSplit8K_NoCTE]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
RETURNS TABLE AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
-- enough to cover VARCHAR(8000)
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY cteLen.N1),
Item = SUBSTRING(@pString, cteLen.N1, cteLen.L1)
FROM(--==== Return start and length (for use in substring)
SELECT cteStart.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,cteStart.N1),0)-cteStart.N1,8000)
FROM (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM [dbo].[TallyTable] t
WHERE t.N <= DATALENGTH(@pString)
AND SUBSTRING(@pString,t.N,1) = @pDelimiter
) cteStart(N1)
) cteLen(N1,L1)
GO
I kept all the names for the subqueries to make it easier to identify the parts.

Unfortunately you failed to recreate the function as originally coded. You are relying on the existence of a tally table in the database. The original DelimitedSplit8K is not dependent on the existence of a tally table. And before you go off and say that one should exist in every database, there have been numerous individuals posting on ssc that have said that they can't create a tally table due to corporate or contractual reasons.

Sergiy SSC Guru Points: 110208 More actions · Answer 4

Lynn Pettis (1/6/2016)
Unfortunately you failed to recreate the function as originally coded. You are relying on the existence of a tally table in the database. The original DelimitedSplit8K is not dependent on the existence of a tally table. And before you go off and say that one should exist in every database, there have been numerous individuals posting on ssc that have said that they can't create a tally table due to corporate or contractual reasons.

Oh, c'mon!

Everyone knows how to create a Tally table on fly without CTE:

SELECT TOP 1000000 N = ROW_NUMBER() OVER(ORDER BY C1.id)

FROM syscolumns c1, syscolumns c2, syscolumns c3

😎

_____________
Code for TallyGenerator

Luis Cazares SSC Guru Points: 183706 More actions · Answer 5

Sergiy (1/6/2016)
Luis Cazares (1/6/2016)
I'll just want some help from those who prefer subqueries over CTEs to transform one of our favorites functions.
You know, people been splitting delimited lists for many years in SQL2000 and earlier versions.
You may find it hard to believe - but I may assure you - it's true. 🙂
I did it myself.
Some of those split functions I created back then must be running in Prod systems till now.
:w00t:
Edit: Yes, Tally was a static table back then.
I'm not sure it's a bad thing.

I'm sure people been splitting delimited lists for many years and many continue to do it without the help of tally tables. The problem is that those options (when coded only in T-SQL) and the ones you posted won't perform as well as the tested version of the DelimitedSplit8k.

The point of posting the function is to show that there are places where a CTE will be clearly more readable than a subquery. That might not be the case every time and people have preferences (such as using leading or trailing commas) which might not change.

A good developer/DBA should know when to use the best tool for the job and when two or more different tools can the best option. That same person should be able to prove that the chosen tool is actually the best, and not because someone said/wrote it.

If it wasn't clear before, my point is that the over-use is bad, but not using them is bad as well. Or basically what Hugo said:

My take on that is that any feature in T-SQL, or in fact any feature in any language, should be used where useful and not used where not. Overuse of any feature is always bad.

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 6

Sergiy (1/6/2016)
Lynn Pettis (1/6/2016)
Unfortunately you failed to recreate the function as originally coded. You are relying on the existence of a tally table in the database. The original DelimitedSplit8K is not dependent on the existence of a tally table. And before you go off and say that one should exist in every database, there have been numerous individuals posting on ssc that have said that they can't create a tally table due to corporate or contractual reasons.
Oh, c'mon!
Everyone knows how to create a Tally table on fly without CTE:
SELECT TOP 1000000 N = ROW_NUMBER() OVER(ORDER BY C1.id)
FROM syscolumns c1, syscolumns c2, syscolumns c3
😎

Really, again? Maybe you should read some of Jeff's stuff. He did some testing that shows there is a tipping point when a dynamic tally table becomes more efficient that reading tables from disk.

I'm sure you also take issue with Itzek Ben-Gan and his code as well then.

Luis Cazares SSC Guru Points: 183706 More actions · Answer 7

Sergiy (1/6/2016)
Lynn Pettis (1/6/2016)
Unfortunately you failed to recreate the function as originally coded. You are relying on the existence of a tally table in the database. The original DelimitedSplit8K is not dependent on the existence of a tally table. And before you go off and say that one should exist in every database, there have been numerous individuals posting on ssc that have said that they can't create a tally table due to corporate or contractual reasons.
Oh, c'mon!
Everyone knows how to create a Tally table on fly without CTE:
SELECT TOP 1000000 N = ROW_NUMBER() OVER(ORDER BY C1.id)
FROM syscolumns c1, syscolumns c2, syscolumns c3
😎

Congratulations, you did it! Now try it without reading from disk or the buffer.

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Alan Burstein SSC Guru Points: 61141 More actions · Answer 8

<Removed> got hit by the Reply bug

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Alan Burstein SSC Guru Points: 61141 More actions · Answer 9

Sergiy (1/6/2016)
Lynn Pettis (1/6/2016)
Unfortunately you failed to recreate the function as originally coded. You are relying on the existence of a tally table in the database. The original DelimitedSplit8K is not dependent on the existence of a tally table. And before you go off and say that one should exist in every database, there have been numerous individuals posting on ssc that have said that they can't create a tally table due to corporate or contractual reasons.
Oh, c'mon!
Everyone knows how to create a Tally table on fly without CTE:
SELECT TOP 1000000 N = ROW_NUMBER() OVER(ORDER BY C1.id)
FROM syscolumns c1, syscolumns c2, syscolumns c3
😎

Oh come on really? Did you check the query plan?

Next time consider OVER (ORDER BY (SELECT NULL)) to avoid that sort.

You can't use that tally table in a function with SCHEMABINDING either. 😉

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Sergiy SSC Guru Points: 110208 More actions · Answer 10

Complete version, including non-CTE on-fly Tally table:

DECLARE @pString VARCHAR(8000), @pDelimiter CHAR(1)

SET @pString = 'aaaa,,ccc,d,'

SET @pDelimiter = ','

SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY cteLen.N1),

Item = SUBSTRING(@pString, cteLen.N1, cteLen.L1)

FROM(--==== Return start and length (for use in substring)

SELECT cteStart.N1,

ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,cteStart.N1),0)-cteStart.N1,8000)

FROM (--==== This returns N+1 (starting position of each "element" just once for each delimiter)

SELECT 1 UNION ALL

SELECT t.N+1 FROM (

SELECT TOP (DATALENGTH(@pString))

N = ROW_NUMBER() OVER(ORDER BY C1.id)

FROM syscolumns c1, syscolumns c2, syscolumns c3) t

WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter

) cteStart(N1)

) cteLen(N1,L1)

_____________
Code for TallyGenerator

Alan Burstein SSC Guru Points: 61141 More actions · Answer 11

Sergiy (1/6/2016)
Complete version, including non-CTE on-fly Tally table:
DECLARE @pString VARCHAR(8000), @pDelimiter CHAR(1)
SET @pString = 'aaaa,,ccc,d,'
SET @pDelimiter = ','
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY cteLen.N1),
Item = SUBSTRING(@pString, cteLen.N1, cteLen.L1)
FROM(--==== Return start and length (for use in substring)
SELECT cteStart.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,cteStart.N1),0)-cteStart.N1,8000)
FROM (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM (
SELECT TOP (DATALENGTH(@pString))
N = ROW_NUMBER() OVER(ORDER BY C1.id)
FROM syscolumns c1, syscolumns c2, syscolumns c3) t
WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
) cteStart(N1)
) cteLen(N1,L1)

Did you test this before posting it? This is officially the slowest splitter on the internet.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Jason A. Long SSC-Insane Points: 23788 More actions · Answer 12

Luis Cazares (1/6/2016)
Been following this discussion, but I didn't feel like getting in it.
I believe that over use of CTEs is as bad as the lack of CTEs. I'd rather hire a developer that knows how to use CTEs than one that doesn't, just because he seems more up to date and willing to learn (They've been available for over 10 years now).
I'll just want some help from those who prefer subqueries over CTEs to transform one of our favorites functions.
CREATE FUNCTION [dbo].[DelimitedSplit8K]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS ( --==== This provides the "base" CTE and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS ( --==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
),
cteLen(N1,L1) AS( --==== Return start and length (for use in substring)
SELECT s.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
FROM cteStart s
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = SUBSTRING(@pString, l.N1, l.L1)
FROM cteLen l
;
And last, I'm sure that the ones that over use CTEs, would certainly code like this in previous versions:
SELECT *
FROM (
SELECT column1, column2, column3
FROM MyTable
) t
And yes, I've seen code like that.

I don't dislike CTEs by any stretch of the imagination... But a challenge is a challenge... So just for the fun of it...

CREATE FUNCTION dbo.DelimitedSplit8K_DTs

(

@pString VARCHAR(8000),

@pDelimiter CHAR(1)

)

RETURNS TABLE WITH SCHEMABINDING AS

RETURN

SELECT

ItemNumber = ROW_NUMBER() OVER (ORDER BY dtLen.N1),

Item = SUBSTRING(@pString, dtLen.N1, dtLen.L1)

FROM (

SELECT

dtStart.N1,

L1 = ISNULL(NULLIF(CHARINDEX(@pDelimiter, @pString, dtStart.N1),0) - dtStart.N1,8000)

FROM (

SELECT N1 = 1 UNION ALL

SELECT Tally.N + 1

FROM (

SELECT TOP (ISNULL(LEN(@pString), 0))

N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) E1 (n)

CROSS JOIN (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) E2 (n)

CROSS JOIN (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) E3 (n)

CROSS JOIN (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) E4 (n)

) Tally

WHERE

SUBSTRING(@pString, Tally.N, 1) = @pDelimiter

) dtStart

) dtLen;

GO

Sergiy SSC Guru Points: 110208 More actions · Answer 13

ALTER FUNCTION [dbo].[DelimitedSplit8K_NoCTE]

--===== Define I/O parameters

(@pString VARCHAR(8000), @pDelimiter CHAR(1))

--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!

RETURNS TABLE AS

RETURN

--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...

-- enough to cover VARCHAR(8000)

SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY cteLen.N1),

Item = SUBSTRING(@pString, cteLen.N1, cteLen.L1)

FROM(--==== Return start and length (for use in substring)

SELECT cteStart.N1,

ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,cteStart.N1),0)-cteStart.N1,8000)

FROM (--==== This returns N+1 (starting position of each "element" just once for each delimiter)

SELECT 1 UNION ALL

SELECT t.N+1

FROM (

SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

FROM (SELECT 1

FROM (SELECT 1 FROM (

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

) a (N) , (

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

) b (N)

) a (N),

(SELECT 1 FROM (

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

) a (N), (

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

) b (N)

) E4 (N)

) t (N)

WHERE t.N <= DATALENGTH(@pString)

AND SUBSTRING(@pString,t.N,1) = @pDelimiter

) cteStart(N1)

) cteLen(N1,L1)

GO

Performance is identical to CTE version.

_____________
Code for TallyGenerator

Jason A. Long SSC-Insane Points: 23788 More actions · Answer 14

Jason A. Long

SSC-Insane

Points: 23788

January 6, 2016 at 6:39 pm

#1850085

But you were 11 mins too slow... 😛

Alan Burstein SSC Guru Points: 61141 More actions · Answer 15

Eh, I decided to join in the fun and got beat to it (I realized what I posted earlier had CTEs in it)

Using delimitedsplit8K logic... here's one that's not limited to a char(1) delimiter:

USE tempdb

GO

DECLARE

@string varchar(8000) = 'xxx<br/>ryyy<br/>rzzz',

@delimiter varchar(100) = '<br/>';

SELECT Item = SUBSTRING(@string, l.N1, l.L1)

FROM

(

SELECT s.N1, L1 = ISNULL(NULLIF(CHARINDEX(@delimiter,@string,s.N1),0)-s.N1,8000)

FROM

(

SELECT N1 = 1 UNION ALL

SELECT t.N+LEN(@delimiter)

FROM

(

SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM

(

SELECT 1 FROM (VALUES

(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),

(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t(N) -- 90 values

) a(x),

(

SELECT 1 FROM (VALUES

(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),

(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t(N) -- 90 values

) b(x) -- 8100

) t(N)

WHERE SUBSTRING(@string,t.N,LEN(NULLIF(@delimiter,''))) = @delimiter

) s

) l;

GO

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001