January 6, 2016 at 3:20 pm
Oh, one more note. In between all the getting carried away with details and terminology and everyone being right and stuff, I forgot to add my two cents about the original question, which (if I remember correctly) was about over-use of CTEs.
My take on that is that any feature in T-SQL, or in fact any feature in any language, should be used where useful and not used where not. Overuse of any feature is always bad.
January 6, 2016 at 3:30 pm
Luis Cazares (1/6/2016)
I'll just want some help from those who prefer subqueries over CTEs to transform one of our favorites functions.
Here you go:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[DelimitedSplit8K_NoCTE]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
RETURNS TABLE AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
-- enough to cover VARCHAR(8000)
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY cteLen.N1),
Item = SUBSTRING(@pString, cteLen.N1, cteLen.L1)
FROM(--==== Return start and length (for use in substring)
SELECT cteStart.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,cteStart.N1),0)-cteStart.N1,8000)
FROM (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM [dbo].[TallyTable] t
WHERE t.N <= DATALENGTH(@pString)
AND SUBSTRING(@pString,t.N,1) = @pDelimiter
) cteStart(N1)
) cteLen(N1,L1)
GO
I kept all the names for the subqueries to make it easier to identify the parts.
_____________
Code for TallyGenerator
January 6, 2016 at 3:37 pm
Sergiy (1/6/2016)
Luis Cazares (1/6/2016)
I'll just want some help from those who prefer subqueries over CTEs to transform one of our favorites functions.Here you go:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[DelimitedSplit8K_NoCTE]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
RETURNS TABLE AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
-- enough to cover VARCHAR(8000)
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY cteLen.N1),
Item = SUBSTRING(@pString, cteLen.N1, cteLen.L1)
FROM(--==== Return start and length (for use in substring)
SELECT cteStart.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,cteStart.N1),0)-cteStart.N1,8000)
FROM (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM [dbo].[TallyTable] t
WHERE t.N <= DATALENGTH(@pString)
AND SUBSTRING(@pString,t.N,1) = @pDelimiter
) cteStart(N1)
) cteLen(N1,L1)
GO
I kept all the names for the subqueries to make it easier to identify the parts.
Unfortunately you failed to recreate the function as originally coded. You are relying on the existence of a tally table in the database. The original DelimitedSplit8K is not dependent on the existence of a tally table. And before you go off and say that one should exist in every database, there have been numerous individuals posting on ssc that have said that they can't create a tally table due to corporate or contractual reasons.
January 6, 2016 at 3:46 pm
Lynn Pettis (1/6/2016)
Unfortunately you failed to recreate the function as originally coded. You are relying on the existence of a tally table in the database. The original DelimitedSplit8K is not dependent on the existence of a tally table. And before you go off and say that one should exist in every database, there have been numerous individuals posting on ssc that have said that they can't create a tally table due to corporate or contractual reasons.
Oh, c'mon!
Everyone knows how to create a Tally table on fly without CTE:
SELECT TOP 1000000 N = ROW_NUMBER() OVER(ORDER BY C1.id)
FROM syscolumns c1, syscolumns c2, syscolumns c3
😎
_____________
Code for TallyGenerator
January 6, 2016 at 4:17 pm
Sergiy (1/6/2016)
Luis Cazares (1/6/2016)
I'll just want some help from those who prefer subqueries over CTEs to transform one of our favorites functions.You know, people been splitting delimited lists for many years in SQL2000 and earlier versions.
You may find it hard to believe - but I may assure you - it's true. 🙂
I did it myself.
Some of those split functions I created back then must be running in Prod systems till now.
:w00t:
Edit: Yes, Tally was a static table back then.
I'm not sure it's a bad thing.
I'm sure people been splitting delimited lists for many years and many continue to do it without the help of tally tables. The problem is that those options (when coded only in T-SQL) and the ones you posted won't perform as well as the tested version of the DelimitedSplit8k.
The point of posting the function is to show that there are places where a CTE will be clearly more readable than a subquery. That might not be the case every time and people have preferences (such as using leading or trailing commas) which might not change.
A good developer/DBA should know when to use the best tool for the job and when two or more different tools can the best option. That same person should be able to prove that the chosen tool is actually the best, and not because someone said/wrote it.
If it wasn't clear before, my point is that the over-use is bad, but not using them is bad as well. Or basically what Hugo said:
My take on that is that any feature in T-SQL, or in fact any feature in any language, should be used where useful and not used where not. Overuse of any feature is always bad.
January 6, 2016 at 4:23 pm
Sergiy (1/6/2016)
Lynn Pettis (1/6/2016)
Unfortunately you failed to recreate the function as originally coded. You are relying on the existence of a tally table in the database. The original DelimitedSplit8K is not dependent on the existence of a tally table. And before you go off and say that one should exist in every database, there have been numerous individuals posting on ssc that have said that they can't create a tally table due to corporate or contractual reasons.Oh, c'mon!
Everyone knows how to create a Tally table on fly without CTE:
SELECT TOP 1000000 N = ROW_NUMBER() OVER(ORDER BY C1.id)
FROM syscolumns c1, syscolumns c2, syscolumns c3
😎
Really, again? Maybe you should read some of Jeff's stuff. He did some testing that shows there is a tipping point when a dynamic tally table becomes more efficient that reading tables from disk.
I'm sure you also take issue with Itzek Ben-Gan and his code as well then.
January 6, 2016 at 4:29 pm
Sergiy (1/6/2016)
Lynn Pettis (1/6/2016)
Unfortunately you failed to recreate the function as originally coded. You are relying on the existence of a tally table in the database. The original DelimitedSplit8K is not dependent on the existence of a tally table. And before you go off and say that one should exist in every database, there have been numerous individuals posting on ssc that have said that they can't create a tally table due to corporate or contractual reasons.Oh, c'mon!
Everyone knows how to create a Tally table on fly without CTE:
SELECT TOP 1000000 N = ROW_NUMBER() OVER(ORDER BY C1.id)
FROM syscolumns c1, syscolumns c2, syscolumns c3
😎
Congratulations, you did it! Now try it without reading from disk or the buffer.
January 6, 2016 at 4:44 pm
<Removed> got hit by the Reply bug
-- Itzik Ben-Gan 2001
January 6, 2016 at 4:46 pm
Sergiy (1/6/2016)
Lynn Pettis (1/6/2016)
Unfortunately you failed to recreate the function as originally coded. You are relying on the existence of a tally table in the database. The original DelimitedSplit8K is not dependent on the existence of a tally table. And before you go off and say that one should exist in every database, there have been numerous individuals posting on ssc that have said that they can't create a tally table due to corporate or contractual reasons.Oh, c'mon!
Everyone knows how to create a Tally table on fly without CTE:
SELECT TOP 1000000 N = ROW_NUMBER() OVER(ORDER BY C1.id)
FROM syscolumns c1, syscolumns c2, syscolumns c3
😎
Oh come on really? Did you check the query plan?
Next time consider OVER (ORDER BY (SELECT NULL)) to avoid that sort.
You can't use that tally table in a function with SCHEMABINDING either. 😉
-- Itzik Ben-Gan 2001
January 6, 2016 at 5:53 pm
Complete version, including non-CTE on-fly Tally table:
DECLARE @pString VARCHAR(8000), @pDelimiter CHAR(1)
SET @pString = 'aaaa,,ccc,d,'
SET @pDelimiter = ','
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY cteLen.N1),
Item = SUBSTRING(@pString, cteLen.N1, cteLen.L1)
FROM(--==== Return start and length (for use in substring)
SELECT cteStart.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,cteStart.N1),0)-cteStart.N1,8000)
FROM (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM (
SELECT TOP (DATALENGTH(@pString))
N = ROW_NUMBER() OVER(ORDER BY C1.id)
FROM syscolumns c1, syscolumns c2, syscolumns c3) t
WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
) cteStart(N1)
) cteLen(N1,L1)
_____________
Code for TallyGenerator
January 6, 2016 at 6:08 pm
Sergiy (1/6/2016)
Complete version, including non-CTE on-fly Tally table:
DECLARE @pString VARCHAR(8000), @pDelimiter CHAR(1)
SET @pString = 'aaaa,,ccc,d,'
SET @pDelimiter = ','
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY cteLen.N1),
Item = SUBSTRING(@pString, cteLen.N1, cteLen.L1)
FROM(--==== Return start and length (for use in substring)
SELECT cteStart.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,cteStart.N1),0)-cteStart.N1,8000)
FROM (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM (
SELECT TOP (DATALENGTH(@pString))
N = ROW_NUMBER() OVER(ORDER BY C1.id)
FROM syscolumns c1, syscolumns c2, syscolumns c3) t
WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
) cteStart(N1)
) cteLen(N1,L1)
Did you test this before posting it? This is officially the slowest splitter on the internet.
-- Itzik Ben-Gan 2001
January 6, 2016 at 6:17 pm
Luis Cazares (1/6/2016)
Been following this discussion, but I didn't feel like getting in it.I believe that over use of CTEs is as bad as the lack of CTEs. I'd rather hire a developer that knows how to use CTEs than one that doesn't, just because he seems more up to date and willing to learn (They've been available for over 10 years now).
I'll just want some help from those who prefer subqueries over CTEs to transform one of our favorites functions.
CREATE FUNCTION [dbo].[DelimitedSplit8K]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS ( --==== This provides the "base" CTE and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS ( --==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
),
cteLen(N1,L1) AS( --==== Return start and length (for use in substring)
SELECT s.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
FROM cteStart s
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = SUBSTRING(@pString, l.N1, l.L1)
FROM cteLen l
;
And last, I'm sure that the ones that over use CTEs, would certainly code like this in previous versions:
SELECT *
FROM (
SELECT column1, column2, column3
FROM MyTable
) t
And yes, I've seen code like that.
I don't dislike CTEs by any stretch of the imagination... But a challenge is a challenge... So just for the fun of it...
CREATE FUNCTION dbo.DelimitedSplit8K_DTs
(
@pString VARCHAR(8000),
@pDelimiter CHAR(1)
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
SELECT
ItemNumber = ROW_NUMBER() OVER (ORDER BY dtLen.N1),
Item = SUBSTRING(@pString, dtLen.N1, dtLen.L1)
FROM (
SELECT
dtStart.N1,
L1 = ISNULL(NULLIF(CHARINDEX(@pDelimiter, @pString, dtStart.N1),0) - dtStart.N1,8000)
FROM (
SELECT N1 = 1 UNION ALL
SELECT Tally.N + 1
FROM (
SELECT TOP (ISNULL(LEN(@pString), 0))
N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) E1 (n)
CROSS JOIN (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) E2 (n)
CROSS JOIN (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) E3 (n)
CROSS JOIN (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) E4 (n)
) Tally
WHERE
SUBSTRING(@pString, Tally.N, 1) = @pDelimiter
) dtStart
) dtLen;
GO
January 6, 2016 at 6:28 pm
ALTER FUNCTION [dbo].[DelimitedSplit8K_NoCTE]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
RETURNS TABLE AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
-- enough to cover VARCHAR(8000)
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY cteLen.N1),
Item = SUBSTRING(@pString, cteLen.N1, cteLen.L1)
FROM(--==== Return start and length (for use in substring)
SELECT cteStart.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,cteStart.N1),0)-cteStart.N1,8000)
FROM (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1
FROM (
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM (SELECT 1
FROM (SELECT 1 FROM (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
) a (N) , (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
) b (N)
) a (N),
(SELECT 1 FROM (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
) a (N), (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
) b (N)
) b (N)
) E4 (N)
) t (N)
WHERE t.N <= DATALENGTH(@pString)
AND SUBSTRING(@pString,t.N,1) = @pDelimiter
) cteStart(N1)
) cteLen(N1,L1)
GO
Performance is identical to CTE version.
_____________
Code for TallyGenerator
January 6, 2016 at 6:39 pm
But you were 11 mins too slow... 😛
January 6, 2016 at 6:43 pm
Eh, I decided to join in the fun and got beat to it (I realized what I posted earlier had CTEs in it)
Using delimitedsplit8K logic... here's one that's not limited to a char(1) delimiter:
USE tempdb
GO
DECLARE
@string varchar(8000) = 'xxx<br/>ryyy<br/>rzzz',
@delimiter varchar(100) = '<br/>';
SELECT Item = SUBSTRING(@string, l.N1, l.L1)
FROM
(
SELECT s.N1, L1 = ISNULL(NULLIF(CHARINDEX(@delimiter,@string,s.N1),0)-s.N1,8000)
FROM
(
SELECT N1 = 1 UNION ALL
SELECT t.N+LEN(@delimiter)
FROM
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM
(
SELECT 1 FROM (VALUES
(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t(N) -- 90 values
) a(x),
(
SELECT 1 FROM (VALUES
(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t(N) -- 90 values
) b(x) -- 8100
) t(N)
WHERE SUBSTRING(@string,t.N,LEN(NULLIF(@delimiter,''))) = @delimiter
) s
) l;
GO
-- Itzik Ben-Gan 2001
Viewing 15 posts - 91 through 105 (of 161 total)
You must be logged in to reply to this topic. Login to reply