August 26, 2010 at 2:29 am
Hi there,
I've seen most people here using CTEs, but... do they really have a limit?
WITH CTE (ctr1) AS (
SELECT 1
UNION ALL
SELECT ctr1+1
FROM CTE
WHERE ctr1<32767
)
SELECT DISTINCT ctr1 FROM CTE c1
By default, it has a limit of 100 levels of recursion.
Using OPTION (MAXRECURSION 32767) you can raise the limit to 32767, or cap it at any lower number you specify. (32768 and above is not allowed; you can try it if you want.)
WITH CTE (ctr1) AS (
SELECT 1
UNION ALL
SELECT ctr1+1
FROM CTE
WHERE ctr1<32767
)
SELECT DISTINCT ctr1 FROM CTE c1
OPTION (MAXRECURSION 32767)
Can a CTE surpass that 32767 limit?
NOTE: these are just simple examples. I know there are other ways to generate numbers; I just wanna know how to get past that 32767 limit.
Thanks in advance guys. ^__^
_____________________________________________
Quatrei Quorizawa
MABUHAY PHILIPPINES!
August 26, 2010 at 2:58 am
Quatrei.X (8/26/2010)
Can a CTE surpass that 32767 limit?
No, and there's no need for it to, either. Generally, if you've got a CTE that exceeds 100 levels of recursion, you've got a circular reference, so it's your SQL that needs correcting.
-EDIT-
:blush: Whoopsie, I'm wrong
August 26, 2010 at 3:37 am
Yes. You can surpass the limit by setting MAXRECURSION to 0.
OPTION (MAXRECURSION 0)
But when you use this option, always make sure you don't go into an infinite loop.
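For illustration, here's a minimal sketch of the same counter CTE with the server-side guard removed (the 100000 stopping point is arbitrary; the WHERE clause is now the only thing preventing an infinite loop):
-- Minimal sketch: MAXRECURSION 0 disables the 32767 ceiling entirely,
-- so the WHERE clause below is the only thing stopping the recursion.
WITH CTE (ctr1) AS (
SELECT 1
UNION ALL
SELECT ctr1+1
FROM CTE
WHERE ctr1 < 100000 -- arbitrary stopping point, well past 32767
)
SELECT ctr1 FROM CTE
OPTION (MAXRECURSION 0)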
How to post data/code on a forum to get the best help - Jeff Moden
http://www.sqlservercentral.com/articles/Best+Practices/61537/
August 26, 2010 at 3:38 am
hmmmm...
:hehe: I was planning to create a generic function which takes a comma-separated list of words in a string and uses a CTE to split those strings into table rows.
🙂 Currently my function is working fine, but if records grow, it might malfunction in the future.
😀 I know how to do this with loops or XML, but seeing CTE as a much faster option, I might make good use of it... only if it has no limit.
_____________________________________________
Quatrei Quorizawa
MABUHAY PHILIPPINES!
August 26, 2010 at 3:43 am
Kingston Dhasian (8/26/2010)
Yes. You can surpass the limit by setting MAXRECURSION to 0.
OPTION (MAXRECURSION 0)
But when you use this option, always make sure you don't go into an infinite loop.
:w00t: Thanks man!
hehehe, funny, using zero came to mind earlier but I thought it was not worth trying... guess we should really try and try until we die :hehe:
_____________________________________________
Quatrei Quorizawa
MABUHAY PHILIPPINES!
August 26, 2010 at 3:50 am
Quatrei.X (8/26/2010)
hmmmm... :hehe: I was planning to create a generic function which takes a comma-separated list of words in a string and uses a CTE to split those strings into table rows.
🙂 Currently my function is working fine, but if records grow, it might malfunction in the future.
😀 I know how to do this with loops or XML, but seeing CTE as a much faster option, I might make good use of it... only if it has no limit.
A recursive CTE is not really a faster option; it is still looping under the covers, and it will grind to a halt as more data is processed. XML would probably be the best of those options.
However, a much better way is to use a tally table and set-based logic:
http://www.sqlservercentral.com/articles/T-SQL/62867/
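As a rough illustration of the set-based split (a sketch only, assuming a dbo.Tally table of sequential integers starting at 1, as described in the article):
-- A minimal tally-table split sketch; dbo.Tally is assumed to exist with a
-- single column N of sequential integers from 1 upward (see the article above).
DECLARE @string VARCHAR(8000)
SET @string = 'red,green,blue'
SELECT SUBSTRING(',' + @string + ',', N + 1,
CHARINDEX(',', ',' + @string + ',', N + 1) - N - 1) AS Item
FROM dbo.Tally
WHERE N <= LEN(',' + @string) -- stay inside the string
AND SUBSTRING(',' + @string + ',', N, 1) = ',' -- each comma marks the start of an item
No loop, no recursion; every row is produced in a single set-based pass.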
August 26, 2010 at 3:50 am
Quatrei.X (8/26/2010)
hmmm... :hehe: I was planning to create a generic function which takes a comma-separated list of words in a string and uses a CTE to split those strings into table rows.
🙂 Currently my function is working fine, but if records grow, it might malfunction in the future.
😀 I know how to do this with loops or XML, but seeing CTE as a much faster option, I might make good use of it... only if it has no limit.
Have a look at the link below. This is one of the best delimited string split functions available
http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx#bm944589
How to post data/code on a forum to get the best help - Jeff Moden
http://www.sqlservercentral.com/articles/Best+Practices/61537/
August 27, 2010 at 1:28 am
:w00t: First of all, thanks kingston, steveb and skcadavre
I looked at the links using a tally, via both a table and a CTE.
I have two follow-up questions.
Question 1:
Use a tally table or a CTE... which is the better practice? (although I'm already thinking of CTE but still wanna hear from you guys)
Question 2:
I have seen Jeff Moden's dbo.DelimitedSplit8K function sample, but why use "multiple CTEs" (not sure what they're called)... anyway, let's look at these two examples.
Jeff Moden's CTE in his dbo.DelimitedSplit8K
WITH
E1(N) AS ( --=== Create Ten 1s
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 --10
),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000
cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E4)
SELECT N FROM cteTally
The CTE I was using to generate numbers
;WITH CTE (ctr1) AS (
SELECT 1
UNION ALL
SELECT ctr1+1 FROM CTE WHERE ctr1<10000
)
SELECT DISTINCT ctr1 FROM CTE c1
OPTION (MAXRECURSION 10000)
Both return the same rows.
I tried running both codes in one script...
The query cost relative to the batch is 84% for the first one and 16% for the second. I'm just wondering why Jeff took this approach. Hope he also reads this. (I'll just post this link there to avoid double posting.)
btw, both run in less than a second; I just wanna gain more knowledge hehehe. :hehe:
_____________________________________________
Quatrei Quorizawa
MABUHAY PHILIPPINES!
August 27, 2010 at 2:29 am
Quatrei.X (8/27/2010)
Question 1: Use a tally table or a CTE... which is the better practice? (although I'm already thinking of CTE but still wanna hear from you guys)
Jeff Moden's CTE in his dbo.DelimitedSplit8K
WITH
E1(N) AS ( --=== Create Ten 1s
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 --10
),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000
cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E4)
SELECT N FROM cteTally
The CTE I was using to generate numbers
;WITH CTE (ctr1) AS (
SELECT 1
UNION ALL
SELECT ctr1+1 FROM CTE WHERE ctr1<10000
)
SELECT DISTINCT ctr1 FROM CTE c1
OPTION (MAXRECURSION 10000)
Jeff's tally CTE beats your recursive CTE by quite a margin, Quatrei. Here are my test rig results after five trials of running both of the above queries in separate windows.
-- Jeff Moden's Tally:
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 4 ms.
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 5 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 4 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 4 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 4 ms.
-- Quatrei's CTE
SQL Server Execution Times:
CPU time = 156 ms, elapsed time = 151 ms.
SQL Server Execution Times:
CPU time = 156 ms, elapsed time = 154 ms.
SQL Server Execution Times:
CPU time = 125 ms, elapsed time = 156 ms.
SQL Server Execution Times:
CPU time = 141 ms, elapsed time = 150 ms.
SQL Server Execution Times:
CPU time = 140 ms, elapsed time = 149 ms.
Look at the CPU time the recursive CTE takes; and this is just for 10,000 rows. What if you are generating more than 1 million rows... hmmm...
Both return the same rows.
I tried running both codes in one script...
The query cost relative to the batch is 84% for the first one and 16% for the second. I'm just wondering why Jeff took this approach. Hope he also reads this. (I'll just post this link there to avoid double posting.)
The Actual Execution Plan will "also", at times, lie... There is a lot going on in the code you posted: both queries return their rows to the "Results" pane, so that work is also included in the query cost... hmmm... Hope this clears the air.
August 27, 2010 at 2:57 am
Quatrei.X (8/27/2010)
Query costs relative to batch is 84% for the first one and 16% for the second.
Those costs are estimates. There are lots of things that make the estimates wrong, one being an inaccurate row count estimate. If you look at more than just the cost estimate on the exec plans, you'll notice that the row count estimations of the CTE are completely inaccurate.
Estimated rows: 2
Actual rows: 10000
That's going to really mess up those costs and, as a result, the estimates.
I need to do more reading and investigation but, from what I remember, a recursive CTE is typically costed very badly by the optimiser.
I'm just wondering why Jeff did this approach.
Because it's incredibly fast. Also that approach is widely used. If I recall, I've seen Itzik Ben-Gan use it, so it's not just Jeff.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 27, 2010 at 9:53 am
Hi there,
Here's another method for splitting comma-separated values using a recursive CTE, without a tally table. When the string contains more than 8000 characters, this performs quite a bit faster (otherwise the DelimitedSplit8K function is better).. :-)
CREATE FUNCTION dbo.Split (@string VARCHAR(MAX), @delimiter CHAR(1))
RETURNS TABLE
AS
RETURN (
WITH Temp(id, position1, position2)
AS (
SELECT 1, CAST(1 AS INT), CAST(CHARINDEX(@delimiter, @string) AS INT)
UNION ALL
SELECT id + 1, CAST(position2 AS INT) + 1, CAST(CHARINDEX(@delimiter, @string, position2 + 1) AS INT)
FROM Temp
WHERE position2 > 0
)
SELECT id, SUBSTRING(@string, position1, CASE WHEN position2 = 0 THEN LEN(@string) + 1 - position1 ELSE position2 - position1 END) AS word
FROM Temp
)
GO
DECLARE @string VARCHAR(MAX)
SELECT @string = STUFF((SELECT ',' + CAST(ABS(CHECKSUM(NEWID())%10000) AS VARCHAR) AS [text()]
FROM sys.all_columns a--, sys.all_columns b
FOR XML PATH('')),1,1,'')
SELECT LEN(@string)
SET STATISTICS TIME ON
SELECT * FROM dbo.Split(@string,',') OPTION(MAXRECURSION 0)
SET STATISTICS TIME OFF
DROP FUNCTION dbo.Split
I tried modifying the DelimitedSplit8K function to accept VARCHAR(MAX) (although it's already noted there that VARCHAR(MAX) would affect its performance) and increasing the tally table (just for testing :-D).. The rows generated are still incomplete because I was not able to generate enough numbers for the tally table, but the Split function is 2 seconds faster.. :w00t:
I have also seen Oleg's procedure in another post using XML and included it in this test.. It's cool and it's faster! Thanks for your procedure! 🙂
Overall, this is the execution summary on my machine (character length generated = 1193011; expected rows = 244036):
--DelimitedSplit8K function
(204542 row(s) affected)
SQL Server Execution Times:
CPU time = 12890 ms, elapsed time = 13087 ms.
--Split function
(244036 row(s) affected)
SQL Server Execution Times:
CPU time = 9984 ms, elapsed time = 10575 ms.
--Oleg's usp_DelimitedSplit stored procedure
SQL Server parse and compile time:
CPU time = 54 ms, elapsed time = 54 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 34 ms.
SQL Server Execution Times:
CPU time = 3891 ms, elapsed time = 3881 ms.
(244036 row(s) affected)
SQL Server Execution Times:
CPU time = 2953 ms, elapsed time = 3170 ms.
SQL Server Execution Times:
CPU time = 6938 ms, elapsed time = 7145 ms.
When I tried to create an inline function from Oleg's stored procedure, the query became slower.. I used derived tables and CROSS APPLY, though.. Did I miss something?
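For reference, here's a rough sketch of the kind of inline XML splitter I mean (the object name is made up for illustration, and it assumes the input contains no characters that are special in XML, such as &, <, or >):
-- A hypothetical inline XML splitter, for illustration only; it will fail if
-- @string contains XML-special characters (& < >) unless they are entitized first.
CREATE FUNCTION dbo.SplitXml (@string VARCHAR(MAX), @delimiter CHAR(1))
RETURNS TABLE
AS
RETURN (
SELECT x.item.value('(./text())[1]', 'VARCHAR(MAX)') AS word
FROM (SELECT CAST('<i>' + REPLACE(@string, @delimiter, '</i><i>') + '</i>' AS XML) AS doc) d
CROSS APPLY d.doc.nodes('/i') AS x(item)
)
GO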
"Actual Execution Plan will "also" , at times, lie".. nice one ColdCoffee! Execution time is the best metric to measure performance..
I hope this helps! 😀
August 27, 2010 at 10:20 am
KurtdTajsadz (8/27/2010)
hi, i am Ronald Powel... thanks all my friends
Reported as spam.
August 27, 2010 at 6:29 pm
shield_21 (8/27/2010)
Hi there, here's another method for splitting comma-separated values using a recursive CTE, without a tally table. When the string contains more than 8000 characters, this performs quite a bit faster (otherwise the DelimitedSplit8K function is better).. :-)
CREATE FUNCTION dbo.Split (@string VARCHAR(MAX), @delimiter CHAR(1))
RETURNS TABLE
AS
RETURN (
WITH Temp(id, position1, position2)
AS (
SELECT 1, CAST(1 AS INT), CAST(CHARINDEX(@delimiter, @string) AS INT)
UNION ALL
SELECT id + 1, CAST(position2 AS INT) + 1, CAST(CHARINDEX(@delimiter, @string, position2 + 1) AS INT)
FROM Temp
WHERE position2 > 0
)
SELECT id, SUBSTRING(@string, position1, CASE WHEN position2 = 0 THEN LEN(@string) + 1 - position1 ELSE position2 - position1 END) AS word
FROM Temp
)
GO
DECLARE @string VARCHAR(MAX)
SELECT @string = STUFF((SELECT ',' + CAST(ABS(CHECKSUM(NEWID())%10000) AS VARCHAR) AS [text()]
FROM sys.all_columns a--, sys.all_columns b
FOR XML PATH('')),1,1,'')
SELECT LEN(@string)
SET STATISTICS TIME ON
SELECT * FROM dbo.Split(@string,',') OPTION(MAXRECURSION 0)
SET STATISTICS TIME OFF
DROP FUNCTION dbo.Split
I tried modifying the DelimitedSplit8K function to accept VARCHAR(MAX) (although it's already noted there that VARCHAR(MAX) would affect its performance) and increasing the tally table (just for testing :-D).. The rows generated are still incomplete because I was not able to generate enough numbers for the tally table, but the Split function is 2 seconds faster.. :w00t:
I have also seen Oleg's procedure in another post using XML and included it in this test.. It's cool and it's faster! Thanks for your procedure! 🙂
Overall, this is the execution summary on my machine (character length generated = 1193011; expected rows = 244036):
--DelimitedSplit8K function
(204542 row(s) affected)
SQL Server Execution Times:
CPU time = 12890 ms, elapsed time = 13087 ms.
--Split function
(244036 row(s) affected)
SQL Server Execution Times:
CPU time = 9984 ms, elapsed time = 10575 ms.
--Oleg's usp_DelimitedSplit stored procedure
SQL Server parse and compile time:
CPU time = 54 ms, elapsed time = 54 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 34 ms.
SQL Server Execution Times:
CPU time = 3891 ms, elapsed time = 3881 ms.
(244036 row(s) affected)
SQL Server Execution Times:
CPU time = 2953 ms, elapsed time = 3170 ms.
SQL Server Execution Times:
CPU time = 6938 ms, elapsed time = 7145 ms.
When I tried to create an inline function from Oleg's stored procedure, the query became slower.. I used derived tables and CROSS APPLY, though.. Did I miss something?
"The Actual Execution Plan will also, at times, lie".. nice one, ColdCoffee! Execution time is the best metric for measuring performance..
I hope this helps! 😀
I'm testing Oleg's code method. The place where most XML methods fall apart is when there are multiple rows. They do work fast as all get-out on a single row, though.
Speaking of testing, if you're going to make claims of speed testing, would you mind posting the code you used to generate the test data and your full test harness so we can verify your findings? Thanks.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 27, 2010 at 6:33 pm
GilaMonster (8/27/2010)
Quatrei.X (8/27/2010)
The query cost relative to the batch is 84% for the first one and 16% for the second.
Those costs are estimates. There are lots of things that make the estimates wrong, one being an inaccurate row count estimate. If you look at more than just the cost estimate on the exec plans, you'll notice that the row count estimations of the CTE are completely inaccurate.
Estimated rows: 2
Actual rows: 10000
That's going to really mess up those costs and, as a result, the estimates.
I need to do more reading and investigation but, from what I remember, a recursive CTE is typically costed very badly by the optimiser.
I'm just wondering why Jeff did this approach.
Because it's incredibly fast. Also that approach is widely used. If I recall, I've seen Itzik Ben-Gan use it, so it's not just Jeff.
Gail is correct. If you look at the comments in the header, you'll see that the code is the result of a whole lot of folks' input (including Ben-Gan's) over a whole lot of time.
Gail is also spot on about recursive CTEs. The only thing that gets measured for the estimated and actual execution plans in a recursive CTE is the first iteration. That's what makes it look so fast: the plan is for only one iteration.
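If you want to see the real difference for yourself, here's a minimal timing sketch (illustration only; drop in whichever two queries you're comparing):
-- Simple timing harness sketch: measure elapsed milliseconds around each query
-- instead of trusting the plan's cost estimates.
DECLARE @start DATETIME
SET @start = GETDATE()
-- ... run the tally CTE here ...
PRINT 'Tally CTE: ' + CAST(DATEDIFF(ms, @start, GETDATE()) AS VARCHAR(10)) + ' ms'
SET @start = GETDATE()
-- ... run the recursive CTE here ...
PRINT 'Recursive CTE: ' + CAST(DATEDIFF(ms, @start, GETDATE()) AS VARCHAR(10)) + ' ms'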
--Jeff Moden
Change is inevitable... Change for the better is not.
August 27, 2010 at 6:43 pm
ColdCoffee (8/27/2010)
Quatrei.X (8/27/2010)
Question 1: Use a tally table or a CTE... which is the better practice? (although I'm already thinking of CTE but still wanna hear from you guys)
Jeff Moden's CTE in his dbo.DelimitedSplit8K
WITH
E1(N) AS ( --=== Create Ten 1s
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 --10
),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000
cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E4)
SELECT N FROM cteTally
The CTE I was using to generate numbers
;WITH CTE (ctr1) AS (
SELECT 1
UNION ALL
SELECT ctr1+1 FROM CTE WHERE ctr1<10000
)
SELECT DISTINCT ctr1 FROM CTE c1
OPTION (MAXRECURSION 10000)
Jeff's tally CTE beats your recursive CTE by quite a margin, Quatrei. Here are my test rig results after five trials of running both of the above queries in separate windows.
-- Jeff Moden's Tally:
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 4 ms.
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 5 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 4 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 4 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 4 ms.
-- Quatrei's CTE
SQL Server Execution Times:
CPU time = 156 ms, elapsed time = 151 ms.
SQL Server Execution Times:
CPU time = 156 ms, elapsed time = 154 ms.
SQL Server Execution Times:
CPU time = 125 ms, elapsed time = 156 ms.
SQL Server Execution Times:
CPU time = 141 ms, elapsed time = 150 ms.
SQL Server Execution Times:
CPU time = 140 ms, elapsed time = 149 ms.
Look at the CPU time the recursive CTE takes; and this is just for 10,000 rows. What if you are generating more than 1 million rows... hmmm...
Both return the same rows.
I tried running both codes in one script...
The query cost relative to the batch is 84% for the first one and 16% for the second. I'm just wondering why Jeff took this approach. Hope he also reads this. (I'll just post this link there to avoid double posting.)
The Actual Execution Plan will "also", at times, lie... There is a lot going on in the code you posted: both queries return their rows to the "Results" pane, so that work is also included in the query cost... hmmm... Hope this clears the air.
Exactly... now take those time differences and multiply their usage by something as simple as 10,000 rows (a very small "batch", actually)... the difference between taking 15ms and 150ms per row suddenly becomes the difference between 150 seconds (2.5 minutes) and 1,500 seconds (25 minutes).
--Jeff Moden
Change is inevitable... Change for the better is not.