March 11, 2009 at 10:20 am
ALTER FUNCTION [dbo].[Split_Dummy](@string varchar(max), @delimiter char(1))
RETURNS @result TABLE (ID bigint,Items varchar(max))
AS
BEGIN
WITH Pieces(ID, Start, Stop) AS
(
SELECT 1, 1, CHARINDEX(@delimiter, @string) WHERE @string IS NOT NULL AND ltrim(rtrim(@string))<>''
UNION ALL
SELECT ID + 1, Stop + 1, CHARINDEX(@delimiter, @string, Stop + 1) FROM Pieces WHERE Stop > 0
)
INSERT INTO @result (ID,Items)
SELECT ID, SUBSTRING(@string, Start, CASE WHEN Stop > 0 THEN Stop-Start ELSE 2147483647 END) AS Items
FROM Pieces OPTION (MAXRECURSION 32767);
RETURN
END
GO
there is some issie on the code, could somone help with the same. CTE is being sued to simulate splitting a string with a delimeter.
siddarth
March 11, 2009 at 2:48 pm
I have not explanation why the function causes the error message when creating this function. In my opinion it is some kind of bug. You can get around this bug by casting the value of column stop as int in the code:
ALTER FUNCTION [dbo].[Split_Dummy](@string varchar(max), @delimiter char(1))
RETURNS @result TABLE (ID bigint,Items varchar(max))
AS
BEGIN
WITH Pieces(ID, Start, Stop) AS
(
SELECT 1, 1, cast(CHARINDEX(@delimiter, @string) as int) WHERE @string IS NOT NULL AND ltrim(rtrim(@string))<>''
UNION ALL
SELECT ID + 1, Stop + 1, cast(CHARINDEX(@delimiter, @string, Stop + 1) as int) FROM Pieces WHERE Stop > 0
)
INSERT INTO @result (ID,Items)
SELECT ID, SUBSTRING(@string, Start, CASE WHEN Stop > 0 THEN Stop-Start ELSE 2147483647 END) AS Items
FROM Pieces OPTION (MAXRECURSION 32767);
RETURN
END
Adi
--------------------------------------------------------------
To know how to ask questions and increase the chances of getting asnwers:
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
March 11, 2009 at 3:26 pm
Please share with us what error you are getting, the table schema(s), and perhaps a sample of data.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 11, 2009 at 3:43 pm
Please confirm that the purpose of the function is to return a table of items from a delimited string.
For example, given the string "'A','B','C','D','E'" return the following:
1 A
2 B
3 C
4 D
5 E
March 11, 2009 at 3:51 pm
Bob Hovious (3/11/2009)
Please share with us what error you are getting, the table schema(s), and perhaps a sample of data.
The function doesn’t work with table. It works only with the input parameters that it gets, so there is no table definition or data that is relevant. If I use the original poster’s code, I get the fallowing error message:
Msg 240, Level 16, State 1, Procedure Split_Dummy, Line 5
Types don't match between the anchor and the recursive part in column "Start" of recursive query "Pieces".
According to this message the union query in the common table expression has different data types in both select statements. It complains about the third column that gets it’s value from charindex function (at both select statements). If I just use the cast function to cast the results of the charindex as int, I get no error message.
Adi
--------------------------------------------------------------
To know how to ask questions and increase the chances of getting asnwers:
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
March 11, 2009 at 3:58 pm
How about trying the following code:
create function dbo.DelimitedSplit (
@pString varchar(max),
@pDelimiter char(1)
)
returns table
as
return
with
a1 as (select 1 as N union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1),
a2 as (select
1 as N
from
a1 as a
cross join a1 as b),
a3 as (select
1 as N
from
a2 as a
cross join a2 as b),
a4 as (select
1 as N
from
a3 as a
cross join a2 as b),
Tally as (select
row_number() over (order by N) as N
from
a4),
ItemSplit(
ItemOrder,
Item
) as (
SELECT
N,
SUBSTRING(@pDelimiter + @pString + @pDelimiter,N + 1,CHARINDEX(',',@pDelimiter + @pString + @pDelimiter,N + 1) - N - 1)
FROM
Tally
WHERE
N < LEN(@pDelimiter + @pString + @pDelimiter)
AND SUBSTRING(@pDelimiter + @pString + @pDelimiter,N,1) = ',' --Notice how we find the comma
)
select
row_number() over (order by ItemOrder) as ItemID,
Item
from
ItemSplit
March 11, 2009 at 4:05 pm
Listen to Lynn. A recursive CTEs is going to run slowly.
When you DO use recursive CTE, as in any Union ALL, the number and type of columns must be the same for the first query (the anchor) and the recursive query (the one following the UNION ALL).
That's why Adi Cohn told you to cast the [Stop] column as an integer. The anchor uses an integer in that column.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 12, 2009 at 4:33 am
I am not able to complie the function because of Varchar(max) for @string parameter.
March 12, 2009 at 5:20 am
siddartha pal (3/12/2009)
I am not able to complie the function because of Varchar(max) for @string parameter.
Not sure why that would give you a problem, you are using varchar(max) in your original code. Please post the code you are trying to use and the exact error message you are getting with the code. It is possible something is just missing in translation.
March 12, 2009 at 8:21 am
there is no table definition or data that is relevant
You are absolutely correct. When asking for more information, I just asked for complete text of errors, table schemas and sample data without thinking. Sorry.
Here is the fastest parser code I've been able to find. It works with a varchar(max) input, and runs much faster than a recursive CTE. If you are unfamiliar with numbers or tally tables, search "tally" at SSC to find Jeff Moden's classic article on the subject. I prefer to build a physical table to have available, or there are fast techniques to generate one as a cte, as in Lynn's example.
-- CODE TO PARSE A STRING INTO DISCRETE ELEMENTS
DECLARE @workTable table (element varchar(max)) -- only needed if we want elements to persist beyond a single query
DECLARE @input varchar(max)
DECLARE @delimiter varchar(10)
SET @input = 'L1,L2,L3,L5,L7,L6,L8,L4'
SET @delimiter = ','
SET @input = @delimiter+@input+@delimiter
-- using tally instead of XML because it parses quicker (in my time trials anyway)
;with cte (element) as
(select substring(@input,N+1,charindex(@delimiter,@input,N+1)-(N+1)) as N
from tally
where substring(@input,N,1) = @delimiter
and N < len(@input)
)
insert into @worktable
select element from cte
select * from @worktable
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 12, 2009 at 8:26 am
Bob Hovious (3/12/2009)
there is no table definition or data that is relevant
You are absolutely correct. When asking for more information, I just asked for complete text of errors, table schemas and sample data without thinking. Sorry.
Here is the fastest parser code I've been able to find. It works with a varchar(max) input, and runs MUCH faster than a recursive CTE. If you are unfamiliar with numbers or tally tables, search "tally" at SSC to find Jeff Moden's classic article on the subject.
-- CODE TO PARSE A STRING INTO DISCRETE ELEMENTS
DECLARE @workTable table (element varchar(max)) -- only needed if we want elements to persist beyond a single query
DECLARE @input varchar(max)
DECLARE @delimiter varchar(10)
SET @input = 'L1,L2,L3,L5,L7,L6,L8,L4'
SET @delimiter = ','
SET @input = @delimiter+@input+@delimiter
-- using tally instead of XML because it parses quicker (in my time trials anyway)
;with cte (element) as
(select substring(@input,N+1,charindex(@delimiter,@input,N+1)-(N+1)) as N
from tally
where substring(@input,N,1) = @delimiter
and N < len(@input)
)
insert into @worktable
select element from cte
select * from @worktable
Looks sorta like my code but mine has the Tally table as a CTE. I'm waiting to hear back what the actual error message the OP got as his last post indicated some kind of problem with varchar(max).
March 12, 2009 at 8:31 am
That's because it *is* a variation on your code, Lynn. 😎
I just thought I'd come back with it again, because I can't find anything faster.
I notice you start out with 10 rows before you start the cross joins inside your CTE. Does that give you any advantages over Ben-Gan's binary technique?
;WITH
L0 AS (SELECT 1 AS C UNION ALL SELECT 1),--2 rows
L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows
L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows
L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows
L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B),--65536 rows
L5 AS (SELECT 1 AS C FROM L4 AS A, L4 AS B),--4294967296 rows
Tally AS (SELECT ROW_NUMBER() OVER(ORDER BY C) AS N FROM L5)
select top 100000 * from tally
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 12, 2009 at 8:41 am
Really comes down to six and half dozen the other. Personal experience on my system at home had fewer larger cross joins ran faster when I wanted millions of values. Not really sure why, so I try to get there quicker.
Could just be me and my perceptions. 😛
March 12, 2009 at 9:00 am
That's good enough reason for me. 😀
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 12, 2009 at 9:19 am
Just ran this test:
set statistics time off;
set nocount on;
declare @N int;
set statistics time on;
;WITH
L0 AS (SELECT 1 AS C UNION ALL SELECT 1), --2 rows
L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows
L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows
L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows
L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B),--65536 rows
L5 AS (SELECT top 1000000 1 AS C FROM L4 AS A, L4 AS B),--4294967296 rows
Tally AS (SELECT ROW_NUMBER() OVER(ORDER BY C) AS N FROM L5)
select @N = n from tally;
with
a1 as (select 1 as N union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1),
a2 as (select
1 as N
from
a1 as a
cross join a1 as b),
a3 as (select
1 as N
from
a2 as a
cross join a2 as b),
a4 as (select
1 as N
from
a3 as a
cross join a2 as b),
Tally as (select
row_number() over (order by N) as N
from
a4)
select @n = n from tally;
set statistics time off;
On my desktop machine, the first one, powers of 2, took 359 milliseconds CPU and 356 milliseconds total. Second one, powers of 10, took 281 CPU and 275 total.
The variable assignment gets rid of any screen output other than the time stats.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Viewing 15 posts - 1 through 15 (of 25 total)
You must be logged in to reply to this topic. Login to reply