January 16, 2013 at 11:42 am
aochss (1/16/2013)
I would not expect a record back if the string being sent in is NULL, a single space or all spaces. In my case, I have records where the string being split is often blank and I would want no records coming back from the CROSS APPLY. I could filter them out using in the WHERE clause. I fixed this by adding LTRIM and RTRIM to the source string in the function.Also, some of my "not so good" data also had multiple spaces in between the values causing numerous records returned that were blank.
For example:
dbo.DelimitedSplit8k('123 467 789', ' ') -- There are two spaces between 467 and 789
ItemNumber Item
1 123
2 467
3 <-- Empty String
4 789
I modified the function to give an option for removing extra spaces after the delimiter (like '123, 345, 567) and then as you did to your version to check for null/blank input as well as null/blank rows in the output. To do that I had to change the function into a multi-statement tvf.
Due to the additional checking for nulls/blanks and removing blank rows from the returned table I figured efficiency would be seriously compromised. So I tested this revision against the original and it seriously under performs. In the results below the first run of DelimitedSplit8k_NULLS sets the parameter "@bRemoveSpacesAfterDelimiter" to 0 and the 2nd run set it to 1.
I can see where nulls/blanks could be an issue, but I think it would be better to filter out such rows AFTER applying any CROSS APPLYs using the original function as opposed to trying to do the filtering within the function (though I didn't specifically test that). Maybe someone else can come up with a better way of getting rid of nulls/blanks without compromising the efficiency of DelimitedSplit8K?
Test Results:
RowNumSplitterNameNumberOfRowsNumberOfElementsMinElementLengthMaxElementLengthDurationMinLengthAvgLengthMaxLength
1DelimitedSplit8K100011100.013001510
2DelimitedSplit8K_NULLS100011100.293001510
3DelimitedSplit8K_NULLS100011100.300001510
4DelimitedSplit8K100021100.0160031121
5DelimitedSplit8K_NULLS100021100.3600031121
6DelimitedSplit8K_NULLS100021100.2400031121
7DelimitedSplit8K100041100.0330092540
8DelimitedSplit8K_NULLS100041100.4030092540
9DelimitedSplit8K_NULLS100041100.3100092540
10DelimitedSplit8K100081100.05600295074
11DelimitedSplit8K_NULLS100081100.51300295074
12DelimitedSplit8K_NULLS100081100.58600295074
13DelimitedSplit8K1000161100.1100072102137
14DelimitedSplit8K_NULLS1000161100.8430072102137
15DelimitedSplit8K_NULLS1000161100.6960072102137
16DelimitedSplit8K1000321100.24300148207251
17DelimitedSplit8K_NULLS1000321101.37600148207251
18DelimitedSplit8K_NULLS1000321101.23600148207251
19DelimitedSplit8K1000641100.40300328415482
20DelimitedSplit8K_NULLS1000641102.48600328415482
21DelimitedSplit8K_NULLS1000641102.29000328415482
22DelimitedSplit8K10001281100.82600734832942
23DelimitedSplit8K_NULLS10001281104.69600734832942
24DelimitedSplit8K_NULLS10001281104.46000734832942
25DelimitedSplit8K10002561101.62000152816611815
26DelimitedSplit8K_NULLS10002561108.88000152816611815
27DelimitedSplit8K_NULLS10002561108.87300152816611815
28DelimitedSplit8K10005121103.21000312433223493
29DelimitedSplit8K_NULLS100051211018.02600312433223493
30DelimitedSplit8K_NULLS100051211017.05000312433223493
31DelimitedSplit8K100011501107.44600715874787798
32DelimitedSplit8K_NULLS1000115011040.46000715874787798
33DelimitedSplit8K_NULLS1000115011037.25000715874787798
34DelimitedSplit8K1000110200.01600101420
35DelimitedSplit8K_NULLS1000110200.20600101420
36DelimitedSplit8K_NULLS1000110200.20600101420
37DelimitedSplit8K1000210200.02000213041
38DelimitedSplit8K_NULLS1000210200.30000213041
39DelimitedSplit8K_NULLS1000210200.30600213041
40DelimitedSplit8K1000410200.04600466281
41DelimitedSplit8K_NULLS1000410200.46000466281
42DelimitedSplit8K_NULLS1000410200.46000466281
43DelimitedSplit8K1000810200.09600102126155
44DelimitedSplit8K_NULLS1000810200.74000102126155
45DelimitedSplit8K_NULLS1000810200.75000102126155
46DelimitedSplit8K10001610200.18300216254292
47DelimitedSplit8K_NULLS10001610201.32000216254292
48DelimitedSplit8K_NULLS10001610201.32300216254292
49DelimitedSplit8K10003210200.36000458510586
50DelimitedSplit8K_NULLS10003210202.52300458510586
51DelimitedSplit8K_NULLS10003210202.49600458510586
52DelimitedSplit8K10006410200.7400094510221097
53DelimitedSplit8K_NULLS10006410204.7560094510221097
54DelimitedSplit8K_NULLS10006410204.7900094510221097
55DelimitedSplit8K100012810201.44000194420472156
56DelimitedSplit8K_NULLS100012810209.38300194420472156
57DelimitedSplit8K_NULLS100012810209.44600194420472156
58DelimitedSplit8K100025610202.88300390340964254
59DelimitedSplit8K_NULLS1000256102018.67000390340964254
60DelimitedSplit8K_NULLS1000256102018.86600390340964254
61DelimitedSplit8K100048010205.45600743276807881
62DelimitedSplit8K_NULLS1000480102034.67600743276807881
63DelimitedSplit8K_NULLS1000480102035.24600743276807881
64DelimitedSplit8K1000120300.01300202430
65DelimitedSplit8K_NULLS1000120300.24000202430
66DelimitedSplit8K_NULLS1000120300.24300202430
67DelimitedSplit8K1000220300.02300415161
68DelimitedSplit8K_NULLS1000220300.41300415161
69DelimitedSplit8K_NULLS1000220300.40300415161
70DelimitedSplit8K1000420300.0700086103121
71DelimitedSplit8K_NULLS1000420300.6600086103121
72DelimitedSplit8K_NULLS1000420300.6600086103121
73DelimitedSplit8K1000820300.13600180206232
74DelimitedSplit8K_NULLS1000820301.15600180206232
75DelimitedSplit8K_NULLS1000820301.19300180206232
76DelimitedSplit8K10001620300.27000379415456
77DelimitedSplit8K_NULLS10001620302.17300379415456
78DelimitedSplit8K_NULLS10001620302.21600379415456
79DelimitedSplit8K10003220300.52600774830892
80DelimitedSplit8K_NULLS10003220304.18600774830892
81DelimitedSplit8K_NULLS10003220304.33000774830892
82DelimitedSplit8K10006420301.05600158216631747
83DelimitedSplit8K_NULLS10006420308.30000158216631747
84DelimitedSplit8K_NULLS10006420308.37300158216631747
85DelimitedSplit8K100012820302.19300320833253442
86DelimitedSplit8K_NULLS1000128203016.45300320833253442
87DelimitedSplit8K_NULLS1000128203016.68600320833253442
88DelimitedSplit8K100025620304.22000648466576834
89DelimitedSplit8K_NULLS1000256203032.53000648466576834
90DelimitedSplit8K_NULLS1000256203032.85000648466576834
91DelimitedSplit8K100029020304.76000736975397712
92DelimitedSplit8K_NULLS1000290203037.29000736975397712
93DelimitedSplit8K_NULLS1000290203038.92000736975397712
94DelimitedSplit8K1000130400.01300303440
95DelimitedSplit8K_NULLS1000130400.27300303440
96DelimitedSplit8K_NULLS1000130400.28000303440
97DelimitedSplit8K1000230400.02600617081
98DelimitedSplit8K_NULLS1000230400.50300617081
99DelimitedSplit8K_NULLS1000230400.51000617081
100DelimitedSplit8K1000430400.09300124142158
101DelimitedSplit8K_NULLS1000430400.89300124142158
102DelimitedSplit8K_NULLS1000430400.93600124142158
103DelimitedSplit8K1000830400.18600262286314
104DelimitedSplit8K_NULLS1000830401.66300262286314
105DelimitedSplit8K_NULLS1000830401.67600262286314
106DelimitedSplit8K10001630400.35000529574613
107DelimitedSplit8K_NULLS10001630403.19300529574613
108DelimitedSplit8K_NULLS10001630403.25000529574613
109DelimitedSplit8K10003230400.71300109711501222
110DelimitedSplit8K_NULLS10003230407.12600109711501222
111DelimitedSplit8K_NULLS10003230406.80300109711501222
112DelimitedSplit8K10006430401.51600222523022376
113DelimitedSplit8K_NULLS100064304012.73300222523022376
114DelimitedSplit8K_NULLS100064304012.53000222523022376
115DelimitedSplit8K100012830402.85600448746044706
116DelimitedSplit8K_NULLS1000128304024.68000448746044706
117DelimitedSplit8K_NULLS1000128304024.94300448746044706
118DelimitedSplit8K100021030404.99600737875587705
119DelimitedSplit8K_NULLS1000210304041.28300737875587705
120DelimitedSplit8K_NULLS1000210304041.99300737875587705
121DelimitedSplit8K1000140500.01300404450
122DelimitedSplit8K_NULLS1000140500.33300404450
123DelimitedSplit8K_NULLS1000140500.34000404450
124DelimitedSplit8K1000240500.030008191101
125DelimitedSplit8K_NULLS1000240500.656008191101
126DelimitedSplit8K_NULLS1000240500.680008191101
127DelimitedSplit8K1000440500.11600166183199
128DelimitedSplit8K_NULLS1000440501.19600166183199
129DelimitedSplit8K_NULLS1000440501.22600166183199
130DelimitedSplit8K1000840500.22300341367392
131DelimitedSplit8K_NULLS1000840502.44000341367392
132DelimitedSplit8K_NULLS1000840502.31300341367392
133DelimitedSplit8K10001640500.47600700735777
134DelimitedSplit8K_NULLS10001640504.64300700735777
135DelimitedSplit8K_NULLS10001640504.69600700735777
136DelimitedSplit8K10003240500.91000140114711531
137DelimitedSplit8K_NULLS10003240509.08000140114711531
138DelimitedSplit8K_NULLS10003240509.06600140114711531
139DelimitedSplit8K10006440501.77600286029453026
140DelimitedSplit8K_NULLS100064405017.82300286029453026
141DelimitedSplit8K_NULLS100064405017.84600286029453026
142DelimitedSplit8K100012840503.54300578558875997
143DelimitedSplit8K_NULLS1000128405035.12000578558875997
144DelimitedSplit8K_NULLS1000128405036.19600578558875997
145DelimitedSplit8K100016540504.55600747375897743
146DelimitedSplit8K_NULLS1000165405045.69600747375897743
147DelimitedSplit8K_NULLS1000165405045.45300747375897743
Here's the revised function I tested. I won't claim that it is the best variation--just a quick-and-dirty trial run for comparison purposes.
/*
--Some test strings for manual testing
SELECT * FROM dbo.DelimitedSplit8K_NULLS(' ',' ',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('',' ',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS(NULL,' ',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS(' ',',',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('',',',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS(NULL,',',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('123 467 789',' ',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('123 467 789',' ',1)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('123 467 789',' ',1)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('123 467 789',' ',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('123, 467, 789',',',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('123, 467, 789',',',1)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('123,467,789',',',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('123,467,789',',',1)
*/
CREATE FUNCTION [dbo].[DelimitedSplit8K_NULLS]
(
@pString VARCHAR(8000)
,@pDelimiter CHAR(1)
,@bRemoveSpaceAfterDelimiter BIT
)
RETURNS @Split TABLE
(
[ItemNumber] INT
,[Item] VARCHAR(50)
)
WITH SCHEMABINDING
AS
BEGIN
SET @pString = NULLIF(RTRIM(LTRIM(@pString)),'')
IF @bRemoveSpaceAfterDelimiter = 1
SET @pString = REPLACE(@pString,@pDelimiter+' ',@pDelimiter)
IF @pString IS NULL
RETURN
ELSE
WITH E1(N)
AS (
SELECT
1
UNION ALL
SELECT
1
UNION ALL
SELECT
1
UNION ALL
SELECT
1
UNION ALL
SELECT
1
UNION ALL
SELECT
1
UNION ALL
SELECT
1
UNION ALL
SELECT
1
UNION ALL
SELECT
1
UNION ALL
SELECT
1
), --10E+1 or 10 rows
E2(N)
AS (
SELECT
1
FROM
E1 a
,E1 b
), --10E+2 or 100 rows
E4(N)
AS (
SELECT
1
FROM
E2 a
,E2 b
), --10E+4 or 10,000 rows max
cteTally(N)
AS (
SELECT TOP (ISNULL(DATALENGTH(@pString),0))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM E4
),
cteStart(N1)
AS (
SELECT
1
UNION ALL
SELECT
t.N + 1
FROM
cteTally t
WHERE
SUBSTRING(@pString,t.N,1) = @pDelimiter
),
cteLen(N1,L1)
AS (
SELECT
s.N1
,ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0) - s.N1,8000)
FROM
cteStart s
)
INSERT INTO @Split
SELECT
ItemNumber = ROW_NUMBER() OVER (ORDER BY l.N1)
,Item = SUBSTRING(@pString,l.N1,l.L1)
FROM
cteLen l
WHERE
NULLIF(RTRIM(LTRIM(SUBSTRING(@pString,l.N1,l.L1))),'') IS NOT NULL
RETURN
END
January 16, 2013 at 1:40 pm
Due to the additional checking for nulls/blanks and removing blank rows from the returned table I figured efficiency would be seriously compromised. So I tested this revision against the original and it seriously under performs. In the results below the first run of DelimitedSplit8k_NULLS sets the parameter "@bRemoveSpacesAfterDelimiter" to 0 and the 2nd run set it to 1.
Steve,
I came up with the same results after doing almost exactly the same thing.
In fact, I found that even after taking out the cleaning of the duplicate characters and check for null, the function with the return table declaration is a lot slower than the original single statement function. I wonder if SQL Server is using a different memory management scheme.
I guess having the WHERE clause in the CROSS APPLY do the work gives the next person looking at my code an idea of what I am trying to do.
Either way it has been a fun afternoon of testing the assumptions and changes. This is a perfect example of why I still love programming after over 20 years of doing this stuff. Going through Jeff's code line by line was quite a revelation in how to use CTE, NULLIF and "Tally" tables in ways I never thought of.
Anton
January 16, 2013 at 9:38 pm
Once you switch from an iTVF to a multi-statement TVF you are dead meat and the Fat Lady is belting a tune. You basically get a cursor under the covers because the mTVF cannot be truly in-lined with the query/queryplan. The optimization process is shot to hell too (although that won't come into play here). Voids parallelism also IIRC.
Try doing a statement-level completion profiler trace while running each type of code. CAUTION: beware statement completed profiling on production systems!
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
January 16, 2013 at 11:02 pm
TheSQLGuru (1/16/2013)
Once you switch from an iTVF to a multi-statement TVF you are dead meat and the Fat Lady is belting a tune. You basically get a cursor under the covers
What was startling for me was not that functions acted like an implicit cursor (I'd always known that) but rather that you could create inline functions that don't!
January 17, 2013 at 3:01 am
aochss (1/16/2013)
The performance of this is amazing, as is the thought behind the code. These kind of posts really make you think and learn a whole new set of features in SQLThanks Jeff.
A couple of questions (sorry if it has been addressed - I didn't see it mentioned in the thread):
I would not expect a record back if the string being sent in is NULL, a single space or all spaces. In my case, I have records where the string being split is often blank and I would want no records coming back from the CROSS APPLY. I could filter them out using in the WHERE clause.
The best indeed would be to filter them out in the WHERE clause with something like WHERE LEN(StringColumn) > 0. Hence, the function calls would be for only those rows which are neither NULL nor containing only space/s.
But this can also be done in the function itself as I have tried to do it.
Also, some of my "not so good" data also had multiple spaces in between the values causing numerous records returned that were blank.For example:
dbo.DelimitedSplit8k('123 467 789', ' ') -- There are two spaces between 467 and 789
ItemNumber Item
1 123
2 467
3 <-- Empty String
4 789
Thanks Again,
Anton
Here is my attempt while keeping the function iTVF (I did not test it thoroughly). Only slight changes are done
CREATE FUNCTION [dbo].[DelimitedSplit8K_With_NULL_ZeroLengthString_And_ConsecutiveDelimiter_Filter]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter VARCHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
/*========= USE LEN() FUNCTION TO CHECK WHETHER STRING IS NULL OR CONTAIN ONLY SPACES =======*/
WHERE LEN(@pString) > 0 --FILTER OUT NON_REQUIRED
/*===========================================================================================*/
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1
/*========= USE LEN() FUNCTION TO CHECK WHETHER STRING IS NULL OR CONTAIN ONLY SPACES =======*/
WHERE LEN(@pString) > 0 --FILTER OUT NON_REQUIRED
/*===========================================================================================*/
UNION ALL
SELECT t.N+1 FROM cteTally t
WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
/*====== --FILTER OUT TWO CONSECUTIVE OCCURENCES OF DELIMITERS ============*/
AND SUBSTRING(@pString,t.N+1,1) <> @pDelimiter
/*=========================================================================*/
),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
SELECT s.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
FROM cteStart s
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = RTRIM(LTRIM(SUBSTRING(@pString, l.N1, l.L1)))--USE LTRIM AND RTRIM FUNCTION TO OMIT TRAILING SPACES
FROM cteLen l
;
January 17, 2013 at 3:14 am
TheSQLGuru (1/16/2013)
Once you switch from an iTVF to a multi-statement TVF you are dead meat and the Fat Lady is belting a tune.
Not always true 🙂 It really depends upon several things. For instance, if we have to deal with VARCHAR(max) strings and multiple rows, the mTVF version of the tally table splitter can beat the iTVF version quite comprehensively. Sometimes, mTVF versions can show their worth too 😉
I am not trying to be infamous but sometimes RBAR approach could be handy as well :-D. I hope my Anti-RBAR membership is not cancelled for issuing this statement :hehe:
January 17, 2013 at 3:35 am
Usman Butt (1/17/2013)
TheSQLGuru (1/16/2013)
Once you switch from an iTVF to a multi-statement TVF you are dead meat and the Fat Lady is belting a tune.Not always true 🙂 It really depends upon several things. For instance, if we have to deal with VARCHAR(max) strings and multiple rows, the mTVF version of the tally table splitter can beat the iTVF version quite comprehensively. Sometimes, mTVF versions can show their worth too 😉
I am not trying to be infamous but sometimes RBAR approach could be handy as well :-D. I hope my Anti-RBAR membership is not cancelled for issuing this statement :hehe:
Multi-line TVF work well when they are called/invoked sparsly or only once as say the logical starting table in your from clause, before other joins take place. They can also work quite well as part of an additional condition in a join. If it is not the only condition and applied using OR logic, it only needs to be invoked when other simpler conditions do not match.
mTVF also allow some control over the indexing of the result table (trough PK and unique constraints). This can speed up certain operations in complex processing as the required indexing on the datamodel might not be there and/or too costly to implement for all records.
What I generaly do is write first as an iTVF and if for some reason it turns out unexpectedly slow or impractical, make a mTVF from it as well with proper indexing to speed up the propblem operation.
January 17, 2013 at 9:49 am
I have no problem whatsoever in using the right tool for the job!! In fact, I make a fair bit of my consulting dollars cleaning up messes from clients who do NOT do that! 😎 Sometimes being "dead meat" IS the better alternative. :w00t:
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
February 11, 2013 at 12:55 pm
Hello,
Greatt article. I used the function and works great. I have a situation where I need to search for more than 1 character, then split the data. How could I use this function to search for 2 characters. For example, I want to search for semicolon, or, comma (;,), then, split the column if either is found.
February 11, 2013 at 12:59 pm
How can I use this function to search for more than 1 character? I have data that is delimited by either a comma or semicolon (,;). Need to split contents based on either character.
February 11, 2013 at 1:03 pm
ahpitre (2/11/2013)
How can I use this function to search for more than 1 character? I have data that is delimited by either a comma or semicolon (,;). Need to split contents based on either character.
Easiest way I can think of would be to simply use replace.
select *
from YourTable
cross apply dbo.DelimitedSplit8k(replace(YourColumn, ';', ','), ',')
This way you are still splitting on commas but it will work for either character. 😉
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
February 11, 2013 at 3:26 pm
ahpitre (2/11/2013)
How can I use this function to search for more than 1 character? I have data that is delimited by either a comma or semicolon (,;). Need to split contents based on either character.
Just throw in a replace to change the second character into the desired delimiter. I can't guess what effect this would have on performance of the function.
If you are talking about a two-dimensional array like 'A,B;X,Y;D,E' and you want to split that into two columns, then that's a different problem.
DECLARE @strExample VARCHAR(8000)
SET @strExample = 'A,B;C;D,E,F,G,H'
SELECT
ItemNumber
,Item
FROM
dbo.DelimitedSplit8K(REPLACE(@strExample,';',','),',') AS dsk
February 11, 2013 at 3:28 pm
Sorry Sean...I didn't see that you had already answered with exactly the same idea. 😛
February 11, 2013 at 3:39 pm
Steven Willis (2/11/2013)
Sorry Sean...I didn't see that you had already answered with exactly the same idea. 😛
No worries. Great minds...
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
February 11, 2013 at 5:13 pm
If you have two delimiters you can change the code slightly:
DECLARE @pString varchar(8000) = '55555;4444,333,22,1';
DECLARE @pDelimiter1 char(1) = ',';
DECLARE @pDelimiter2 char(2) = ';';
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) in (@pDelimiter1, @pDelimiter2)
),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
SELECT s.N1,
Case when ISNULL(NULLIF(CHARINDEX(@pDelimiter1,@pString,s.N1),0)-s.N1,8000)
< ISNULL(NULLIF(CHARINDEX(@pDelimiter2,@pString,s.N1),0)-s.N1,8000)
then ISNULL(NULLIF(CHARINDEX(@pDelimiter1,@pString,s.N1),0)-s.N1,8000)
else ISNULL(NULLIF(CHARINDEX(@pDelimiter2,@pString,s.N1),0)-s.N1,8000)
end
FROM cteStart s
)
select * from cteLen;
Performance of the second charindex might be similar to using replace on the input string. An alternative would be do join cteStart to itself to derive cteLen. For a single delimiter it is slower than charindex but for multiples it might be better. Certainly for more than 2 delimiters the self join would be simpler code to read.
Viewing 15 posts - 481 through 495 (of 990 total)
You must be logged in to reply to this topic. Login to reply