Retrieve String Between Two Delimiters for Multiple Occurences SQL Server

Question

Retrieve String Between Two Delimiters for Multiple Occurences SQL Server

shivas.sql

Old Hand

Points: 369
More actions
October 21, 2015 at 4:51 pm

#305792

In SQL Server
How can we get the result set as TESTING1,TESTING2 from below
DECLARE @MyString varchar(256) = '$I10~TESTING1$XYZ$I10~TESTING2$~'
Basically i need to get all the substrings which are in between $I10~ and $

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Alan Burstein SSC Guru Points: 61141 More actions · Answer 1

First welcome to SSC.

My solution uses a hacked version of Jeff Moden's DelimitedSplit8K (the splitter mentioned in my signature). If you have not worked with a splitter before I highly suggest you read that article.

My Hacked version of Jeff's splitter splits a string based on a delimiter that is longer than 1. I just wanted to put out a quick disclaimer here - this is not Jeff Moden code - this is my hacked version of his code that I put together some time ago and am still testing. With that said, here's the splitter:

CREATE FUNCTION dbo.DelimitedSplit8KD(@pString VARCHAR(8000), @pDelimiter VARCHAR(100))

/*

General Information:

Created by Alan Burstein in September, 2015.

It is a hacked version of Jeff Moden's delimitedSplit8K which can be found here:

http://www.sqlservercentral.com/articles/Tally+Table/72993/

Purpose:

Splits a given string at a given delimiter and return a list of the split elements (items).

Accepts a varchar input string up to 8,000 characters long and a delimiter (literal) up

to 100 characters long.

Usage Example:

SELECT * FROM dbo.DelimitedSplit8KD('123xxx555xxx999','xxx')

ItemNumber Item

----------- --------

1 123

2 555

3 999

Revision History:

Rev 00 - September 2010 - Initial Development: Alan Burstein

*/

RETURNS TABLE WITH SCHEMABINDING AS RETURN

WITH E1(N) AS

(

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

), --10E+1 or 10 rows

E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows

E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max

cteTally(N) AS

(--==== This provides the "base" CTE and limits the number of rows right up front

-- for both a performance gain and prevention of accidental "overruns"

SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

FROM E4

),

cteStart(N1) AS

(--==== This returns N+1 (starting position of each "element" just once for each delimiter)

SELECT 1 UNION ALL

SELECT t.N+LEN(@pDelimiter)

FROM cteTally t

WHERE SUBSTRING(@pString,t.N,LEN(@pDelimiter)) = @pDelimiter

),

cteLen(N1,L1) AS

(--==== Return start and length (for use in substring)

SELECT s.N1, ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)

FROM cteStart s

)

--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.

SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),

Item = SUBSTRING(@pString, l.N1, l.L1)

FROM cteLen l;

GO

Here's the solution using the splitter:

DECLARE @MyString varchar(256) = '$I10~TESTING1$XYZ$I10~TESTING2$~';

SELECT NewString =

STUFF

((

SELECT ','+SUBSTRING(Item, 1, CHARINDEX('$',item)-1)

FROM dbo.DelimitedSplit8KD(@MyString,'$I10~')

WHERE Item <> ''

FOR XML PATH(''), TYPE

).value('.','varchar(256)'),1,1,'')

Results:

NewString

-------------

TESTING1,TESTING2

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Jason A. Long SSC-Insane Points: 23786 More actions · Answer 2

Alan beat me to the punch... But this is a different approach, w/o using a TVF...

DECLARE @MyString varchar(256) = '$I10~TESTING1$XYZ$I10~TESTING2$~';

WITH n (n) AS (

SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)

), Tally (n) AS (

SELECT TOP (LEN(@MyString))

ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

FROM n n1, n n2, n n3, n n4

)

SELECT STUFF((

SELECT ',' + SUBSTRING(@MyString, t.n + 5, CHARINDEX('$', @MyString, t.n + 5) - (t.n + 5))

FROM Tally t

WHERE SUBSTRING(@MyString, t.n, 5) = '$I10~'

FOR XML PATH ('')), 1, 1, '') AS CleanString

;

Results...

CleanString

-----------------

TESTING1,TESTING2

Jeff Moden SSC Guru Points: 1004353 More actions · Answer 3

Alan.B (10/21/2015)
My Hacked version of Jeff's splitter splits a string based on a delimiter that is longer than 1. I

Just a suggestion and looking at the data, I wonder why a multi-character delimiter is needed at all (unless the OP hasn't show a proper representation of the data). The desired extractions seem to always start with the single character of "~" and end with the single character delimiter of "$".

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Alan Burstein SSC Guru Points: 61141 More actions · Answer 4

Jeff Moden (10/25/2015)
Alan.B (10/21/2015)
My Hacked version of Jeff's splitter splits a string based on a delimiter that is longer than 1. I
Just a suggestion and looking at the data, I wonder why a multi-character delimiter is needed at all (unless the OP hasn't show a proper representation of the data). The desired extractions seem to always start with the single character of "~" and end with the single character delimiter of "$".

Yep, seems that I unnecessarily over-complicated the solution. I don't know why I missed that.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

TEJABI Old Hand Points: 368 More actions · Answer 5

declare @STR nvarchar(1000), @str1 nvarchar(400), @str2 nvarchar(500), @concat nvarchar(1000), @i int, @j-2 int

set @STR = '$I10~TESTING1$XYZ$I10~TESTING2$ABZ$I10~TESTING4'

select @str1 = SUBSTRING(@str,CHARINDEX('$I10~',@str) +5,7)

select @j-2 = ( LEN(@str)

- LEN( REPLACE ( @STR, @str1, '') )

) / LEN(@str1)

set @concat = ''

set @i=1

while @i < = @j-2

begin

declare @str3 varchar(20);

set @str3 = SUBSTRING(@str,CHARINDEX('$I10~',@str) ,13)

set @str2 = right(@str3,1)

set @STR = replace (@str,@str3,'')

set @concat = @concat + @str1 + @str2 + ','

set @i = @i+1

end

select @concat

Alan Burstein SSC Guru Points: 61141 More actions · Answer 6

TEJABI (10/26/2015)
declare @STR nvarchar(1000), @str1 nvarchar(400), @str2 nvarchar(500), @concat nvarchar(1000), @i int, @j-2 int
set @STR = '$I10~TESTING1$XYZ$I10~TESTING2$ABZ$I10~TESTING4'
select @str1 = SUBSTRING(@str,CHARINDEX('$I10~',@str) +5,7)
select @j-2 = ( LEN(@str)
- LEN( REPLACE ( @STR, @str1, '') )
) / LEN(@str1)
set @concat = ''
set @i=1
while @i < = @j-2
begin
declare @str3 varchar(20);
set @str3 = SUBSTRING(@str,CHARINDEX('$I10~',@str) ,13)
set @str2 = right(@str3,1)
set @STR = replace (@str,@str3,'')
set @concat = @concat + @str1 + @str2 + ','
set @i = @i+1
end
select @concat

This will only work if what you're searching for is always 8 characters long and will be pretty slow.

Note what happens when:

@STR = '$I10~TESTING1$XYZ$I10~TESTING2$ABZ$I10~TEST3$I10~TESTING11$I10~TESTING15'

A faster set-based approach would look like this:

declare @STR nvarchar(1000) = '$I10~TESTING1$XYZ$I10~TESTING2$ABZ$I10~TEST3$I10~TESTING11$I10~TESTING15';

SELECT NewString =

STUFF

((SELECT ','+SUBSTRING(item, CHARINDEX('~',item)+1,8000)

FROM dbo.DelimitedSplit8K(@str, '$')

WHERE Item LIKE '%~%'

FOR XML PATH('')),1,1,'');

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001