Retrieve String Between Two Delimiters for Multiple Occurences SQL Server

  • In SQL Server

    How can we get the result set as TESTING1,TESTING2 from below

    DECLARE @MyString varchar(256) = '$I10~TESTING1$XYZ$I10~TESTING2$~'

    Basically i need to get all the substrings which are in between $I10~ and $

  • First welcome to SSC.

    My solution uses a hacked version of Jeff Moden's DelimitedSplit8K (the splitter mentioned in my signature). If you have not worked with a splitter before I highly suggest you read that article.

    My Hacked version of Jeff's splitter splits a string based on a delimiter that is longer than 1. I just wanted to put out a quick disclaimer here - this is not Jeff Moden code - this is my hacked version of his code that I put together some time ago and am still testing. With that said, here's the splitter:

    CREATE FUNCTION dbo.DelimitedSplit8KD(@pString VARCHAR(8000), @pDelimiter VARCHAR(100))

    /*

    General Information:

    Created by Alan Burstein in September, 2015.

    It is a hacked version of Jeff Moden's delimitedSplit8K which can be found here:

    http://www.sqlservercentral.com/articles/Tally+Table/72993/

    Purpose:

    Splits a given string at a given delimiter and return a list of the split elements (items).

    Accepts a varchar input string up to 8,000 characters long and a delimiter (literal) up

    to 100 characters long.

    Usage Example:

    SELECT * FROM dbo.DelimitedSplit8KD('123xxx555xxx999','xxx')

    ItemNumber Item

    ----------- --------

    1 123

    2 555

    3 999

    Revision History:

    Rev 00 - September 2010 - Initial Development: Alan Burstein

    */

    RETURNS TABLE WITH SCHEMABINDING AS RETURN

    WITH E1(N) AS

    (

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

    ), --10E+1 or 10 rows

    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows

    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max

    cteTally(N) AS

    (--==== This provides the "base" CTE and limits the number of rows right up front

    -- for both a performance gain and prevention of accidental "overruns"

    SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM E4

    ),

    cteStart(N1) AS

    (--==== This returns N+1 (starting position of each "element" just once for each delimiter)

    SELECT 1 UNION ALL

    SELECT t.N+LEN(@pDelimiter)

    FROM cteTally t

    WHERE SUBSTRING(@pString,t.N,LEN(@pDelimiter)) = @pDelimiter

    ),

    cteLen(N1,L1) AS

    (--==== Return start and length (for use in substring)

    SELECT s.N1, ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)

    FROM cteStart s

    )

    --===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.

    SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),

    Item = SUBSTRING(@pString, l.N1, l.L1)

    FROM cteLen l;

    GO

    Here's the solution using the splitter:

    DECLARE @MyString varchar(256) = '$I10~TESTING1$XYZ$I10~TESTING2$~';

    SELECT NewString =

    STUFF

    ((

    SELECT ','+SUBSTRING(Item, 1, CHARINDEX('$',item)-1)

    FROM dbo.DelimitedSplit8KD(@MyString,'$I10~')

    WHERE Item <> ''

    FOR XML PATH(''), TYPE

    ).value('.','varchar(256)'),1,1,'')

    Results:

    NewString

    -------------

    TESTING1,TESTING2

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan beat me to the punch... But this is a different approach, w/o using a TVF...

    DECLARE @MyString varchar(256) = '$I10~TESTING1$XYZ$I10~TESTING2$~';

    WITH n (n) AS (

    SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)

    ), Tally (n) AS (

    SELECT TOP (LEN(@MyString))

    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM n n1, n n2, n n3, n n4

    )

    SELECT STUFF((

    SELECT ',' + SUBSTRING(@MyString, t.n + 5, CHARINDEX('$', @MyString, t.n + 5) - (t.n + 5))

    FROM Tally t

    WHERE SUBSTRING(@MyString, t.n, 5) = '$I10~'

    FOR XML PATH ('')), 1, 1, '') AS CleanString

    ;

    Results...

    CleanString

    -----------------

    TESTING1,TESTING2

  • Alan.B (10/21/2015)


    My Hacked version of Jeff's splitter splits a string based on a delimiter that is longer than 1. I

    Just a suggestion and looking at the data, I wonder why a multi-character delimiter is needed at all (unless the OP hasn't show a proper representation of the data). The desired extractions seem to always start with the single character of "~" and end with the single character delimiter of "$".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (10/25/2015)


    Alan.B (10/21/2015)


    My Hacked version of Jeff's splitter splits a string based on a delimiter that is longer than 1. I

    Just a suggestion and looking at the data, I wonder why a multi-character delimiter is needed at all (unless the OP hasn't show a proper representation of the data). The desired extractions seem to always start with the single character of "~" and end with the single character delimiter of "$".

    Yep, seems that I unnecessarily over-complicated the solution. I don't know why I missed that.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • declare @STR nvarchar(1000), @str1 nvarchar(400), @str2 nvarchar(500), @concat nvarchar(1000), @i int, @j-2 int

    set @STR = '$I10~TESTING1$XYZ$I10~TESTING2$ABZ$I10~TESTING4'

    select @str1 = SUBSTRING(@str,CHARINDEX('$I10~',@str) +5,7)

    select @j-2 = ( LEN(@str)

    - LEN( REPLACE ( @STR, @str1, '') )

    ) / LEN(@str1)

    set @concat = ''

    set @i=1

    while @i < = @j-2

    begin

    declare @str3 varchar(20);

    set @str3 = SUBSTRING(@str,CHARINDEX('$I10~',@str) ,13)

    set @str2 = right(@str3,1)

    set @STR = replace (@str,@str3,'')

    set @concat = @concat + @str1 + @str2 + ','

    set @i = @i+1

    end

    select @concat

  • TEJABI (10/26/2015)


    declare @STR nvarchar(1000), @str1 nvarchar(400), @str2 nvarchar(500), @concat nvarchar(1000), @i int, @j-2 int

    set @STR = '$I10~TESTING1$XYZ$I10~TESTING2$ABZ$I10~TESTING4'

    select @str1 = SUBSTRING(@str,CHARINDEX('$I10~',@str) +5,7)

    select @j-2 = ( LEN(@str)

    - LEN( REPLACE ( @STR, @str1, '') )

    ) / LEN(@str1)

    set @concat = ''

    set @i=1

    while @i < = @j-2

    begin

    declare @str3 varchar(20);

    set @str3 = SUBSTRING(@str,CHARINDEX('$I10~',@str) ,13)

    set @str2 = right(@str3,1)

    set @STR = replace (@str,@str3,'')

    set @concat = @concat + @str1 + @str2 + ','

    set @i = @i+1

    end

    select @concat

    This will only work if what you're searching for is always 8 characters long and will be pretty slow.

    Note what happens when:

    @STR = '$I10~TESTING1$XYZ$I10~TESTING2$ABZ$I10~TEST3$I10~TESTING11$I10~TESTING15'

    A faster set-based approach would look like this:

    declare @STR nvarchar(1000) = '$I10~TESTING1$XYZ$I10~TESTING2$ABZ$I10~TEST3$I10~TESTING11$I10~TESTING15';

    SELECT NewString =

    STUFF

    ((SELECT ','+SUBSTRING(item, CHARINDEX('~',item)+1,8000)

    FROM dbo.DelimitedSplit8K(@str, '$')

    WHERE Item LIKE '%~%'

    FOR XML PATH('')),1,1,'');

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply