Technical Article

Delimited String Parsing Functions - Big2D set

by Jesse Roberge - YeshuaAgapao@gmail.com
Update: Added robustness for NULL inputs and made it return no rows on blank inputs.

Feed it a large string of double-delimited horizontal data and it returns the data as a non-pivoted vertical table indexed by row and column position (a 2-dimensional star schema).
The Big2D function set supports delimited strings longer than 8000 characters, but each individual element must be 8000 characters or fewer.
If you want maximum performance and do not need to process delimited strings longer than 8000 characters, use the 2D function set instead of the Big2D function set.
Requires a table of numbers. These functions expect it to be called 'Counter', with an integer column named PK_CountID, in the same database that you save these functions to.
Search for 'Counter table (table of numbers) setter-upper for SQL Server 2005' or 'Counter table (table of numbers) setter-upper for SQL Server 2000' if you need a script to set this up for you.
SQL Server 2005 only (the functions use CROSS APPLY, which SQL Server 2000 does not support).
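
If the setter-upper script is not at hand, a table of numbers can be sketched as follows. This is only a minimal sketch, not the author's script: the table name 'Counter' and the column name PK_CountID come from the function bodies, while the row count of 8000 (enough to cover one 8000-character slice) and the use of sys.all_columns as a row source are assumptions.

```sql
-- Minimal sketch of a numbers table for the Big2D functions.
-- Assumption: 8000 rows are enough, because each slice is at most 8000 characters.
IF OBJECT_ID('dbo.Counter') IS NULL
BEGIN
    CREATE TABLE dbo.Counter
    (
        PK_CountID int NOT NULL PRIMARY KEY CLUSTERED
    );

    -- Cross-joining a system catalog view with itself is just a convenient
    -- way to generate rows; any sufficiently large row source would do.
    INSERT INTO dbo.Counter (PK_CountID)
    SELECT TOP (8000) ROW_NUMBER() OVER (ORDER BY a.object_id)
    FROM sys.all_columns AS a
    CROSS JOIN sys.all_columns AS b;
END
GO
```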

Variants:
Array: Has array position index; value data is not cast.
Table: No array position index; value data is not cast.
IntArray: Has array position index; value data is cast to int.
IntTable: No array position index; value data is cast to int.
In the Big2D delimiter function set, the table variants have some performance gain over the array variants, but are not very useful except in joins.
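
As an illustration of the join use case, the sketch below filters a table by a delimited list of IDs with the IntTable variant. The @Products table variable, its columns, and its rows are hypothetical and exist only for this example; the function name and its PK_IntID output column are taken from the script.

```sql
-- Hypothetical lookup table, used only for this illustration.
DECLARE @Products TABLE
(
    ProductID int NOT NULL PRIMARY KEY,
    Name varchar(50) NOT NULL
);

INSERT INTO @Products (ProductID, Name)
SELECT 11, 'Widget' UNION ALL
SELECT 22, 'Gadget' UNION ALL
SELECT 33, 'Gizmo';

-- The IntTable variant flattens every value into a single int column,
-- so it can be joined directly without the position columns.
SELECT P.ProductID, P.Name
FROM @Products AS P
JOIN dbo.fn_DelimitToIntTable_Big2D('11^22,33', ',', '^') AS Ids
    ON Ids.PK_IntID = P.ProductID;
```

Because the position columns are dropped, this form fits membership tests like the one above; use the Array variants when the row or column position of a value matters.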

Usage:
SELECT * FROM fn_DelimitToArray_Big2D ('red^square^square1^square2,green^rectangle^rectangle1^rectangle2,yellow^circle^circle1^circle2,blue^triangle^triangle1^triangle2,orange^oval^oval1^oval2,purple^hexagon^hexagon1^hexagon2',',','^') AS Delimit
SELECT * FROM fn_DelimitToIntArray_Big2D ('11^111^1111,22^222^2222,33^333^3333,44^444^4444,55^555^5555',',','^') AS Delimit
SELECT * FROM fn_DelimitToIntTable_Big2D ('11^111^1111,22^222^2222,33^333^3333,44^444^4444,55^555^5555',',','^') AS Delimit
SELECT * FROM fn_DelimitToTable_Big2D ('red^square^square1^square2,green^rectangle^rectangle1^rectangle2,yellow^circle^circle1^circle2,blue^triangle^triangle1^triangle2,orange^oval^oval1^oval2,purple^hexagon^hexagon1^hexagon2',',','^') AS Delimit
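
The Array variants return one row per value with its RowPos and ColPos, so the result can be pivoted back into columns when needed. The query below is a sketch of one way to do that; the output column names Color and Shape are assumptions made for this example.

```sql
-- The function returns (RowPos, ColPos, Value) rows:
--   (1, 1, 'red'), (1, 2, 'square'), (2, 1, 'green'), (2, 2, 'rectangle')
-- Grouping by RowPos and picking each ColPos turns them back into columns.
SELECT
    Delimit.RowPos,
    MAX(CASE WHEN Delimit.ColPos = 1 THEN Delimit.Value END) AS Color,
    MAX(CASE WHEN Delimit.ColPos = 2 THEN Delimit.Value END) AS Shape
FROM dbo.fn_DelimitToArray_Big2D('red^square,green^rectangle', ',', '^') AS Delimit
GROUP BY Delimit.RowPos
ORDER BY Delimit.RowPos;
-- RowPos  Color  Shape
-- 1       red    square
-- 2       green  rectangle
```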

Copyright:
Licensed under the LGPL, a weak copyleft license. You are permitted to use this as a component of a proprietary database and to call it from proprietary software.
Copyleft lets you do anything you want except plagiarize, conceal the source, or prohibit copying and redistribution of this script/proc.

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Lesser General Public License for more details.

see <http://www.fsf.org/licensing/licenses/lgpl.html> for the license text.

SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO

--*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

/*
Delimited String Parsing Functions - Big2D set
by Jesse Roberge - YeshuaAgapao@gmail.com
Update: Added robustness for NULL inputs and made it return no rows on blank inputs.

Feed it a large string of double-delimited horizontal data and it returns the data as a non-pivoted vertical table indexed by row and column position (a 2-dimensional star schema).
The Big2D function set supports delimited strings longer than 8000 characters, but each individual element must be 8000 characters or fewer.
If you want maximum performance and do not need to process delimited strings longer than 8000 characters, use the 2D function set instead of the Big2D function set.
Requires a table of numbers. These functions expect it to be called 'Counter', with an integer column named PK_CountID, in the same database that you save these functions to.
Search for 'Counter table (table of numbers) setter-upper for SQL Server 2005' or 'Counter table (table of numbers) setter-upper for SQL Server 2000' if you need a script to set this up for you.
SQL Server 2005 only (the functions use CROSS APPLY, which SQL Server 2000 does not support).

Variants:
Array: Has array position index; value data is not cast.
Table: No array position index; value data is not cast.
IntArray: Has array position index; value data is cast to int.
IntTable: No array position index; value data is cast to int.
In the Big2D delimiter function set, the table variants have some performance gain over the array variants, but are not very useful except in joins.

Usage:
SELECT * FROM fn_DelimitToArray_Big2D ('red^square^square1^square2,green^rectangle^rectangle1^rectangle2,yellow^circle^circle1^circle2,blue^triangle^triangle1^triangle2,orange^oval^oval1^oval2,purple^hexagon^hexagon1^hexagon2',',','^') AS Delimit
SELECT * FROM fn_DelimitToIntArray_Big2D ('11^111^1111,22^222^2222,33^333^3333,44^444^4444,55^555^5555',',','^') AS Delimit
SELECT * FROM fn_DelimitToIntTable_Big2D ('11^111^1111,22^222^2222,33^333^3333,44^444^4444,55^555^5555',',','^') AS Delimit
SELECT * FROM fn_DelimitToTable_Big2D ('red^square^square1^square2,green^rectangle^rectangle1^rectangle2,yellow^circle^circle1^circle2,blue^triangle^triangle1^triangle2,orange^oval^oval1^oval2,purple^hexagon^hexagon1^hexagon2',',','^') AS Delimit

Copyright:
Licensed under the LGPL, a weak copyleft license. You are permitted to use this as a component of a proprietary database and to call it from proprietary software.
Copyleft lets you do anything you want except plagiarize, conceal the source, or prohibit copying and redistribution of this script/proc.

This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU Lesser General Public License as
    published by the Free Software Foundation, either version 3 of the
    License, or (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU Lesser General Public License for more details.

    see <http://www.fsf.org/licensing/licenses/lgpl.html> for the license text.
*/
--*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

IF OBJECT_ID('dbo.fn_DelimitToArray_Big2D') IS NOT NULL DROP FUNCTION dbo.fn_DelimitToArray_Big2D
GO

CREATE FUNCTION dbo.fn_DelimitToArray_Big2D
(
    @String text,
    @Delimiter VarChar(1),
    @Delimiter2 VarChar(1)
)
RETURNS @T TABLE
(
    RowPos int NOT NULL,
    ColPos int NOT NULL,
    Value VarChar(8000) NOT NULL
)
AS

BEGIN

    -- Slices of @String, each cut on a @Delimiter boundary so that it fits in VarChar(8000).
    -- CumulativeElementCount is the number of row elements held by all earlier slices.
    DECLARE @Slices Table
    (
        Slice VarChar(8000) NOT NULL,
        CumulativeElementCount int NOT NULL
    )

    DECLARE @Slice VarChar(8000)
    DECLARE @TextPos int
    DECLARE @MaxLength int
    DECLARE @StopPos int
    DECLARE @StringLength int
    DECLARE @CumulativeElementCount int
    -- Reserve 2 characters for the delimiters wrapped around each slice.
    SELECT @TextPos = 1, @MaxLength = 8000 - 2, @CumulativeElementCount=0
    SELECT @StringLength=ISNULL(DATALENGTH(@String),0)-@MaxLength

    -- Cut full-size slices while the remaining text is too long for a single slice.
    WHILE @TextPos <= @StringLength
    BEGIN
        SELECT @Slice = SUBSTRING(@String, @TextPos, @MaxLength)
        -- Length of the slice up to, but not including, its last @Delimiter.
        SELECT @StopPos = @MaxLength - CHARINDEX(@Delimiter, REVERSE(@Slice))

        INSERT INTO @Slices (Slice, CumulativeElementCount) VALUES (@Delimiter + LEFT(@Slice, @StopPos) + @Delimiter, @CumulativeElementCount)

        SELECT @CumulativeElementCount=@CumulativeElementCount+LEN(@Slice)-LEN(REPLACE(@Slice, @Delimiter, ''))
        SELECT @TextPos = @TextPos + @StopPos + 1
    END
    -- Remaining tail; skipped when @String is NULL or empty.
    IF @StringLength>0-@MaxLength INSERT INTO @Slices (Slice, CumulativeElementCount) VALUES (@Delimiter + SUBSTRING(@String, @TextPos, @MaxLength) + @Delimiter, @CumulativeElementCount);

    INSERT INTO @T (RowPos, ColPos, Value)
    SELECT Counter1st.Pos AS RowPos, Counter2nd.Pos AS ColPos, Counter2nd.Value
    FROM
    (
        -- First pass: split each slice on @Delimiter into row elements.
        SELECT
            PK_CountID - LEN(REPLACE(LEFT(Slices.Slice, PK_CountID-1), @Delimiter, '')) + Slices.CumulativeElementCount AS Pos,
            SUBSTRING(Slices.Slice, Counter.PK_CountID + 1, CHARINDEX(@Delimiter, Slices.Slice, Counter.PK_CountID + 1) - Counter.PK_CountID - 1) AS Value
        FROM
            dbo.Counter WITH (NOLOCK)
            JOIN @Slices AS Slices ON
                Counter.PK_CountID>0 AND Counter.PK_CountID <= LEN(Slices.Slice) - 1 AND
                SUBSTRING(Slices.Slice, Counter.PK_CountID, 1) = @Delimiter
    ) AS Counter1st
    CROSS APPLY (
        -- Second pass: split each row element on @Delimiter2 into column values.
        SELECT
            PK_CountID - LEN(REPLACE(LEFT(Counter1st.Value, PK_CountID-1), @Delimiter2, '')) AS Pos,
            SUBSTRING(Counter1st.Value+@Delimiter2, PK_CountID, CHARINDEX(@Delimiter2, Counter1st.Value+@Delimiter2, PK_CountID)-PK_CountID) AS Value
        FROM dbo.Counter WITH (NOLOCK)
        WHERE PK_CountID >0 AND PK_CountID<LEN(Counter1st.Value)+LEN(@Delimiter2) AND SUBSTRING(@Delimiter2 + Counter1st.Value + @Delimiter2, PK_CountID, 1)=@Delimiter2
    ) AS Counter2nd
    RETURN

END
GO

--*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

IF OBJECT_ID('dbo.fn_DelimitToIntArray_Big2D') IS NOT NULL DROP FUNCTION dbo.fn_DelimitToIntArray_Big2D
GO

CREATE FUNCTION dbo.fn_DelimitToIntArray_Big2D
(
    @String text,
    @Delimiter VarChar(1),
    @Delimiter2 VarChar(1)
)
RETURNS @T TABLE
(
    RowPos int NOT NULL,
    ColPos int NOT NULL,
    PK_IntID int NOT NULL
)
AS

BEGIN

    -- Slices of @String, each cut on a @Delimiter boundary so that it fits in VarChar(8000).
    -- CumulativeElementCount is the number of row elements held by all earlier slices.
    DECLARE @Slices Table
    (
        Slice VarChar(8000) NOT NULL,
        CumulativeElementCount int NOT NULL
    )

    DECLARE @Slice VarChar(8000)
    DECLARE @TextPos int
    DECLARE @MaxLength int
    DECLARE @StopPos int
    DECLARE @StringLength int
    DECLARE @CumulativeElementCount int
    -- Reserve 2 characters for the delimiters wrapped around each slice.
    SELECT @TextPos = 1, @MaxLength = 8000 - 2, @CumulativeElementCount=0
    SELECT @StringLength=ISNULL(DATALENGTH(@String),0)-@MaxLength

    -- Cut full-size slices while the remaining text is too long for a single slice.
    WHILE @TextPos <= @StringLength
    BEGIN
        SELECT @Slice = SUBSTRING(@String, @TextPos, @MaxLength)
        -- Length of the slice up to, but not including, its last @Delimiter.
        SELECT @StopPos = @MaxLength - CHARINDEX(@Delimiter, REVERSE(@Slice))

        INSERT INTO @Slices (Slice, CumulativeElementCount) VALUES (@Delimiter + LEFT(@Slice, @StopPos) + @Delimiter, @CumulativeElementCount)

        SELECT @CumulativeElementCount=@CumulativeElementCount+LEN(@Slice)-LEN(REPLACE(@Slice, @Delimiter, ''))
        SELECT @TextPos = @TextPos + @StopPos + 1
    END
    -- Remaining tail; skipped when @String is NULL or empty.
    IF @StringLength>0-@MaxLength INSERT INTO @Slices (Slice, CumulativeElementCount) VALUES (@Delimiter + SUBSTRING(@String, @TextPos, @MaxLength) + @Delimiter, @CumulativeElementCount);

    INSERT INTO @T (RowPos, ColPos, PK_IntID)
    SELECT Counter1st.Pos AS RowPos, Counter2nd.Pos AS ColPos, CONVERT(int, Counter2nd.Value) AS PK_IntID
    FROM
    (
        -- First pass: split each slice on @Delimiter into row elements.
        SELECT
            PK_CountID - LEN(REPLACE(LEFT(Slices.Slice, PK_CountID-1), @Delimiter, '')) + Slices.CumulativeElementCount AS Pos,
            SUBSTRING(Slices.Slice, Counter.PK_CountID + 1, CHARINDEX(@Delimiter, Slices.Slice, Counter.PK_CountID + 1) - Counter.PK_CountID - 1) AS Value
        FROM
            dbo.Counter WITH (NOLOCK)
            JOIN @Slices AS Slices ON
                Counter.PK_CountID>0 AND Counter.PK_CountID <= LEN(Slices.Slice) - 1 AND
                SUBSTRING(Slices.Slice, Counter.PK_CountID, 1) = @Delimiter
    ) AS Counter1st
    CROSS APPLY (
        -- Second pass: split each row element on @Delimiter2 into column values.
        SELECT
            PK_CountID - LEN(REPLACE(LEFT(Counter1st.Value, PK_CountID-1), @Delimiter2, '')) AS Pos,
            SUBSTRING(Counter1st.Value+@Delimiter2, PK_CountID, CHARINDEX(@Delimiter2, Counter1st.Value+@Delimiter2, PK_CountID)-PK_CountID) AS Value
        FROM dbo.Counter WITH (NOLOCK)
        WHERE PK_CountID >0 AND PK_CountID<LEN(Counter1st.Value)+LEN(@Delimiter2) AND SUBSTRING(@Delimiter2 + Counter1st.Value + @Delimiter2, PK_CountID, 1)=@Delimiter2
    ) AS Counter2nd
    RETURN
END
GO

--*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

IF OBJECT_ID('dbo.fn_DelimitToIntTable_Big2D') IS NOT NULL DROP FUNCTION dbo.fn_DelimitToIntTable_Big2D
GO

CREATE FUNCTION dbo.fn_DelimitToIntTable_Big2D
(
    @String text,
    @Delimiter VarChar(1),
    @Delimiter2 VarChar(1)
)
RETURNS @T TABLE
(
    PK_IntID int NOT NULL
)
AS

BEGIN

    -- Slices of @String, each cut on a @Delimiter boundary so that it fits in VarChar(8000).
    DECLARE @Slices Table
    (
        Slice VarChar(8000) NOT NULL
    )

    DECLARE @Slice VarChar(8000)
    DECLARE @TextPos int
    DECLARE @MaxLength int
    DECLARE @StopPos int
    DECLARE @StringLength int
    -- Reserve 2 characters for the delimiters wrapped around each slice.
    SELECT @TextPos = 1, @MaxLength = 8000 - 2
    SELECT @StringLength=ISNULL(DATALENGTH(@String),0)-@MaxLength

    -- Cut full-size slices while the remaining text is too long for a single slice.
    WHILE @TextPos <= @StringLength
    BEGIN
        SELECT @Slice = SUBSTRING(@String, @TextPos, @MaxLength)
        -- Length of the slice up to, but not including, its last @Delimiter.
        SELECT @StopPos = @MaxLength - CHARINDEX(@Delimiter, REVERSE(@Slice))

        INSERT INTO @Slices (Slice) VALUES (@Delimiter + LEFT(@Slice, @StopPos) + @Delimiter)

        SELECT @TextPos = @TextPos + @StopPos + 1
    END
    -- Remaining tail; skipped when @String is NULL or empty.
    IF @StringLength>0-@MaxLength INSERT INTO @Slices (Slice) VALUES (@Delimiter + SUBSTRING(@String, @TextPos, @MaxLength) + @Delimiter);

    INSERT INTO @T (PK_IntID)
    SELECT CONVERT(int, Counter2nd.Value) AS PK_IntID
    FROM
    (
        -- First pass: split each slice on @Delimiter into row elements.
        SELECT
            SUBSTRING(Slices.Slice, Counter.PK_CountID + 1, CHARINDEX(@Delimiter, Slices.Slice, Counter.PK_CountID + 1) - Counter.PK_CountID - 1) AS Value
        FROM
            dbo.Counter WITH (NOLOCK)
            JOIN @Slices AS Slices ON
                Counter.PK_CountID>0 AND Counter.PK_CountID <= LEN(Slices.Slice) - 1 AND
                SUBSTRING(Slices.Slice, Counter.PK_CountID, 1) = @Delimiter
    ) AS Counter1st
    CROSS APPLY (
        -- Second pass: split each row element on @Delimiter2 into column values.
        SELECT
            SUBSTRING(Counter1st.Value+@Delimiter2, PK_CountID, CHARINDEX(@Delimiter2, Counter1st.Value+@Delimiter2, PK_CountID)-PK_CountID) AS Value
        FROM dbo.Counter WITH (NOLOCK)
        WHERE PK_CountID >0 AND PK_CountID<LEN(Counter1st.Value)+LEN(@Delimiter2) AND SUBSTRING(@Delimiter2 + Counter1st.Value + @Delimiter2, PK_CountID, 1)=@Delimiter2
    ) AS Counter2nd
    RETURN
END
GO

--*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

IF OBJECT_ID('dbo.fn_DelimitToTable_Big2D') IS NOT NULL DROP FUNCTION dbo.fn_DelimitToTable_Big2D
GO

CREATE FUNCTION dbo.fn_DelimitToTable_Big2D
(
    @String text,
    @Delimiter VarChar(1),
    @Delimiter2 VarChar(1)
)
RETURNS @T TABLE
(
    Value VarChar(8000) NOT NULL
)
AS

BEGIN

    -- Slices of @String, each cut on a @Delimiter boundary so that it fits in VarChar(8000).
    DECLARE @Slices Table
    (
        Slice VarChar(8000) NOT NULL
    )

    DECLARE @Slice VarChar(8000)
    DECLARE @TextPos int
    DECLARE @MaxLength int
    DECLARE @StopPos int
    DECLARE @StringLength int
    -- Reserve 2 characters for the delimiters wrapped around each slice.
    SELECT @TextPos = 1, @MaxLength = 8000 - 2
    SELECT @StringLength=ISNULL(DATALENGTH(@String),0)-@MaxLength

    -- Cut full-size slices while the remaining text is too long for a single slice.
    WHILE @TextPos <= @StringLength
    BEGIN
        SELECT @Slice = SUBSTRING(@String, @TextPos, @MaxLength)
        -- Length of the slice up to, but not including, its last @Delimiter.
        SELECT @StopPos = @MaxLength - CHARINDEX(@Delimiter, REVERSE(@Slice))

        INSERT INTO @Slices (Slice) VALUES (@Delimiter + LEFT(@Slice, @StopPos) + @Delimiter)

        SELECT @TextPos = @TextPos + @StopPos + 1
    END
    -- Remaining tail; skipped when @String is NULL or empty.
    IF @StringLength>0-@MaxLength INSERT INTO @Slices (Slice) VALUES (@Delimiter + SUBSTRING(@String, @TextPos, @MaxLength) + @Delimiter);

    INSERT INTO @T (Value)
    SELECT Counter2nd.Value
    FROM
    (
        -- First pass: split each slice on @Delimiter into row elements.
        SELECT
            SUBSTRING(Slices.Slice, Counter.PK_CountID + 1, CHARINDEX(@Delimiter, Slices.Slice, Counter.PK_CountID + 1) - Counter.PK_CountID - 1) AS Value
        FROM
            dbo.Counter WITH (NOLOCK)
            JOIN @Slices AS Slices ON
                Counter.PK_CountID>0 AND Counter.PK_CountID <= LEN(Slices.Slice) - 1 AND
                SUBSTRING(Slices.Slice, Counter.PK_CountID, 1) = @Delimiter
    ) AS Counter1st
    CROSS APPLY (
        -- Second pass: split each row element on @Delimiter2 into column values.
        SELECT
            SUBSTRING(Counter1st.Value+@Delimiter2, PK_CountID, CHARINDEX(@Delimiter2, Counter1st.Value+@Delimiter2, PK_CountID)-PK_CountID) AS Value
        FROM dbo.Counter WITH (NOLOCK)
        WHERE PK_CountID >0 AND PK_CountID<LEN(Counter1st.Value)+LEN(@Delimiter2) AND SUBSTRING(@Delimiter2 + Counter1st.Value + @Delimiter2, PK_CountID, 1)=@Delimiter2
    ) AS Counter2nd
    RETURN
END
GO

--*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
