June 7, 2024 at 10:31 pm
Here's the code I'm running and the results I'm seeing. The @string value was copy/pasted. Thanks btw to Jeff for the splitter code.
drop FUNCTION if exists [dbo].[DelimitedSplitN4K];
go
CREATE FUNCTION [dbo].[DelimitedSplitN4K]
--===== Define I/O parameters
(@pString NVARCHAR(4000), @pDelimiter NCHAR(1))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 0 up to 10,000...
-- enough to cover NVARCHAR(4000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT TOP (ISNULL(DATALENGTH(@pString)/2,0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
SELECT s.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,4000)
FROM cteStart s
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = SUBSTRING(@pString, l.N1, l.L1)
FROM cteLen l
;
go
DECLARE @string NVARCHAR(MAX) = N'"T1","T2","T3","T4","T5" "000055555XYZ",00001,"","","BLUEF, NA"';
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT 0)) As RN
,ItemNumber_A = A.ItemNumber, Item_A = A.Item
,ItemNumber_B = B.ItemNumber,Item_B = B.Item
FROM dbo.DelimitedSplitN4K(REPLACE(REPLACE(REPLACE(@string,char(10),N'|'),char(13),N'|'),N' ',N'|'), '|') AS A
CROSS APPLY dbo.DelimitedSplitN4K(REPLACE(REPLACE(REPLACE(REPLACE(A.Item,N'","',N'|'),N',"',N'|'), N'",', N'|'), N'"', N''), N'|') AS B;
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
June 8, 2024 at 12:01 pm
Bump for visibility
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
June 8, 2024 at 12:26 pm
The problem in Steve's solution is that is doesn't account for possible CR/LF data in the last field. In your data - there is a CR\LF after the comma in the last field.
So - what you have is "BLUEF," + CHAR(13) + CHAR(10) + "NA" - which displays as "BLUEF, NA" where that space is actually the CR\LF characters.
Depending on your data - you may need to modify the replace statements to include separate replacements for CHAR(13), CHAR(10) and CHAR(13)+CHAR(10).
Ok looked into this more and it seems the issue could stem from the code as originally posted. The 3rd (outermost) REPLACE function replaces a double space (not single) with a |. Because there aren't any double spaces the code does not replace (and put a | where it would subsequently be split). If that replacement search term is changed to a single space then the single space in "BLUEF, NA" is replaced with a | and gets split.
copy/paste line (the way I'm running it)
FROM dbo.DelimitedSplitN4K(REPLACE(REPLACE(REPLACE(@string,char(10),N'|'),char(13),N'|'),N' ',N'|'), '|') AS A
Change the double space to a single space (like this) and the splitter returns 3 rows and not 2
FROM dbo.DelimitedSplitN4K(REPLACE(REPLACE(REPLACE(@string,char(10),N'|'),char(13),N'|'),N' ',N'|'), '|') AS A
The code as posted doesn't replace the combination of char(10)+char(13) with a | (which would get split). Sorry for all of the double negatives. It makes this slippery to write about
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
Viewing 3 posts - 16 through 17 (of 17 total)
You must be logged in to reply to this topic. Login to reply