October 14, 2016 at 4:58 am
I've a string as follows:
45 | 00055 | 9/30/2016 | Vodafone | Jack Morris | Newyork | Test Msg | TBL101 | PC | 1.00 | COMP101 | CS | 1.00.............. etc
I am expecting to get it as follows:
String1 - 45 | 00055 | 9/30/2016 | Vodafone | Jack Morris | Newyork | Test Msg
String2 - TBL101 | PC | 1.00 | COMP101 | CS | 1.00......... etc
So what I am trying to do is to split the string into two string on 7th delimiter and tried the following but it works only for string1:
DECLARE @STR VARCHAR(1000),
@str1 VARCHAR(1000),
@str2 VARCHAR(1000)
SET @STR ='45 | 00055 | 9/30/2016 | Vodafone | Jack Morris | Newyork | Test Msg | TBL101 | PC | 1.00 | COMP101 | CS | 1.00.............. etc'
/****I was actually following a tutorial on doing this but stuck on this statements. Would expect some explanation on this - Starts****/
--Creating a number table
;WITH numcte AS(
SELECT 1 AS rn UNION ALL SELECT rn + 1 FROM numcte WHERE rn <= LEN(@str)),
--Get the position of the "|" charecters
GetDelimitedCharPos AS(
SELECT ROW_NUMBER() OVER(ORDER BY getdate()) seq, rn, delimitedcharpos
FROM numcte
CROSS APPLY(SELECT SUBSTRING(@str, rn, 1)delimitedcharpos) m WHERE delimitedcharpos = '|')
/****I was actually following a tutorial on doing this but stuck on this statements. Would expect some explanation on this - Ends****/
--Applying the formula SUBSTRING(@str, startseq + 1, endseq-startseq + 1)
SELECT TOP 1
(CASE WHEN SUBSTRING(
,(SELECT TOP 1 rn + 1 FROM GetDelimitedCharPos WHERE seq = 6)
,(SELECT TOP 1 rn FROM GetDelimitedCharPos WHERE seq = 7 ) -
(SELECT TOP 1 rn + 1 FROM GetDelimitedCharPos WHERE seq = 6)
) = SUBSTRING(
,(SELECT TOP 1 rn + 1 FROM GetDelimitedCharPos WHERE seq = 6)
,(SELECT TOP 1 rn FROM GetDelimitedCharPos WHERE seq = 7 ) -
(SELECT TOP 1 rn + 1 FROM GetDelimitedCharPos WHERE seq = 6)
) THEN SUBSTRING(
,(SELECT TOP 1 rn + 1 FROM GetDelimitedCharPos WHERE seq = 1)
,(SELECT TOP 1 rn FROM GetDelimitedCharPos WHERE seq = 7 ) -
(SELECT TOP 1 rn + 1 FROM GetDelimitedCharPos WHERE seq = 1)
) ELSE '' END) AS String1
FROM GetDelimitedCharPos
I would expect some ideas from the experts to implement it perfectly. Thanks.
October 14, 2016 at 7:03 am
Here's a simple option using the DelimitedSplit8k that can be found here along with its explanation: http://www.sqlservercentral.com/articles/Tally+Table/72993/
A specific function could be made to split only on the 7th delimiter and avoid having to concatenate the strings again.
DECLARE @STR VARCHAR(1000),
@str1 VARCHAR(1000),
@str2 VARCHAR(1000)
SET @STR ='45 | 00055 | 9/30/2016 | Vodafone | Randy Singh | Newyork | Test Msg | TBL101 | PC | 1.00 | COMP101 | CS | 1.00.............. etc'
SELECT s.Item, s.ItemNumber, (s.ItemNumber - 1) / 7 AS ItemGroup
INTO #Splitted
FROM dbo.DelimitedSplit8K( @STR, '|')s;
SELECT STUFF(( SELECT '| ' + i.Item
FROM #Splitted i
WHERE i.ItemGroup = s.ItemGroup
ORDER BY i.ItemNumber
FOR XML PATH(''), TYPE).value('./text()[1]', 'varchar(max)'), 1, 1, '')
FROM #Splitted s
GROUP BY ItemGroup;
DROP TABLE #Splitted;
October 14, 2016 at 7:14 am
Luis is a master at showing set based solutions and if you plan on using SQL for any length of time, I would understand why those solutions are preferable over something like this....
DECLARE @STR VARCHAR(1000),
@str1 VARCHAR(1000),
@str2 VARCHAR(1000),
@pos INT,
@counter INT
SET @STR ='45 | 00055 | 9/30/2016 | Vodafone | Randy Singh | Newyork | Test Msg | TBL101 | PC | 1.00 | COMP101 | CS | 1.00.............. etc'
SET @counter=0
SET @pos=0
WHILE @counter <=7
BEGIN
SET @pos=CHARINDEX('|',@str,@pos+1)
set @counter=@counter+1
END
SET @str1=SUBSTRING(@str,1,@pos)
SET @str2=SUBSTRING(@str,@pos+1,LEN(@str)-@pos)
Print 'str1='+ @str1
Print 'str2='+ @str2
The solution I posted is the epitome of a RBAR type solution.....(looping through data to get to the final solution)
October 14, 2016 at 8:32 am
Luis Cazares (10/14/2016)
Here's a simple option using the DelimitedSplit8k that can be found here along with its explanation: http://www.sqlservercentral.com/articles/Tally+Table/72993/A specific function could be made to split only on the 7th delimiter and avoid having to concatenate the strings again.
DECLARE @STR VARCHAR(1000),
@str1 VARCHAR(1000),
@str2 VARCHAR(1000)
SET @STR ='45 | 00055 | 9/30/2016 | Vodafone | Randy Singh | Newyork | Test Msg | TBL101 | PC | 1.00 | COMP101 | CS | 1.00.............. etc'
SELECT s.Item, s.ItemNumber, (s.ItemNumber - 1) / 7 AS ItemGroup
INTO #Splitted
FROM dbo.DelimitedSplit8K( @STR, '|')s;
SELECT STUFF(( SELECT '| ' + i.Item
FROM #Splitted i
WHERE i.ItemGroup = s.ItemGroup
ORDER BY i.ItemNumber
FOR XML PATH(''), TYPE).value('./text()[1]', 'varchar(max)'), 1, 1, '')
FROM #Splitted s
GROUP BY ItemGroup;
DROP TABLE #Splitted;
Please forgive me if I'm misunderstanding the requirement and for not coding an example (I'm on vacation without a PC).
I don't think a splitter or concatenation is required here. Why not just use a tally table and row_number() to identify the location of the 7th delimiter: let's call that value X. Then for string1:
SUBSTRING(yourstring, 1, X-1)
For string2:
SUBSTRING(yourstring, X+1, 8000)
-- Itzik Ben-Gan 2001
October 14, 2016 at 8:41 am
Here are 2 modified versions of the DelimitedSplit8k function. The second one will only work on 2012 and more recent versions.
CREATE FUNCTION [dbo].[DelimitedSplit8KNOccurrence](@pString [varchar](8000), @pDelimiter [char](1), @pOccurrence int)
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 0 up to 10,000...
-- enough to cover NVARCHAR(4000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1,G1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1, 0 UNION ALL
SELECT t.N+1 , ROW_NUMBER() OVER(ORDER BY N)
FROM cteTally t
WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
SELECT s1.N1, ISNULL( s2.N1 - s1.N1 - 1, 8000)
FROM cteStart s1
LEFT JOIN cteStart s2 ON s1.G1/@pOccurrence = s2.G1/@pOccurrence - 1 AND s2.G1 % @pOccurrence = 0
WHERE s1.G1 % @pOccurrence = 0
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = SUBSTRING(@pString, l.N1, l.L1)
FROM cteLen l
;
GO
CREATE FUNCTION [dbo].[DelimitedSplit8KLeadNOccurrence](@pString [varchar](8000), @pDelimiter [char](1), @pOccurrence int)
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "zero base" and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT 0 UNION ALL
SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1, G1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT t.N+1, ROW_NUMBER() OVER(ORDER BY N) % @pOccurrence
FROM cteTally t
WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY s.N1),
Item = SUBSTRING(@pString,s.N1,ISNULL(NULLIF((LEAD(s.N1,1,1) OVER (ORDER BY s.N1) - 1),0)-s.N1,8000))
FROM cteStart s
WHERE G1 = 1;
GO
With those, the final code would be simplified to a single statement.
DECLARE @STR VARCHAR(1000)
SET @STR ='45 | 00055 | 9/30/2016 | Vodafone | Randy Singh | Newyork | Test Msg | TBL101 | PC | 1.00 | COMP101 | CS | 1.00.............. etc'
SELECT *
FROM dbo.DelimitedSplit8KNOccurrence( @STR, '|', 7)s;
I'm not able to claim any performance improvements, because I haven't test both approaches for performance. That doesn't mean that there won't be, it only means that you need to test them.
October 14, 2016 at 8:48 am
Alan.B (10/14/2016)
Luis Cazares (10/14/2016)
Here's a simple option using the DelimitedSplit8k that can be found here along with its explanation: http://www.sqlservercentral.com/articles/Tally+Table/72993/A specific function could be made to split only on the 7th delimiter and avoid having to concatenate the strings again.
DECLARE @STR VARCHAR(1000),
@str1 VARCHAR(1000),
@str2 VARCHAR(1000)
SET @STR ='45 | 00055 | 9/30/2016 | Vodafone | Randy Singh | Newyork | Test Msg | TBL101 | PC | 1.00 | COMP101 | CS | 1.00.............. etc'
SELECT s.Item, s.ItemNumber, (s.ItemNumber - 1) / 7 AS ItemGroup
INTO #Splitted
FROM dbo.DelimitedSplit8K( @STR, '|')s;
SELECT STUFF(( SELECT '| ' + i.Item
FROM #Splitted i
WHERE i.ItemGroup = s.ItemGroup
ORDER BY i.ItemNumber
FOR XML PATH(''), TYPE).value('./text()[1]', 'varchar(max)'), 1, 1, '')
FROM #Splitted s
GROUP BY ItemGroup;
DROP TABLE #Splitted;
Please forgive me if I'm misunderstanding the requirement and for not coding an example (I'm on vacation without a PC).
I don't think a splitter or concatenation is required here. Why not just use a tally table and row_number() to identify the location of the 7th delimiter: let's call that value X. Then for string1:
SUBSTRING(yourstring, 1, X-1)
For string2:
SUBSTRING(yourstring, X+1, 8000)
That would work when there are 2 values. I'm not sure if that would always be the case.
October 14, 2016 at 12:02 pm
Here's a simple, brute-force -- but still no looping -- method, if you prefer that.
SELECT
string AS original_string,
CASE WHEN are_enough_delims_present = 0 THEN string
ELSE LEFT(string, string_delim_7 - 1) END AS string1,
CASE WHEN are_enough_delims_present = 0 THEN ''
ELSE LTRIM(SUBSTRING(string, string_delim_7 + 1, 8000)) END AS string2
FROM (
VALUES('45 | 00055 | 9/30/2016 | Vodafone | Jack Morris | Newyork | Test Msg | TBL101 | PC | 1.00 | COMP101 | CS | 1.00.............. etc'),
('46 | xxx | not | enough | delims')
) AS test_data(string)
CROSS APPLY (
SELECT CAST(CASE WHEN string LIKE '%|%|%|%|%|%|%|%'
THEN 1 ELSE 0 END AS bit) AS are_enough_delims_present
) AS ca1
OUTER APPLY (
SELECT CHARINDEX('|', string) AS string_delim_1
WHERE are_enough_delims_present = 1
) AS ca2
OUTER APPLY (
SELECT CHARINDEX('|', string, string_delim_1 + 1) AS string_delim_2
WHERE are_enough_delims_present = 1
) AS ca3
OUTER APPLY (
SELECT CHARINDEX('|', string, string_delim_2 + 1) AS string_delim_3
WHERE are_enough_delims_present = 1
) AS ca4
OUTER APPLY (
SELECT CHARINDEX('|', string, string_delim_3 + 1) AS string_delim_4
WHERE are_enough_delims_present = 1
) AS ca5
OUTER APPLY (
SELECT CHARINDEX('|', string, string_delim_4 + 1) AS string_delim_5
WHERE are_enough_delims_present = 1
) AS ca6
OUTER APPLY (
SELECT CHARINDEX('|', string, string_delim_5 + 1) AS string_delim_6
WHERE are_enough_delims_present = 1
) AS ca7
OUTER APPLY (
SELECT CHARINDEX('|', string, string_delim_6 + 1) AS string_delim_7
WHERE are_enough_delims_present = 1
) AS ca8
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
October 15, 2016 at 6:15 am
ScottPletcher (10/14/2016)
Here's a simple, brute-force -- but still no looping -- method, if you prefer that.
Not very efficient with 7 x ( source scan -> Filter -> Compute Scalar -> Nested Loops Join ) in the execution plan, better not use this on a large set.
😎
Here is a more efficient alternative, beats the c..p out of the string splitting functions
USE TEEST;
GO
SET NOCOUNT ON;
;WITH SAMPLE_DATA AS
(
SELECT (SMPL_STR) FROM
(
VALUES('45 | 00055 | 9/30/2016 | Vodafone | Jack Morris | Newyork | Test Msg | TBL101 | PC | 1.00 | COMP101 | CS | 1.00.............. etc')
,('46 | xxx | not | enough | delims')
) AS X(SMPL_STR)
)
,DELIMITER_COUNT(DLMCNT,SMPL_STR) AS
(
SELECT
LEN(SD.SMPL_STR) - LEN(REPLACE(SD.SMPL_STR,CHAR(124),'')) AS DLMCNT
,SD.SMPL_STR AS SMPL_STR
FROM SAMPLE_DATA SD
)
SELECT
DC.SMPL_STR
,CASE
WHEN DC.DLMCNT > 6 THEN SUBSTRING(DC.SMPL_STR,1,CI7.CI - 1)
ELSE DC.SMPL_STR
END AS PART_ONE
,CASE
WHEN DC.DLMCNT > 6 THEN SUBSTRING(DC.SMPL_STR,CI7.CI + 1,LEN(DC.SMPL_STR))
ELSE NULL
END PART_TWO
FROM DELIMITER_COUNT DC
CROSS APPLY (SELECT CHARINDEX(CHAR(124),DC.SMPL_STR, 1) ) CI1(CI)
CROSS APPLY (SELECT CHARINDEX(CHAR(124),DC.SMPL_STR,CI1.CI + 1) ) CI2(CI)
CROSS APPLY (SELECT CHARINDEX(CHAR(124),DC.SMPL_STR,CI2.CI + 1) ) CI3(CI)
CROSS APPLY (SELECT CHARINDEX(CHAR(124),DC.SMPL_STR,CI3.CI + 1) ) CI4(CI)
CROSS APPLY (SELECT CHARINDEX(CHAR(124),DC.SMPL_STR,CI4.CI + 1) ) CI5(CI)
CROSS APPLY (SELECT CHARINDEX(CHAR(124),DC.SMPL_STR,CI5.CI + 1) ) CI6(CI)
CROSS APPLY (SELECT CHARINDEX(CHAR(124),DC.SMPL_STR,CI6.CI + 1) ) CI7(CI)
;
Output
SMPL_STR PART_ONE PART_TWO
--------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------- -------------------------------------------------------------
45 | 00055 | 9/30/2016 | Vodafone | Jack Morris | Newyork | Test Msg | TBL101 | PC | 1.00 | COMP101 | CS | 1.00.............. etc 45 | 00055 | 9/30/2016 | Vodafone | Jack Morris | Newyork | Test Msg TBL101 | PC | 1.00 | COMP101 | CS | 1.00.............. etc
46 | xxx | not | enough | delims 46 | xxx | not | enough | delims NULL
Edit: Typo
October 16, 2016 at 11:29 pm
Eirikur Eiriksson (10/15/2016)
Here is a more efficient alternative, beats the c..p out of the string splitting functions
VERY cool concept but, in this case, it appears to be coming up with an incorrect result set. Yes, I absolutely agree that it has successfully split out the two parts correctly but it's spitting them out on the same row. The requirement, if I'm reading the original post correctly, is that each of the parts should reside on a separate row.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 17, 2016 at 12:35 am
Jeff Moden (10/16/2016)
Eirikur Eiriksson (10/15/2016)
Here is a more efficient alternative, beats the c..p out of the string splitting functionsVERY cool concept but, in this case, it appears to be coming up with an incorrect result set. Yes, I absolutely agree that it has successfully split out the two parts correctly but it's spitting them out on the same row. The requirement, if I'm reading the original post correctly, is that each of the parts should reside on a separate row.
Good catch Jeff, thanks. Here is a corrected version to fit the OP's requirements.
😎
USE TEEST;
GO
SET NOCOUNT ON;
;WITH SAMPLE_DATA AS
(
SELECT (SMPL_STR) FROM
(
VALUES('45 | 00055 | 9/30/2016 | Vodafone | Jack Morris | Newyork | Test Msg | TBL101 | PC | 1.00 | COMP101 | CS | 1.00.............. etc')
,('46 | xxx | not | enough | delims')
) AS X(SMPL_STR)
)
,DELIMITER_COUNT(DLMCNT,SMPL_STR) AS
(
SELECT
LEN(SD.SMPL_STR) - LEN(REPLACE(SD.SMPL_STR,CHAR(124),'')) AS DLMCNT
,SD.SMPL_STR AS SMPL_STR
FROM SAMPLE_DATA SD
)
,CHOPPED_STRINGS AS
(
SELECT
CASE
WHEN DC.DLMCNT > 6 THEN SUBSTRING(DC.SMPL_STR,1,CI7.CI - 1)
ELSE DC.SMPL_STR
END AS PART_ONE
,CASE
WHEN DC.DLMCNT > 6 THEN SUBSTRING(DC.SMPL_STR,CI7.CI + 1,LEN(DC.SMPL_STR))
ELSE NULL
END PART_TWO
FROM DELIMITER_COUNT DC
CROSS APPLY (SELECT CHARINDEX(CHAR(124),DC.SMPL_STR, 1) ) CI1(CI)
CROSS APPLY (SELECT CHARINDEX(CHAR(124),DC.SMPL_STR,CI1.CI + 1) ) CI2(CI)
CROSS APPLY (SELECT CHARINDEX(CHAR(124),DC.SMPL_STR,CI2.CI + 1) ) CI3(CI)
CROSS APPLY (SELECT CHARINDEX(CHAR(124),DC.SMPL_STR,CI3.CI + 1) ) CI4(CI)
CROSS APPLY (SELECT CHARINDEX(CHAR(124),DC.SMPL_STR,CI4.CI + 1) ) CI5(CI)
CROSS APPLY (SELECT CHARINDEX(CHAR(124),DC.SMPL_STR,CI5.CI + 1) ) CI6(CI)
CROSS APPLY (SELECT CHARINDEX(CHAR(124),DC.SMPL_STR,CI6.CI + 1) ) CI7(CI)
)
SELECT
STRING_PARTS.PART
FROM CHOPPED_STRINGS
CROSS APPLY
(
SELECT PART_ONE UNION ALL
SELECT PART_TWO
) STRING_PARTS(PART)
;
Output
PART
---------------------------------------------------------------------
45 | 00055 | 9/30/2016 | Vodafone | Jack Morris | Newyork | Test Msg
TBL101 | PC | 1.00 | COMP101 | CS | 1.00.............. etc
46 | xxx | not | enough | delims
NULL
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply