June 19, 2013 at 10:21 am
Hi,
Need your help in accomplishing this one. We run on SQL Server 2008. I've a string that can be upto 200 characters max and after every 32 characters a pile (|) got to be inserted. if that 32 characters come in midword, then that pipe has to be placed in the beginning of that word, not in the middle of the word.
Example:
Original String: ABCDEFGH IJKLMNOPQRS TUVWXYZ ABCDEFGHIJKLMN OPQRSTUVWXYZA BCDEFGHIJKLMNOP
Want this way: ABCDEFGH IJKLMNOPQRS TUVWXYZ |ABCDEFGHIJKLMN OPQRSTUVWXYZA |BCDEFGHIJKLMNOP
since these strings are in a table.column and this need to operate on the entire data set (about 200k records), performance is also important.
thanks in advance..
June 19, 2013 at 3:12 pm
So are you talking about updating the table once-off, or whenever a client app retrieves the data? Plus what if the previous "word" is less than 32 characters long and a pipe has been inserted, what do you do?
June 19, 2013 at 3:41 pm
Thanks for your response. Its kind of when the client retrieves the data.
again, the 32 character count should start from where the pipe was placed.
June 20, 2013 at 3:45 am
I'd strongly suggest you do this on the client side. SQL isn't geared towards manipulating strings.
June 20, 2013 at 8:53 am
Venkata-433090 (6/19/2013)
Thanks for your response. Its kind of when the client retrieves the data.again, the 32 character count should start from where the pipe was placed.
Shouldn't take too much effort to tailor this to your requirements:
-- sample data
;WITH SampleData AS (
SELECT
MyString = RTRIM(STUFF(STUFF(MyString,32,0,'|'),64,0,'|'))
FROM (
SELECT MyString = CAST('ABCDEFGH IJKLMNOPQRS TUVWXYZ ABCDEFGHIJKLMN OPQRSTUVWXYZA BCDEFGHIJKLMNOP' AS VARCHAR(200))
) d
)
-- solution
SELECT
MyString,
NewString =
STUFF(
STUFF(
STUFF(
STUFF(
STUFF(
STUFF(
CleanString
,Pos1,0,CASE WHEN Pos1 > 1 THEN '|' ELSE '' END)
,Pos2,0,CASE WHEN Pos2 > 1 THEN '|' ELSE '' END)
,Pos3,0,CASE WHEN Pos3 > 1 THEN '|' ELSE '' END)
,Pos4,0,CASE WHEN Pos4 > 1 THEN '|' ELSE '' END)
,Pos5,0,CASE WHEN Pos5 > 1 THEN '|' ELSE '' END)
,Pos6,0,CASE WHEN Pos6 > 1 THEN '|' ELSE '' END)
FROM SampleData
CROSS APPLY (SELECT CleanString = REPLACE(MyString,'|','')) y
CROSS APPLY ( -- get the position of the last ' ' in each 32 character 'line'
SELECT
Pos1 = MAX(CASE WHEN line = 1 THEN LastSpaceInLine+1 ELSE 1 END),
Pos2 = MAX(CASE WHEN line = 2 THEN LastSpaceInLine+1 ELSE 1 END),
Pos3 = MAX(CASE WHEN line = 3 THEN LastSpaceInLine+1 ELSE 1 END),
Pos4 = MAX(CASE WHEN line = 4 THEN LastSpaceInLine+1 ELSE 1 END),
Pos5 = MAX(CASE WHEN line = 5 THEN LastSpaceInLine+1 ELSE 1 END),
Pos6 = MAX(CASE WHEN line = 6 THEN LastSpaceInLine+1 ELSE 1 END)
FROM (
SELECT
x.line,
LastSpaceInLine = MAX(n)
FROM dbo.InlineTally (200) t
CROSS APPLY (SELECT line = 1+n/32) x
WHERE n <= LEN(MyString)
AND SUBSTRING(MyString,t.n,1) = ' '
GROUP BY x.line
) d
) x1
-- Function dbo.InlineTally
CREATE FUNCTION [dbo].[InlineTally]
(@RowCount INT)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
(
WITH E1(N) AS ( -- 10 rows
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
),
E2(N) AS (SELECT 1 FROM E1 a, E1 b, E1 c), -- 10 x 10 x 10 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b) --1000 x 1000 rows max
SELECT TOP (@RowCount) n = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
June 21, 2013 at 2:08 am
Now why does this InlineTally function look so familiar? π
June 21, 2013 at 2:31 am
Jan Van der Eecken (6/21/2013)
Now why does this InlineTally function look so familiar? π
Same reason as Feynman diagrams are now so familiar, Jan π
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
June 21, 2013 at 10:22 am
June 24, 2013 at 2:54 pm
Thanks you all guys for your responses. I'll implement this and let you know..
June 24, 2013 at 4:32 pm
Venkata-433090 (6/24/2013)
Thanks you all guys for your responses. I'll implement this and let you know..
My question would be, what do you want to do with "words" that are more than 32 characters. Heh... don't say it won't happen.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 24, 2013 at 4:37 pm
ChrisM@Work (6/20/2013)
Venkata-433090 (6/19/2013)
Thanks for your response. Its kind of when the client retrieves the data.again, the 32 character count should start from where the pipe was placed.
Shouldn't take too much effort to tailor this to your requirements:
-- sample data
;WITH SampleData AS (
SELECT
MyString = RTRIM(STUFF(STUFF(MyString,32,0,'|'),64,0,'|'))
FROM (
SELECT MyString = CAST('ABCDEFGH IJKLMNOPQRS TUVWXYZ ABCDEFGHIJKLMN OPQRSTUVWXYZA BCDEFGHIJKLMNOP' AS VARCHAR(200))
) d
)
-- solution
SELECT
MyString,
NewString =
STUFF(
STUFF(
STUFF(
STUFF(
STUFF(
STUFF(
CleanString
,Pos1,0,CASE WHEN Pos1 > 1 THEN '|' ELSE '' END)
,Pos2,0,CASE WHEN Pos2 > 1 THEN '|' ELSE '' END)
,Pos3,0,CASE WHEN Pos3 > 1 THEN '|' ELSE '' END)
,Pos4,0,CASE WHEN Pos4 > 1 THEN '|' ELSE '' END)
,Pos5,0,CASE WHEN Pos5 > 1 THEN '|' ELSE '' END)
,Pos6,0,CASE WHEN Pos6 > 1 THEN '|' ELSE '' END)
FROM SampleData
CROSS APPLY (SELECT CleanString = REPLACE(MyString,'|','')) y
CROSS APPLY ( -- get the position of the last ' ' in each 32 character 'line'
SELECT
Pos1 = MAX(CASE WHEN line = 1 THEN LastSpaceInLine+1 ELSE 1 END),
Pos2 = MAX(CASE WHEN line = 2 THEN LastSpaceInLine+1 ELSE 1 END),
Pos3 = MAX(CASE WHEN line = 3 THEN LastSpaceInLine+1 ELSE 1 END),
Pos4 = MAX(CASE WHEN line = 4 THEN LastSpaceInLine+1 ELSE 1 END),
Pos5 = MAX(CASE WHEN line = 5 THEN LastSpaceInLine+1 ELSE 1 END),
Pos6 = MAX(CASE WHEN line = 6 THEN LastSpaceInLine+1 ELSE 1 END)
FROM (
SELECT
x.line,
LastSpaceInLine = MAX(n)
FROM dbo.InlineTally (200) t
CROSS APPLY (SELECT line = 1+n/32) x
WHERE n <= LEN(MyString)
AND SUBSTRING(MyString,t.n,1) = ' '
GROUP BY x.line
) d
) x1
-- Function dbo.InlineTally
CREATE FUNCTION [dbo].[InlineTally]
(@RowCount INT)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
(
WITH E1(N) AS ( -- 10 rows
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
),
E2(N) AS (SELECT 1 FROM E1 a, E1 b, E1 c), -- 10 x 10 x 10 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b) --1000 x 1000 rows max
SELECT TOP (@RowCount) n = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
Very cool but try it with the following replacement and see what happens with "zz".
SELECT MyString = CAST('aa bb cc dd ee ff gg hh ii jj kk ll mm nn oo pp qq rr ss tt uu vv ww xx yy zz' AS VARCHAR(200))
--Jeff Moden
Change is inevitable... Change for the better is not.
June 25, 2013 at 1:54 pm
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
Create FUNCTION String_Insert ( @String VARCHAR(MAX) )
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @i INT= 0
DECLARE @x INT= 0
DECLARE @max-2 INT = LEN(LTRIM(RTRIM(@String)))
DECLARE @REV VARCHAR(32)
SET @i = 32;
WHILE @i <= @max-2
BEGIN
IF SUBSTRING(@String, @i, 1) = ' '
BEGIN
SET @String = STUFF(@String, @i, 0, '|')
END
ELSE
BEGIN
SET @x = CHARINDEX(' ',
REVERSE(SUBSTRING(@String, 1, @i)))
SET @i = @i - @x + 2;
SET @String = STUFF(@String, @i, 0, '|')
END
SET @i = @i + 32;
END
-- Return the result of the function
RETURN @String
END
GO
Here I created a temp table and tested on it.
IF object_ID('tempDB..#temp') IS NOT NULL
DROP TABLE #temp
CREATE TABLE #temp
(
String VARCHAR(200)
)
INSERT #temp (String) VALUES ( 'ABCDEFGH IJKLMNOPQRS TUVWXYZ ABCDEFGHIJKLMN OPQRSTUVWXYZA BCDEFGHIJKLMNOP'
),('ABCDEFGH IJKLMNOPQRS TUVWXYZ ABCDEFGHIJKLMN OPQRSTUVWXYZAB CDEFGHIJKLMNOP')
SELECT String_Insert(String) FROM #temp
Thanks to my friend Lakshmi who helped me in writing the code.This is one way of doing it! Hope it works.
June 25, 2013 at 1:56 pm
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
Create FUNCTION String_Insert ( @String VARCHAR(MAX) )
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @i INT= 0
DECLARE @x INT= 0
DECLARE @max-2 INT = LEN(LTRIM(RTRIM(@String)))
DECLARE @REV VARCHAR(32)
SET @i = 32;
WHILE @i <= @max-2
BEGIN
IF SUBSTRING(@String, @i, 1) = ' '
BEGIN
SET @String = STUFF(@String, @i, 0, '|')
END
ELSE
BEGIN
SET @x = CHARINDEX(' ',
REVERSE(SUBSTRING(@String, 1, @i)))
SET @i = @i - @x + 2;
SET @String = STUFF(@String, @i, 0, '|')
END
SET @i = @i + 32;
END
-- Return the result of the function
RETURN @String
END
GO
Here I created a temp table and tested on it.
IF object_ID('tempDB..#temp') IS NOT NULL
DROP TABLE #temp
CREATE TABLE #temp
(
String VARCHAR(200)
)
INSERT #temp (String) VALUES ( 'ABCDEFGH IJKLMNOPQRS TUVWXYZ ABCDEFGHIJKLMN OPQRSTUVWXYZA BCDEFGHIJKLMNOP'
),('ABCDEFGH IJKLMNOPQRS TUVWXYZ ABCDEFGHIJKLMN OPQRSTUVWXYZAB CDEFGHIJKLMNOP')
SELECT String_Insert(String) FROM #temp
This is one way of doing it! Hope it works.
June 25, 2013 at 5:56 pm
Here is an iTVF for it...
CREATE FUNCTION SplitWithDelimiter (@MyString VARCHAR(200), @MaxLength INT, @Delimiter CHAR(1), @Breakers varchar(256))
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
SELECT substring(base.value, 1, Pos1) + @Delimiter + substring(base.value, Pos1 + 1, Pos2 - Pos1) + @Delimiter + substring(base.value, Pos2 + 1, Pos3 - Pos2) + @Delimiter + substring(base.value, Pos3 + 1, Pos4 - Pos3) + @Delimiter + substring(base.value, Pos4 + 1, Pos5 - Pos4) + @Delimiter + substring(base.value, Pos5 + 1, Pos6 - Pos5) + @Delimiter + substring(base.value, Pos6 + 1, Pos7 - Pos6) AS result
FROM (
VALUES (@MyString)
) AS base(value)
CROSS APPLY (
SELECT ISNULL(NULLIF((@MaxLength + 1) - PATINDEX('%'+@Breakers+'%', REVERSE(SUBSTRING(base.value, 1, @MaxLength))), (@MaxLength + 1)), @MaxLength)
) a(Pos1)
CROSS APPLY (
SELECT ISNULL(NULLIF((@MaxLength + 1) - PATINDEX('%'+@Breakers+'%', REVERSE(SUBSTRING(base.value, Pos1 + 1, @MaxLength))), (@MaxLength + 1)), @MaxLength) + Pos1
) b(Pos2)
CROSS APPLY (
SELECT ISNULL(NULLIF((@MaxLength + 1) - PATINDEX('%'+@Breakers+'%', REVERSE(SUBSTRING(base.value, Pos2 + 1, @MaxLength))), (@MaxLength + 1)), @MaxLength) + Pos2
) c(Pos3)
CROSS APPLY (
SELECT ISNULL(NULLIF((@MaxLength + 1) - PATINDEX('%'+@Breakers+'%', REVERSE(SUBSTRING(base.value, Pos3 + 1, @MaxLength))), (@MaxLength + 1)), @MaxLength) + Pos3
) d(Pos4)
CROSS APPLY (
SELECT ISNULL(NULLIF((@MaxLength + 1) - PATINDEX('%'+@Breakers+'%', REVERSE(SUBSTRING(base.value, Pos4 + 1, @MaxLength))), (@MaxLength + 1)), @MaxLength) + Pos4
) e(Pos5)
CROSS APPLY (
SELECT ISNULL(NULLIF((@MaxLength + 1) - PATINDEX('%'+@Breakers+'%', REVERSE(SUBSTRING(base.value, Pos5 + 1, @MaxLength))), (@MaxLength + 1)), @MaxLength) + Pos5
) f(Pos6)
CROSS APPLY (
SELECT ISNULL(NULLIF((@MaxLength + 1) - PATINDEX('%'+@Breakers+'%', REVERSE(SUBSTRING(base.value, Pos6 + 1, @MaxLength))), (@MaxLength + 1)), @MaxLength) + Pos6
) g(Pos7)
I don't claim it is the quickest, but it is quite flexible:
Parameters:
@MyString : The data to be manipulated
@MaxLength : The maximum length of each new "chunk" of data
@Delimiter : The character used to seperate the new "chunks"
@Breakers : Pattern used to find a suitable break in the data - can be as simple as ' ' or more complicated like '[ .,?!:;)]'
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
June 25, 2013 at 8:46 pm
Here's a recursive CTE method as a iTVF that could work.
I would like to come up with a none recursive method, but brain fade is setting in and I'm not seeing a way to do it currently :crying:
One thing to note is that it will break if there is a word longer than the @maxlen specified. The character at that point will be replaced.
CREATE FUNCTION splitAt(@str varchar(8000), @maxlen int, @splitOn varchar(1), @replaceWith varchar(1)) RETURNS TABLE AS
RETURN
WITH rcte as (
SELECT substring(@str,0,@maxlen - charindex(@spliton,reverse(substring(@str,0,@maxlen)))) s,
substring(@str,@maxlen - charindex(@spliton,reverse(substring(@str,0,@maxlen))) + 1,8000) r
UNION ALL
SELECT substring(r,0,@maxlen - charindex(@spliton,reverse(substring(r,0,@maxlen)))) s,
substring(r,@maxlen - charindex(@spliton,reverse(substring(r,0,@maxlen))) + 1,8000) r
FROM rcte
WHERE len(r) > 0
)
SELECT SUBSTRING(S,2,8000) splitResult
FROM (
SELECT @replacewith + s AS [text()]
FROM rCte
ORDER BY LEN(r) DESC
FOR XML PATH ('')) A (S)
On my machine with 10000 address rows with an average length of 55 (max 121, min 21) it took around a second.
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply