October 10, 2009 at 6:17 pm
I am new to MS Sql. I am trying to find a function which will split a long sentence into multiple sentences of fixed length...
For example
'This is a long sentence and I want to split it into multiple sentences of 25 character each with out splitting a word into multiple lines'
needs to be split as
'This is a long sentence'
'and I want to split it'
'into multiple '
'sentences of 25'
'character each with '
'out splitting a word '
'into multiple lines'
Any help on this is greatly appreciated.
regards
giri
October 12, 2009 at 3:03 pm
giri,
If you'd share the business reason behind this requirement (I'm always curious), I believe I may have a way to pull it off.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 13, 2009 at 5:39 am
The data needs to get pulled into a columnar report. There are 5 columns of fixed length. When the data gets into this report, I don't want the words to get split into 2 lines as it would affect the readability. One way of doing this is by manipulating the front end application. The other way which I thought was easier one was through the stored procedures.
October 17, 2009 at 7:49 pm
I am SO sorry about this post. I totally lost track of it. Thank you for taking the time to explain. I've got the code somewhere... I just need to find it again.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 17, 2009 at 8:31 pm
Found it. Again, sorry for the delay.
First, you need a Tally table which is nothing more than a table with a single column of sequential numbers. For more information on how a Tally table works to replace a loop, please see the following article...
http://www.sqlservercentral.com/articles/T-SQL/62867/
Here's the code to build a "standard" 11,000 row Tally table but do read the article above to really understand what it does...
--===== Create and populate the Tally table on the fly
SELECT TOP 11000 IDENTITY(INT,1,1) AS N
INTO dbo.Tally
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
--===== Add a Primary Key to maximize performance
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
--===== Allow the general public to use it
GRANT SELECT ON dbo.Tally TO PUBLIC
Here's the code you need for your 25 character sentences. You could paramaterize the "25" in the code to make a User Defined Function that would have some bit a flexability...
--===== Declare a variable to hold a long string.
-- This would likely be a parameter in a User Defined Function
DECLARE @pLongString VARCHAR(8000);
SELECT @pLongString = 'This is a long sentence and I want to split it into multiple sentences of 25 character each with out splitting a word into multiple lines.';
--===== We need a place to hold split words and determine which line of 25 characters each belongs to.
DECLARE @Return TABLE
(
WordNum INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
Word VARCHAR(25),
Line INT
);
WITH
--====== cteSplit splits all of the words and their trailing spaces and stores them in a table variable
cteSplit AS
(
SELECT Word = SUBSTRING(@pLongString,t.N,CHARINDEX(' ',@pLongString+' ',t.N)-t.N+1)
FROM dbo.Tally t
WHERE N <= LEN(@pLongString)+1
AND SUBSTRING(' '+@pLongString,t.N,1) = ' '
)
INSERT INTO @Return
(Word)
SELECT Word
FROM cteSplit;
--===== Declare some obviously named variables and preset them for use
DECLARE @Width INT,
@Line INT,
@Dummy INT;
SELECT @Width = 0,
@Line = 1;
--===== This counts up the length of each word and assigns them to groups of 25 characters
-- which will eventually become the 25 character "sentences". The method used is
-- known as the "Quirky Update".
UPDATE @Return
SET @Width = @Width + LEN(Word),
@Line = Line = CASE WHEN @Width > 25 THEN @Line + 1 ELSE @Line END,
@Width = CASE WHEN @Width > 25 THEN LEN(Word) ELSE @Width END,
@Dummy = WordNum
FROM @Return
OPTION (MAXDOP 1);
--===== Concatenate all the words for each numbered sentence.
-- This would be the return for a User Defined Function.
SELECT t1.Line,
(SELECT ''+t2.Word
FROM @return t2
WHERE t2.Line = t1.Line --Correlation here
ORDER BY t2.Line,t2.WordNum
FOR XML PATH('')
) AS SomeLetters2
FROM @return t1
GROUP BY t1.Line -- without GROUP BY multiple rows are returned
ORDER BY t1.Line -- Remove this line for a User Defined Function
;
GO
--Jeff Moden
Change is inevitable... Change for the better is not.
October 21, 2009 at 6:56 pm
Did it help?
--Jeff Moden
Change is inevitable... Change for the better is not.
October 26, 2009 at 7:18 am
Jeff
That code doesn't quite work. For example, row 1 is returned as "This is a long sentence and I", which is 29 characters. I think you've forgotten to count spaces between words. If you change line
"SET @Width = @Width + LEN(Word)" to
"SET @Width = @Width + LEN(Word)+1", that fixes it for me.
I know it's RBAR, but this also works for me, and may be easier for newbies to follow:
ALTER FUNCTION FNSplitSentenceInto25Chars
(@sentence varchar(1000))
/************************************************************************
Alun Basten
26/10/2009
This function takes a sentence and splits it into lines of up to
25 characters in length without splitting any words
************************************************************************/
RETURNS @CharSplit table
(RowNumber tinyint,
LineOfText varchar(25),
LengthOfText tinyint)
AS
BEGIN
--Declare variables
DECLARE @lineoftext varchar(26)
DECLARE @lastspace tinyint
DECLARE @charsleft tinyint
DECLARE @rownumber tinyint
--Initialise variables
SELECT @charsleft=len(@sentence)
SELECT @rownumber=0
--Loop through the sentence parameter, 26 chars at a time
WHILE @charsleft>0
BEGIN
--Retrieve the first 26 chars. If the 26th char is a space, we can just take the first 25 chars
--as our string. If the 26th char is not a space, find the space nearest to the end
SELECT @lineoftext=LEFT(@sentence,26)
SELECT @lastspace=CHARINDEX(' ',REVERSE(@lineoftext),1)
SELECT @lineoftext=LEFT(@lineoftext,26-@lastspace)
SELECT @sentence=LTRIM(SUBSTRING(@sentence,27-@lastspace,1000))
SELECT @charsleft=LEN(@sentence)
SELECT @rownumber=@rownumber+1
INSERT @CharSplit
SELECT @rownumber,@lineoftext,LEN(@lineoftext)
END
RETURN
END
/*************************************************************************************
Example of how to run:
SELECT * from FNSplitSentenceInto25Chars('This is a very long sentence and I would like to split it into multiple sentences of 25 characters each without splitting a word into multiple lines')
**************************************************************************************/
Regards
Alun
October 29, 2009 at 10:51 pm
Sorry for the delay in my reply. Your code worked for me. Thanks.
October 30, 2009 at 10:35 pm
T.S.
Alun was correct... I forgot to count the trailing space after each word. By changing LEN to DATALENGTH in my previous code, the answer is correctly resolved. Here's the code...
--===== Declare a variable to hold a long string.
-- This would likely be a parameter in a User Defined Function
DECLARE @pLongString VARCHAR(8000);
SELECT @pLongString = 'This is a long sentence and I want to split it into multiple sentences of 25 character each with out splitting a word into multiple lines.';
--===== We need a place to hold split words and determine which line of 25 characters each belongs to.
DECLARE @Return TABLE
(
WordNum INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
Word VARCHAR(25),
Line INT
);
WITH
--====== cteSplit splits all of the words and their trailing spaces and stores them in a table variable
cteSplit AS
(
SELECT Word = SUBSTRING(@pLongString,t.N,CHARINDEX(' ',@pLongString+' ',t.N)-t.N+1)
FROM dbo.Tally t
WHERE N <= LEN(@pLongString)+1
AND SUBSTRING(' '+@pLongString,t.N,1) = ' '
)
INSERT INTO @Return
(Word)
SELECT Word
FROM cteSplit;
--===== Declare some obviously named variables and preset them for use
DECLARE @Width INT,
@Line INT,
@Dummy INT;
SELECT @Width = 0,
@Line = 1;
--===== This counts up the length of each word and assigns them to groups of 25 characters
-- which will eventually become the 25 character "sentences". The method used is
-- known as the "Quirky Update".
UPDATE @Return
SET @Width = @Width + LEN(Word),
@Line = Line = CASE WHEN @Width > 25 THEN @Line + 1 ELSE @Line END,
@Width = CASE WHEN @Width > 25 THEN LEN(Word) ELSE @Width END,
@Dummy = WordNum
FROM @Return
OPTION (MAXDOP 1);
--Jeff Moden
Change is inevitable... Change for the better is not.
October 30, 2009 at 10:38 pm
mister boom (10/26/2009)
I know it's RBAR, but this also works for me, and may be easier for newbies to follow:
Thanks for the feedback, Alun, but that's exactly what I'm worried about... 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
January 15, 2017 at 1:21 pm
Jeff Moden - Friday, October 30, 2009 10:35 PMT.S.Alun was correct... I forgot to count the trailing space after each word. By changing LEN to DATALENGTH in my previous code, the answer is correctly resolved. Here's the code...--===== Declare a variable to hold a long string. -- This would likely be a parameter in a User Defined FunctionDECLARE @pLongString VARCHAR(8000); SELECT @pLongString = 'This is a long sentence and I want to split it into multiple sentences of 25 character each with out splitting a word into multiple lines.';--===== We need a place to hold split words and determine which line of 25 characters each belongs to.DECLARE @Return TABLE ( WordNum INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, Word VARCHAR(25), Line INT );WITH--====== cteSplit splits all of the words and their trailing spaces and stores them in a table variablecteSplit AS( SELECT Word = SUBSTRING(@pLongString,t.N,CHARINDEX(' ',@pLongString+' ',t.N)-t.N+1) FROM dbo.Tally t WHERE N <= LEN(@pLongString)+1 AND SUBSTRING(' '+@pLongString,t.N,1) = ' ') INSERT INTO @Return (Word) SELECT Word FROM cteSplit;--===== Declare some obviously named variables and preset them for useDECLARE @Width INT, @Line INT, @Dummy INT; SELECT @Width = 0, @Line = 1;--===== This counts up the length of each word and assigns them to groups of 25 characters -- which will eventually become the 25 character "sentences". The method used is -- known as the "Quirky Update". UPDATE @Return SET @Width = @Width + LEN(Word), @Line = Line = CASE WHEN @Width > 25 THEN @Line + 1 ELSE @Line END, @Width = CASE WHEN @Width > 25 THEN LEN(Word) ELSE @Width END, @Dummy = WordNum FROM @Return OPTION (MAXDOP 1);
Hello,
This is my first post and I know this is a really old thread but the information here almost has me to the end of what I need to do. First of all thank you to everyone who posted.
I need to split an entire table of long sentences into 30 characters each, which sounds about what this does. It works great for the first record but the second record is actually very short and so the third record begins in place of the second. Source Data:
Split Data:
Record #4 Text should say "OUTSIDE PROCESS -"
Record #5 Text should begin with "1" D-SERIES DRIVE"
The "Line" column is not working out correctly because of how short the Text is in recrod 2 above.
My code to create the Tally table was taken from above and I modified the code above to read from the source table. I have pasted it below. Any assistance would be greatly appreciated.
Sorry I'm unable to find out how to post the code properly so it is just showing up on one line unfortunately. It is almost the same, just set for 30 and pulling from a table shown in the first screenshot above.
--===== Declare a variable to hold a long string. -- This would likely be a parameter in a User Defined FunctionDECLARE @pLongString VARCHAR(8000);DECLARE @StockCode VARCHAR(50);DECLARE @PrevStockCode VARCHAR(50);--===== We need a place to hold split words and determine which line of 30 characters each belongs to.DECLARE @Return TABLE ( WordNum INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,StockCode VARCHAR(50),TextType CHAR(1), Word VARCHAR(30), Line INT );DECLARE @MyCursor CURSOR;BEGINSET @MyCursor = CURSOR FORSELECT StockCode, UpdatedDesc FROM SharpeNarrations..ConcatTechNotes_P WHERE id Between 1 and 3OPEN @MyCursorFETCH NEXT FROM @MyCursorINTO @StockCode, @pLongStringWHILE @@FETCH_STATUS = 0BEGINWITH--====== cteSplit splits all of the words and their trailing spaces and stores them in a table variablecteSplit AS( SELECT Word = SUBSTRING(@pLongString,t.N,CHARINDEX(' ',@pLongString+' ',t.N)-t.N+1) FROM tempdb.dbo.Tally tWHERE N <= LEN(@pLongString)+1AND SUBSTRING(' '+@pLongString,t.N,1) = ' ') INSERT INTO @Return(StockCode, TextType, Word) SELECT @StockCode, 'P', Word FROM cteSplit;DECLARE @q XML = (SELECT * FROM @Return FOR XML AUTO)--===== Declare some obviously named variables and preset them for useDECLARE @Width INT,@Line INT,@Dummy INT; SELECT @Width = 0,@Line = 1;--Trying to increment the line number by testing against previous stock code to see if I can get record #3 '1 IN D-SERIES DRIVE' to not--overlap record #2 '010003'SELECT @Line = CASE WHEN (@StockCode = @PrevStockCode) OR (@PrevStockCode = '') THEN @Line ELSE @Line + 1 END--===== This counts up the length of each word and assigns them to groups of 30 characters -- which will eventually become the 30 character "sentences". The method used is -- known as the "Quirky Update". UPDATE @ReturnSET @Width = @Width + DATALENGTH(Word),@Line = Line = CASE WHEN @Width > 30 THEN @Line + 1 ELSE @Line END,@Width = CASE WHEN @Width > 30 THEN DATALENGTH(Word) ELSE @Width END,@Dummy = WordNum FROM @Return OPTION (MAXDOP 1); SELECT @PrevStockCode = @StockCodeFETCH NEXT FROM @MyCursorINTO @StockCode, @pLongStringEND;CLOSE @MyCursor;DEALLOCATE @MyCursor;END;--===== Concatenate all the words for each numbered sentence. -- This would be the return for a User Defined Function. SELECT t1.StockCode, TextType, t1.Line, (SELECT ''+t2.Word FROM @Return t2 WHERE t2.Line = t1.Line --Correlation here ORDER BY t2.Line,t2.WordNum FOR XML PATH('') ) AS Text FROM @Return t1 GROUP BY t1.StockCode, t1.TextType, t1.Line -- without GROUP BY multiple rows are returned ORDER BY t1.StockCode, t1.TextType, t1.Line -- Remove this line for a User Defined Function;GO
Thank you in advance,
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply