November 15, 2016 at 1:53 am
I need to replace in given string all uppercase strings longer than two characters with lowercase and set html bold tags around them. Any leads?
For example convert this:
Lorem ipsum dolor sit amet, CONSECtetur adipiSCIng elit. PelleNTesque in ELIT ut ex accumsan sagittis in a urna.
into this:
Lorem ipsum dolor sit amet, <b>consec</b>tetur adipi<b>sci</b>ng elit. PelleNTesque in <b>elit</b> ut ex accumsan sagittis in a urna.
November 15, 2016 at 2:17 am
I think it would be quite a complicated solution in T-SQL, so much so that you may consider using a different language, but if you want to do it, here is an outline:
(1) Split the string[/url] into individual characters
(2) Learn about gaps and islands[/url] to identify consecutive upper case characters. You'll need to use a case-sensitive collation
(3) Add the HTML tags before and after each island of consecutive upper case characters
(4) Re-concatenate[/url] into a single string
John
November 15, 2016 at 4:32 am
dothedev (11/15/2016)
I need to replace in given string all uppercase strings longer than two characters with lowercase and set html bold tags around them. Any leads?For example convert this:
Lorem ipsum dolor sit amet, CONSECtetur adipiSCIng elit. PelleNTesque in ELIT ut ex accumsan sagittis in a urna.
into this:
Lorem ipsum dolor sit amet, <b>consec</b>tetur adipi<b>sci</b>ng elit. PelleNTesque in <b>elit</b> ut ex accumsan sagittis in a urna.
Here is a quick solution which works for the English character set, must be tweaked for other languages.
😎
USE TEEST;
GO
SET NOCOUNT ON;
DECLARE @SAMPLE_STRING VARCHAR(2000) = 'Lorem ipsum dolor sit amet, CONSECtetur adipiSCIng elit. PelleNTesque in ELIT ut ex accumsan sagittis in a urna.';
DECLARE @DESIRED_OUTPUT VARCHAR(2000) = 'Lorem ipsum dolor sit amet, <b>consec</b>tetur adipi<b>sci</b>ng elit. PelleNTesque in <b>elit</b> ut ex accumsan sagittis in a urna.';
;WITH T(N) AS (SELECT X.N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) X(N))
, NUMS(N) AS (SELECT TOP(LEN(@SAMPLE_STRING)) ROW_NUMBER() OVER (ORDER BY @@VERSION) AS N FROM T T1,T T2,T T3,T T4)
,BASE_DATA AS
(
SELECT
NM.N
,SUBSTRING(@SAMPLE_STRING,NM.N,1) AS CHR_VAL
,CASE
WHEN ASCII(SUBSTRING(@SAMPLE_STRING,NM.N,1)) BETWEEN 65 AND 90 THEN 1
ELSE 0
END AS IS_CAPS
FROM NUMS NM
)
,CHANGE_IDENTIFICATION AS
(
SELECT
BD.N
,BD.CHR_VAL
,BD.IS_CAPS
,CASE WHEN BD.IS_CAPS
= LAG(BD.IS_CAPS,1,BD.IS_CAPS) OVER
(
ORDER BY BD.N
) THEN 0 ELSE 1 END AS IS_CHANGED
FROM BASE_DATA BD
)
,GROUPED_SET AS
(
SELECT
CI.N
,CI.CHR_VAL
,CI.IS_CAPS
,SUM(CI.IS_CHANGED) OVER
(
ORDER BY CI.N
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
) AS GRP_ID
FROM CHANGE_IDENTIFICATION CI
)
,GROUPS_AND_SIZES AS
(
SELECT
GS.GRP_ID
,MAX(GS.IS_CAPS) AS IS_CAPS
,COUNT(GS.N) AS GRP_SIZE
,(SELECT
'' + SGS.CHR_VAL
FROM GROUPED_SET SGS
WHERE GS.GRP_ID = SGS.GRP_ID
FOR XML PATH(''),TYPE).value('(./text())[1]','VARCHAR(2000)') AS GR_STR
FROM GROUPED_SET GS
GROUP BY GS.GRP_ID
)
SELECT 'INPUT' AS STR_NAME, @SAMPLE_STRING AS OUTPUT_STRING UNION ALL
SELECT 'OUTPUT' AS STR_NAME,
(
SELECT
CASE
WHEN GAS.IS_CAPS = 1 AND GAS.GRP_SIZE > 2 THEN '<b>' + LOWER(GAS.GR_STR) + '</b>'
ELSE GAS.GR_STR
END
FROM GROUPS_AND_SIZES GAS
FOR XML PATH(''),TYPE
).value('(./text())[1]','VARCHAR(2000)') AS OUTPUT_STR
UNION ALL
SELECT 'DESIRED' AS STR_NAME, @DESIRED_OUTPUT AS OUTPUT_STR;
Output
STR_NAME OUTPUT_STRING
-------- --------------------------------------------------------------------------------------------------------------------------------------
INPUT Lorem ipsum dolor sit amet, CONSECtetur adipiSCIng elit. PelleNTesque in ELIT ut ex accumsan sagittis in a urna.
OUTPUT Lorem ipsum dolor sit amet, <b>consec</b>tetur adipi<b>sci</b>ng elit. PelleNTesque in <b>elit</b> ut ex accumsan sagittis in a urna.
DESIRED Lorem ipsum dolor sit amet, <b>consec</b>tetur adipi<b>sci</b>ng elit. PelleNTesque in <b>elit</b> ut ex accumsan sagittis in a urna.
November 15, 2016 at 6:00 am
Wow, thanks! :w00t:
Just needed to add some (finnish) characters in the IS_CAPS case and that's it!
It's going to take a while to understand though 🙂
November 15, 2016 at 6:16 am
dothedev (11/15/2016)
Wow, thanks! :w00t:Just needed to add some (finnish) characters in the IS_CAPS case and that's it!
It's going to take a while to understand though 🙂
You are very welcome.
😎
To extend the character range, add to the case statement in the BASE_DATA CTE
SELECT
NM.N
,SUBSTRING(@SAMPLE_STRING,NM.N,1) AS CHR_VAL
,CASE
WHEN ASCII(SUBSTRING(@SAMPLE_STRING,NM.N,1)) BETWEEN 65 AND 90 THEN 1
-- Add additional characters here
ELSE 0
END AS IS_CAPS
FROM NUMS NM
November 15, 2016 at 6:38 am
Here is one more way:
declare @Org varchar(150) = 'Lorem ipsum dolor sit amet, CONSECtetur adipiSCIng elit. PelleNTesque in ELIT ut ex accumsan sagittis in a urna'
declare @Results varchar(200) = ''
declare @ChIndex int
--Getting the first occurence of more then 1 capital leter
WHILE patindex('%[A-Z][A-Z]%', @Org) > 0
BEGIN
--Get the location of the first place with more then 1 capital letter
select @ChIndex = patindex('%[A-Z][A-Z]%', @Org)
--Adding to the results string the part that doesn't need to be modified
set @Results = @Results + substring(@Org,1, @ChIndex - 1) + '
'
--Deleting the part that was added to the results from the original string
set @Org = substring(@Org,@ChIndex,len(@Org))
--Getting the end location of the capital letters
select @ChIndex = patindex('%[^A-Z]%',@Org)
--Adding the needed part to the @results string
SELECT @Results = @Results + lower(substring(@Org,1,@ChIndex - 1)) + '</BR>'
--Deleting the part from the @Org string
set @Org = substring(@Org,@ChIndex,len(@Org))
END
SET @Results = @Results + @Org
select @Results
Adi
--------------------------------------------------------------
To know how to ask questions and increase the chances of getting asnwers:
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
November 15, 2016 at 7:26 am
Adi Cohn-120898 (11/15/2016)
Here is one more way:
declare @Org varchar(150) = 'Lorem ipsum dolor sit amet, CONSECtetur adipiSCIng elit. PelleNTesque in ELIT ut ex accumsan sagittis in a urna'
declare @Results varchar(200) = ''
declare @ChIndex int
--Getting the first occurence of more then 1 capital leter
WHILE patindex('%[A-Z][A-Z]%', @Org) > 0
BEGIN
--Get the location of the first place with more then 1 capital letter
select @ChIndex = patindex('%[A-Z][A-Z]%', @Org)
--Adding to the results string the part that doesn't need to be modified
set @Results = @Results + substring(@Org,1, @ChIndex - 1) + '
'
--Deleting the part that was added to the results from the original string
set @Org = substring(@Org,@ChIndex,len(@Org))
--Getting the end location of the capital letters
select @ChIndex = patindex('%[^A-Z]%',@Org)
--Adding the needed part to the @results string
SELECT @Results = @Results + lower(substring(@Org,1,@ChIndex - 1)) + '</BR>'
--Deleting the part from the @Org string
set @Org = substring(@Org,@ChIndex,len(@Org))
END
SET @Results = @Results + @Org
select @Results
Adi
Throws an error if the substring doesn't contain the pattern.
😎
Msg 537, Level 16, State 2, Line 23
Invalid length parameter passed to the LEFT or SUBSTRING function.
Further, the output doesn't seem to match the requirements
lorem</BR>
ipsum</BR>
dolor</BR>
sit</BR>
amet</BR>,
consectetur</BR>
adipiscing</BR>
elit</BR>.
pellentesque</BR>
in</BR>
elit</BR>
ut</BR>
ex</BR>
accumsan</BR>
sagitti
November 15, 2016 at 7:47 am
Here's an option using a modified version of the pattern splitter explained in here: http://www.sqlservercentral.com/articles/String+Manipulation/94365/
The only modification was to make it CASE sensitive.
ALTER FUNCTION [dbo].[PatternSplitCM](@List [varchar](8000) = NULL, @Pattern [varchar](50))
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN
WITH numbers AS (
SELECT TOP(ISNULL(DATALENGTH(@List), 0))
n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM
(VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (n),
(VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e (n),
(VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) f (n),
(VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) g (n))
SELECT
ItemNumber = ROW_NUMBER() OVER(ORDER BY MIN(n)),
Item = SUBSTRING(@List,MIN(n),1+MAX(n)-MIN(n)),
[Matched]
FROM (
SELECT n, y.[Matched], Grouper = n - ROW_NUMBER() OVER(ORDER BY y.[Matched],n)
FROM numbers
CROSS APPLY (
SELECT [Matched] = CASE WHEN SUBSTRING(@List,n,1) LIKE @Pattern COLLATE Latin1_General_Bin THEN 1 ELSE 0 END
) y
) d
GROUP BY [Matched], Grouper;
Then the solution becomes a bit easier.
DECLARE @String varchar(1000);
SET @String = 'Lorem ipsum dolor sit amet, CONSECtetur adipiSCIng elit. PelleNTesque in ELIT ut ex accumsan sagittis in a urna. '
SELECT ( SELECT CASE WHEN ItemNumber > 1 AND Matched = 1
THEN '<b>' + LOWER(Item) + '</b>'
ELSE Item END
FROM dbo.PatternSplitCM(@String, '%[A-Z]%' )x
FOR XML PATH(''), TYPE).value('./text()[1]', 'varchar(max)')
November 15, 2016 at 8:02 am
I've done few mistakes. Should have checked it better. The first mistake was that for some reason without noticing it, I've done a small modification in my script. Instead of adding the BR as requested, I added carriage return + new line (enter key). This is the reason that in my script you got few lines instead of one. The second mistake was that I checked my script on a server that is configured as case sensitive and forgot to test it on a server that is not case sensitive. I've modified the script so it will work on none case sensitive servers as well. Sorry for the mistakes, I've should have checked it better.
declare @Org varchar(150) = 'Lorem ipsum dolor sit amet, CONSECtetur adipiSCIng elit. PelleNTesque in ELIT ut ex accumsan sagittis in a urna' COLLATE Latin1_General_BIN
declare @Results varchar(200) = '' COLLATE Latin1_General_BIN
declare @ChIndex int
--Getting the first occurence of more then 1 capital leter
WHILE patindex('%[A-Z][A-Z]%' COLLATE Latin1_General_BIN, @Org) > 0
BEGIN
--Get the location of the first place with more then 1 capital letter
select @ChIndex = patindex('%[A-Z][A-Z]%' COLLATE Latin1_General_BIN, @Org)
--Adding to the results string the part that doesn't need to be modified
set @Results = @Results + substring(@Org,1, @ChIndex - 1) + '
'
--Deleting the part that was added to the results from the original string
set @Org = substring(@Org,@ChIndex,len(@Org))
--Getting the end location of the capital letters
select @ChIndex = patindex('%[^A-Z]%' COLLATE Latin1_General_BIN,@Org)
IF @ChIndex = 0
set @ChIndex = len(@Org)+1
--Adding the needed part to the @results string
SELECT @Results = @Results + lower(substring(@Org,1,@ChIndex - 1)) + '</BR>'
--Deleting the part from the @Org string
set @Org = substring(@Org,@ChIndex,len(@Org))
END
SET @Results = @Results + @Org
select @Results
Adi
--------------------------------------------------------------
To know how to ask questions and increase the chances of getting asnwers:
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
November 15, 2016 at 8:24 am
Adi Cohn-120898 (11/15/2016)
I've done few mistakes. Should have checked it better. The first mistake was that for some reason without noticing it, I've done a small modification in my script. Instead of adding the BR as requested, I added carriage return + new line (enter key). This is the reason that in my script you got few lines instead of one. The second mistake was that I checked my script on a server that is configured as case sensitive and forgot to test it on a server that is not case sensitive. I've modified the script so it will work on none case sensitive servers as well. Sorry for the mistakes, I've should have checked it better.
declare @Org varchar(150) = 'Lorem ipsum dolor sit amet, CONSECtetur adipiSCIng elit. PelleNTesque in ELIT ut ex accumsan sagittis in a urna' COLLATE Latin1_General_BIN
declare @Results varchar(200) = '' COLLATE Latin1_General_BIN
declare @ChIndex int
--Getting the first occurence of more then 1 capital leter
WHILE patindex('%[A-Z][A-Z]%' COLLATE Latin1_General_BIN, @Org) > 0
BEGIN
--Get the location of the first place with more then 1 capital letter
select @ChIndex = patindex('%[A-Z][A-Z]%' COLLATE Latin1_General_BIN, @Org)
--Adding to the results string the part that doesn't need to be modified
set @Results = @Results + substring(@Org,1, @ChIndex - 1) + '
'
--Deleting the part that was added to the results from the original string
set @Org = substring(@Org,@ChIndex,len(@Org))
--Getting the end location of the capital letters
select @ChIndex = patindex('%[^A-Z]%' COLLATE Latin1_General_BIN,@Org)
IF @ChIndex = 0
set @ChIndex = len(@Org)+1
--Adding the needed part to the @results string
SELECT @Results = @Results + lower(substring(@Org,1,@ChIndex - 1)) + '</BR>'
--Deleting the part from the @Org string
set @Org = substring(@Org,@ChIndex,len(@Org))
END
SET @Results = @Results + @Org
select @Results
Adi
You still got some work to do, the requirements are to wrap the upper case portions in <b></b>, not
. Further, it still adds new line and hence is returning multiple lines.
😎
November 15, 2016 at 8:25 am
Luis Cazares (11/15/2016)
Here's an option using a modified version of the pattern splitter explained in here: http://www.sqlservercentral.com/articles/String+Manipulation/94365/The only modification was to make it CASE sensitive.
Nice one Luis!
😎
November 15, 2016 at 8:29 am
dothedev (11/15/2016)
Wow, thanks! :w00t:Just needed to add some (finnish) characters in the IS_CAPS case and that's it!
It's going to take a while to understand though 🙂
Can you list the characters so the guys can add it to they're solutions?
😎
November 15, 2016 at 8:31 am
Eirikur Eiriksson (11/15/2016)
Luis Cazares (11/15/2016)
Here's an option using a modified version of the pattern splitter explained in here: http://www.sqlservercentral.com/articles/String+Manipulation/94365/The only modification was to make it CASE sensitive.
Nice one Luis!
😎
Thank you, but the credit should go to Chris Morris and Dwain Camps for creating the function and sharing it with everyone.
November 15, 2016 at 8:41 am
I read it as BR and have to modify it to B, but this is something that the original poster can modify. I also think that there is a bug that modified my code. It still shows that I add carriage return and new line instead of
. This time I've made sure in my code that I don't add the enter key. In any case I changed it to <B> and hope that it won't change it again.
declare @Org varchar(150) = 'Lorem ipsum dolor sit amet, CONSECtetur adipiSCIng elit. PelleNTesque in ELIT ut ex accumsan sagittis in a urna' COLLATE Latin1_General_BIN
declare @Results varchar(200) = '' COLLATE Latin1_General_BIN
declare @ChIndex int
--Getting the first occurence of more then 1 capital leter
WHILE patindex('%[A-Z][A-Z]%' COLLATE Latin1_General_BIN, @Org) > 0
BEGIN
--Get the location of the first place with more then 1 capital letter
select @ChIndex = patindex('%[A-Z][A-Z]%' COLLATE Latin1_General_BIN, @Org)
--Adding to the results string the part that doesn't need to be modified
set @Results = @Results + substring(@Org,1, @ChIndex - 1) + '<B>'
--Deleting the part that was added to the results from the original string
set @Org = substring(@Org,@ChIndex,len(@Org))
--Getting the end location of the capital letters
select @ChIndex = patindex('%[^A-Z]%' COLLATE Latin1_General_BIN,@Org)
IF @ChIndex = 0
set @ChIndex = len(@Org)+1
--Adding the needed part to the @results string
SELECT @Results = @Results + lower(substring(@Org,1,@ChIndex - 1)) + '</B>'
--Deleting the part from the @Org string
set @Org = substring(@Org,@ChIndex,len(@Org))
END
SET @Results = @Results + @Org
select @Results
--------------------------------------------------------------
To know how to ask questions and increase the chances of getting asnwers:
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
November 16, 2016 at 12:42 am
The characters I needed to add are: Å,Ä,Ö / char(196),char(197),char(214)
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply