In this post, I’m presenting usage of two functions that use Regex. My previous post was about searching text in SQL Server using regex, and now I’m showing how to use it for replacing text in SQL Server.
The two new functions added to the SqlRegex.dll are RgxTrim() and RgxReplace().
Before starting with them I’ll mention the in-built functions for similar purposes, like the LTRIM and RTRIM functions and TRIM in SQL Server on Linux and starting with SQL 2017 on Windows. They are only trimming white spaces from the left, right or from both sides of a string.
What RgxTrim is doing additionally is replacing multiple white spaces inside the text with a single white space. RgxTrim does trimming of the leading and ending white spaces of an input string/text too.
After importing SqlRegex.dll in your database, run the following code to create the RgxTrim function.
CREATE FUNCTION dbo.RgxTrim(@Text NVARCHAR(max)) RETURNS NVARCHAR(MAX) WITH EXECUTE AS CALLER EXTERNAL NAME SqlRegex.UserDefinedFunctions.RgxTrim; GO
RgxTrim would successfully remove multiple tabs with a single white space for an input string/text as well. You can use RgxTrim in an update statement like any normal T-SQL function.
The second function is RgxReplace which is very similar to Replace:
REPLACE ( string_expression , string_pattern , string_replacement )
However, RgxReplace is designed to additionally use regular expression patterns for searching in text and do a replacement, while REPLACE uses purely written patterns only. RgxReplace has very similar syntax with REPLACE:
RgxReplace ( string_expression, regex_pattern , string_replacement ).
The only difference is the second parameter regex_pattern which can additionally accept regular expressions. Create RgxReplace with the following code.
CREATE FUNCTION [dbo].[RgxReplace](@Text [NVARCHAR](MAX), @pattern [NVARCHAR](MAX), @replacement [NVARCHAR](MAX)) RETURNS [NVARCHAR](MAX) WITH EXECUTE AS CALLER AS EXTERNAL NAME [SqlRegex].[UserDefinedFunctions].[RgxReplace] GO
The usage of RgxReplace is specific. For example, imagine a situation wherein the Description column of a table there is sensitive information like IBAN accounts, Credit Card numbers or Addresses for the clients/users. And when you display this information in an application you should carry about what you’re displaying and to whom. At least you should mask this sensitive information in a given case.
The following query finds all rows of a table that have a pattern for an account and performs masking of the same. The query demonstrates usage of RgxReplace.
SELECT UserID,[Description],dbo.RgxReplace([Description],'\d{4}-?\d{4}-?\d{4}-?\d{4}','XXXXXXXXXXXXXXXX') [Masked description] FROM dbo.UserNotes WHERE dbo.ContainsString([Description],'\d{4}-?\d{4}-?\d{4}-?\d{4}')=1;
The result from the query is given in the next figure.
From the figure above, you can see that all IBAN accounts are masked. Full names of the users and their addresses are manually masked. By using regex you can detect any kind of addresses, credit card numbers and etc., and combining with the ContainsString() the query from the figure could be written to do additional stuff. You can use the RgxReplace() function like any T-SQL function in an update statement.
You can now recreate everything from SqlRegex.dll (including the functions from the previous post) with the following script.