March 30, 2016 at 9:03 am
Does anyone have a better solution for this? I would think I have to be overlooking something.
I need to convert a string of characters to another string of characters... for example 89083 to HI$HC
I have a list of alphanumeric letters that need to replace certain numbers. The only way I can think of accomplishing this task is
concat(
case
when substring(col1,1,1) in 1 THEN 'A'
when substring(col1,1,1) in 2 THEN 'B'
when substring(col1,1,1) in 3 THEN 'C'
when substring(col1,1,1) in 4 THEN 'D'
when substring(col1,1,1) in 5 THEN 'E'
when substring(col1,1,1) in 6 THEN 'F'
when substring(col1,1,1) in 7 THEN 'G'
when substring(col1,1,1) in 8 THEN 'H'
when substring(col1,1,1) in 9 THEN 'I'
when substring(col1,1,1) in 0 THEN '$'
end,
case
when substring(col1,2,1) in 1 THEN 'A'
when substring(col1,2,1) in 2 THEN 'B'
when substring(col1,2,1) in 3 THEN 'C'
when substring(col1,2,1) in 4 THEN 'D'
when substring(col1,2,1) in 5 THEN 'E'
when substring(col1,2,1) in 6 THEN 'F'
when substring(col1,2,1) in 7 THEN 'G'
when substring(col1,2,1) in 8 THEN 'H'
when substring(col1,2,1) in 9 THEN 'I'
when substring(col1,2,1) in 0 THEN '$'
end,
case
when substring(col1,3,1) in 1 THEN 'A'
when substring(col1,3,1) in 2 THEN 'B'
when substring(col1,3,1) in 3 THEN 'C'
when substring(col1,3,1) in 4 THEN 'D'
when substring(col1,3,1) in 5 THEN 'E'
when substring(col1,3,1) in 6 THEN 'F'
when substring(col1,3,1) in 7 THEN 'G'
when substring(col1,3,1) in 8 THEN 'H'
when substring(col1,3,1) in 9 THEN 'I'
when substring(col1,3,1) in 0 THEN '$'
end,
and on and on and on
).
There has to be a better way...
March 30, 2016 at 9:09 am
Can you elaborate on the conversion rule?
Does every character map to another specific character, or is its position within the string also important?
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
March 30, 2016 at 9:21 am
Each character maps to another specific character ... position in the string is not important.
1 always converts to A
2 always converts to B
3 always converts to C
etc...
March 30, 2016 at 9:27 am
Looks like DelimitedSplit8K and a "conversion table" would do the job.
the DelimitedSplit8K would split out the string into individual characters, and then you could join that to the conversion table
(OriginalChar CHAR, NewChar CHAR), and return NewChar for each one.
March 30, 2016 at 9:28 am
In that case you could just use nested REPLACEs.
Alternatively, you could use CROSS APPLY to do the successive replace operations.
I like that way because it's a bit easier for me to follow than explicitly doing the nesting.
Something like this:
CREATE TABLE #temp (original_string varchar(30));
INSERT INTO #temp VALUES
('89083'),
('10234'),
('111100044885');
SELECT original_string,
new_string
FROM #temp
CROSS APPLY
(SELECT REPLACE(original_string,'1','A'))A(A)
CROSS APPLY
(SELECT REPLACE(A,'2','B'))B(B)
CROSS APPLY
(SELECT REPLACE(B,'3','C'))C(C)
CROSS APPLY
(SELECT REPLACE(C,'4','D'))D(D)
CROSS APPLY
(SELECT REPLACE(D,'5','E'))E(E)
CROSS APPLY
(SELECT REPLACE(E,'6','F'))F(F)
CROSS APPLY
(SELECT REPLACE(F,'7','G'))G(G)
CROSS APPLY
(SELECT REPLACE(G,'8','H'))H(H)
CROSS APPLY
(SELECT REPLACE(H,'9','I'))I(I)
CROSS APPLY
(SELECT REPLACE(I,'0','$'))new_string(new_string);
DROP TABLE #temp;
If the mapping of original characters to new characters is not fixed, then it's a bit more complicated, but can still be done.
Cheers!
EDIT: Fixed a typo.
March 30, 2016 at 10:38 am
Thanks... I knew there had to be an easier way. I will check to see if the 8K is installed on that server or not. Maybe I'll get lucky. Otherwise, cross apply here I come.
March 30, 2016 at 10:57 am
pietlinden (3/30/2016)
Looks like DelimitedSplit8K and a "conversion table" would do the job.the DelimitedSplit8K would split out the string into individual characters, and then you could join that to the conversion table
(OriginalChar CHAR, NewChar CHAR), and return NewChar for each one.
Actually, you only need a tally table, because each substring is the exact same length (1 char). DelimitedSplit8K is only required when the lengths of the substrings is variable. It's easy enough to create a tally table on the fly if you don't already have one in your database.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
March 30, 2016 at 11:04 am
I've got a pretty good and uber simple process for doing such things. I use it to replace nearly 500 address parts and it's nasty fast. I'll try to put something together for you tonight after work.
To set expectations, I don't know if it'll beat the CROSS APPLY for the 10 digits for sure but there's no way that 500 CROSS APPLYs would have been faster for what I needed to do for my stuff.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 30, 2016 at 11:07 am
Kevlarmpowered (3/30/2016)
Thanks... I knew there had to be an easier way. I will check to see if the 8K is installed on that server or not. Maybe I'll get lucky. Otherwise, cross apply here I come.
Here's a partial solution using an in-line tally (copied shamelessly from here[/url]) and an ASCII conversion rule.
DECLARE @Col1 varchar(20) = '123123321321';
WITH lv0 AS (SELECT 0 g UNION ALL SELECT 0)
,lv1 AS (SELECT 0 g FROM lv0 a CROSS JOIN lv0 b) -- 4
,lv2 AS (SELECT 0 g FROM lv1 a CROSS JOIN lv1 b) -- 16
,lv3 AS (SELECT 0 g FROM lv2 a CROSS JOIN lv2 b) -- 256
,lv4 AS (SELECT 0 g FROM lv3 a CROSS JOIN lv3 b) -- 65,536
,lv5 AS (SELECT 0 g FROM lv4 a CROSS JOIN lv4 b) -- 4,294,967,296
,Tally (n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM lv5)
SELECT SourceChar = substring(@Col1,n,1), NewChar = char(ascii(substring(@Col1,n,1)) + 16)
FROM Tally
WHERE n <= len(@Col1)
ORDER BY n;
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
March 30, 2016 at 11:23 am
<kidding>
DelimitedSplit, Tally table... it's all Jeff Moden voodoo... =)
March 30, 2016 at 8:44 pm
This is the kind of thing that you can do with a Translate function (which T-SQL does not have). I generally don't like scalar UDFs but this guy is faster than anything I could do with an inline table valued function.
CREATE FUNCTION dbo.Translate8K
(
@String varchar(8000),
@SearchPattern varchar(100),
@ReplacePattern varchar(100)
)
/****************************************************************************************
Purpose:
Function takes and input string (@string) and replaces all instances of each each
character in @string with that it exists in @SearchPattern with the corresponding
character in @ReplacePattern. For exmample, given the string "abc123abc",
@SearchPattern "ab" and @ReplacePatern of "XZ". Translate8K will replace each letter "a"
that exists in "abc123abc" with the letter X. Then every "b" that exists with a "Z".
This: SELECT dbo.Translate8K('ba!!!ab', 'ab', 'XZ'); will return ZX!!!XZ.
Parameters:
@String = varchar(8000); The input string to translate
@SearchPattern = varchar(100); The string that will be searched for in @String
@ReplacePattern = varchar(100); All characters in the @SearchPattern are replaced with
their corresponding character in the @SearchPattern
Returns: varchar(8000)
Developer notes:
1. Scalar user defined functions (udf) generally perform badly and "Inline" scalar UDFs
generally perform much better. The only way to get this logic into an "Inline scalar
udf" would be to use a recursive CTE which, for this task, performs very badly. For
more about "in scalar UDFs" see:
http://www.sqlservercentral.com/Forums/Topic1338291-203-1.aspx
2. When @SearchPattern is longer than @ReplacePattern then characters in @SearchPattern
that have no corresponding characters in @ReplacePattern will be removed. Using the
above example, if we remove the letter "Z" from like this:
SELECT dbo.Translate8K('ba!!!ab', 'ab', 'X') -- returns: X!!!X
3. When @ReplacePattern is longer than @SearchPattern the replacement characters that
the characters in @ReplacePattern past without a corresponding character in
@SearchPattern are ignored.
Usage Examples:
--===== (1) basic replace characters/remove characters
-- Replace a with A, c with C, b with x and remove $ and #
DECLARE @string1 varchar(20)='###$$$aaabbbccc$$$###';
SELECToriginal = @string, translated = dbo.Translate8K(@string1,'acb#$','ACx');
--===== (2) Format a phone number
-- format phone (atomic value)
DECLARE @string varchar(8000) = '(425) 555-1212';
SELECT original = @string, Translated = dbo.Translate8K(@string,')( ','-');
--===== (3) hide phone numbers, retain existing format
WITH phoneNbrs(n,pn) AS
(
SELECT 1, '(425) 555-1212' UNION ALL SELECT 2, '425.555.1212' UNION ALL
SELECT 3, '425-555-1212' UNION ALL SELECT 4, '4255551212'
)
SELECT n, pn AS before, [after] = dbo.Translate8K(pn,x,y)
FROM phoneNbrs
CROSS APPLY (VALUES('()-.0123456789','()-.**********')) t(x,y);
--===== (4) Replace accent characters with normal characters (note the "double translate")
DECLARE
@string varchar(100) = 'Thë Quìck Greeñ Fox jumpëd over thë lázy dogs back!',
@special1 varchar(32) = 'áâãäæèïéìëíîçåñòóôöõàøúüûùýÁÃÄ',
@normal1 varchar(32) = 'aaaaaeieieiicanoooooaouuuuyAAAAA',
@special2 varchar(32) = 'ÆÈÏÉÌËÍÎÅÑÒÓÔÖÕÀØÚÜÛÙÝ!',
@normal2 varchar(32) = 'EIEIEIIANOOOOOAOUUUUY.';
SELECT
original = @string,
newstring =
dbo.Translate8K(dbo.Translate8K(@string,@special1,@normal1),@special2,@normal2);
------------------------------------------------------------------------------
Revision History:
Rev 00 - 20150518 Initial Development - Alan Burstein
****************************************************************************************/
RETURNS varchar(8000) WITH SCHEMABINDING AS
BEGIN
WITH E1(N) AS
(
SELECT 1
FROM (VALUES
(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),
(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),
(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),
(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),
(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),
(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),
(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),
(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),
(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)
) t(N)
),
iTally(N) AS
(
SELECT TOP(DATALENGTH(@SearchPattern)) CHECKSUM(ROW_NUMBER() OVER (ORDER BY (SELECT 1)))
FROM E1 a CROSS JOIN E1 b
)
SELECT
@string = REPLACE
(
@string COLLATE Latin1_General_BIN,
SUBSTRING(@SearchPattern,n,1),
SUBSTRING(@ReplacePattern,n,1)
)
FROM iTally;
RETURN @string;
END;
GO
-- Itzik Ben-Gan 2001
March 30, 2016 at 11:08 pm
Solution is suprisingly simple.
1. Create a "translation" table:
CREATE TABLE #Mapping (
FromChar NCHAR(1) PRIMARY KEY,
ToChar NCHAR(1)
)
INSERT INTO #Mapping ( FromChar, ToChar )
SELECT 1, 'A'
UNION ALL
SELECT 2, 'B'
UNION ALL
SELECT 3, 'C'
UNION ALL
SELECT 4, 'D'
UNION ALL
SELECT 5, 'E'
UNION ALL
SELECT 6, 'F'
UNION ALL
SELECT 7, 'G'
UNION ALL
SELECT 8, 'H'
UNION ALL
SELECT 9, 'I'
UNION ALL
SELECT 0, '$'
And now - do the translation itself:
DECLARE @String NVARCHAR(2000)
SET @String = '89083'
UPDATE #Mapping
SET @String = REPLACE (@String, FromChar, ToChar)
SELECT @String
_____________
Code for TallyGenerator
March 31, 2016 at 3:28 am
Phil Parkin (3/30/2016)
Kevlarmpowered (3/30/2016)
Thanks... I knew there had to be an easier way. I will check to see if the 8K is installed on that server or not. Maybe I'll get lucky. Otherwise, cross apply here I come.Here's a partial solution using an in-line tally (copied shamelessly from here[/url]) and an ASCII conversion rule.
DECLARE @Col1 varchar(20) = '123123321321';
WITH lv0 AS (SELECT 0 g UNION ALL SELECT 0)
,lv1 AS (SELECT 0 g FROM lv0 a CROSS JOIN lv0 b) -- 4
,lv2 AS (SELECT 0 g FROM lv1 a CROSS JOIN lv1 b) -- 16
,lv3 AS (SELECT 0 g FROM lv2 a CROSS JOIN lv2 b) -- 256
,lv4 AS (SELECT 0 g FROM lv3 a CROSS JOIN lv3 b) -- 65,536
,lv5 AS (SELECT 0 g FROM lv4 a CROSS JOIN lv4 b) -- 4,294,967,296
,Tally (n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM lv5)
SELECT SourceChar = substring(@Col1,n,1), NewChar = char(ascii(substring(@Col1,n,1)) + 16)
FROM Tally
WHERE n <= len(@Col1)
ORDER BY n;
I know you said it's a partial solution, but this solution replaces the 0 with @ instead of $.
The following adaptation will return the correct strings
DECLARE @colTable TABLE (
id INT IDENTITY(1,1)
, Col1 VARCHAR(20)
);
INSERT INTO @colTable (Col1)
VALUES ('123123321321'), ('3248643120'), ('6843103545');
WITH lv0 AS (SELECT 0 g UNION ALL SELECT 0)
,lv1 AS (SELECT 0 g FROM lv0 a CROSS JOIN lv0 b) -- 4
,lv2 AS (SELECT 0 g FROM lv1 a CROSS JOIN lv1 b) -- 16
,lv3 AS (SELECT 0 g FROM lv2 a CROSS JOIN lv2 b) -- 256
,lv4 AS (SELECT 0 g FROM lv3 a CROSS JOIN lv3 b) -- 65,536
,lv5 AS (SELECT 0 g FROM lv4 a CROSS JOIN lv4 b) -- 4,294,967,296
,Tally (n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM lv5)
SELECT src.Col1
, newCol = ( SELECT CASE WHEN SUBSTRING(ct.Col1, t.n, 1) = '0' THEN '$'
ELSE CHAR(ASCII(SUBSTRING(ct.Col1, t.n, 1)) + 16)
END
FROM @colTable AS ct
CROSS APPLY Tally AS t
WHERE t.n <= len(@Col1)
AND ct.id = src.id
ORDER BY ct.id, t.n
FOR XML PATH('')
)
FROM @colTable AS src
ORDER BY src.id;
March 31, 2016 at 6:10 am
Sergiy (3/30/2016)
Solution is suprisingly simple.1. Create a "translation" table:
CREATE TABLE #Mapping (
FromChar NCHAR(1) PRIMARY KEY,
ToChar NCHAR(1)
)
INSERT INTO #Mapping ( FromChar, ToChar )
SELECT 1, 'A'
UNION ALL
SELECT 2, 'B'
UNION ALL
SELECT 3, 'C'
UNION ALL
SELECT 4, 'D'
UNION ALL
SELECT 5, 'E'
UNION ALL
SELECT 6, 'F'
UNION ALL
SELECT 7, 'G'
UNION ALL
SELECT 8, 'H'
UNION ALL
SELECT 9, 'I'
UNION ALL
SELECT 0, '$'
And now - do the translation itself:
DECLARE @String NVARCHAR(2000)
SET @String = '89083'
UPDATE #Mapping
SET @String = REPLACE (@String, FromChar, ToChar)
SELECT @String
Nice solution, Sergiy. You taught me a new trick!
The execution plan is interesting: 75% on a 'clustered index update'. Of what?
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
March 31, 2016 at 8:25 am
Phil Parkin (3/31/2016)
Sergiy (3/30/2016)
Solution is suprisingly simple.1. Create a "translation" table:
CREATE TABLE #Mapping (
FromChar NCHAR(1) PRIMARY KEY,
ToChar NCHAR(1)
)
INSERT INTO #Mapping ( FromChar, ToChar )
SELECT 1, 'A'
UNION ALL
SELECT 2, 'B'
UNION ALL
SELECT 3, 'C'
UNION ALL
SELECT 4, 'D'
UNION ALL
SELECT 5, 'E'
UNION ALL
SELECT 6, 'F'
UNION ALL
SELECT 7, 'G'
UNION ALL
SELECT 8, 'H'
UNION ALL
SELECT 9, 'I'
UNION ALL
SELECT 0, '$'
And now - do the translation itself:
DECLARE @String NVARCHAR(2000)
SET @String = '89083'
UPDATE #Mapping
SET @String = REPLACE (@String, FromChar, ToChar)
SELECT @String
Nice solution, Sergiy. You taught me a new trick!
The execution plan is interesting: 75% on a 'clustered index update'. Of what?
I know right I was like...
Wait, what...Whoa!
Thanks for giving my brain a kick start this morning Sergiy.
Viewing 15 posts - 1 through 15 (of 43 total)
You must be logged in to reply to this topic. Login to reply