Convert a string with an array of variables?

  • Sergiy (3/30/2016)


    Solution is suprisingly simple.

    1. Create a "translation" table:

    CREATE TABLE #Mapping (

    FromChar NCHAR(1) PRIMARY KEY,

    ToChar NCHAR(1)

    )

    INSERT INTO #Mapping ( FromChar, ToChar )

    SELECT 1, 'A'

    UNION ALL

    SELECT 2, 'B'

    UNION ALL

    SELECT 3, 'C'

    UNION ALL

    SELECT 4, 'D'

    UNION ALL

    SELECT 5, 'E'

    UNION ALL

    SELECT 6, 'F'

    UNION ALL

    SELECT 7, 'G'

    UNION ALL

    SELECT 8, 'H'

    UNION ALL

    SELECT 9, 'I'

    UNION ALL

    SELECT 0, '$'

    And now - do the translation itself:

    DECLARE @String NVARCHAR(2000)

    SET @String = '89083'

    UPDATE #Mapping

    SET @String = REPLACE (@String, FromChar, ToChar)

    SELECT @String

    Very clever indeed. How would this be applied against a column in a table?

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan.B (3/31/2016)


    Sergiy (3/30/2016)


    Solution is suprisingly simple.

    1. Create a "translation" table:

    CREATE TABLE #Mapping (

    FromChar NCHAR(1) PRIMARY KEY,

    ToChar NCHAR(1)

    )

    INSERT INTO #Mapping ( FromChar, ToChar )

    SELECT 1, 'A'

    UNION ALL

    SELECT 2, 'B'

    UNION ALL

    SELECT 3, 'C'

    UNION ALL

    SELECT 4, 'D'

    UNION ALL

    SELECT 5, 'E'

    UNION ALL

    SELECT 6, 'F'

    UNION ALL

    SELECT 7, 'G'

    UNION ALL

    SELECT 8, 'H'

    UNION ALL

    SELECT 9, 'I'

    UNION ALL

    SELECT 0, '$'

    And now - do the translation itself:

    DECLARE @String NVARCHAR(2000)

    SET @String = '89083'

    UPDATE #Mapping

    SET @String = REPLACE (@String, FromChar, ToChar)

    SELECT @String

    Very clever indeed. How would this be applied against a column in a table?

    It doesn't work when updating a table. The reason for this is that the FROM clause (including the JOIN) creates a virtual table and the transformation is applied to each row of that virtual table (which only includes one match) and then the underlying table row is (logically) updated with each of those separate rows that contain only one transformation (overwriting any previous updates). I think the physical processing is smart enough to figure out that it only needs to apply the "last" update for each row.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Very clever indeed. How would this be applied against a column in a table?

    With some difficulty. I tried a few ideas & failed. Looking forward to seeing someone else crack it.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Try this, using a combination of a tally table and a lookup table.

    SELECT TOP 10000

    IDENTITY(INT,1,1) AS N

    INTO #Tally

    FROM Master.dbo.SysColumns sc1,

    Master.dbo.SysColumns sc2

    ALTER TABLE #Tally ADD CONSTRAINT PK_Tally_N PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100

    declare @tabLookup table(id int identity(0,1),repchar char(1))

    insert into @tabLookup (repchar)

    SELECT '$' union all

    SELECT 'A' UNION ALL

    SELECT 'B' UNION ALL

    SELECT 'C' UNION ALL

    SELECT 'D' UNION ALL

    SELECT 'E' UNION ALL

    SELECT 'F' UNION ALL

    SELECT 'G' UNION ALL

    SELECT 'H' UNION ALL

    SELECT 'I'

    declare @Parameter nvarchar(20)

    set @Parameter = '0123456789'

    declare @NewParameter nvarchar(20)

    set @NewParameter = ''

    ;with Converted as

    (

    select t.N [Position],SUBSTRING(@Parameter,t.N,1) [Original],tl.repchar [Converted]

    FROM #Tally t

    join @tabLookup tl on tl.id = SUBSTRING(@Parameter,t.N,1)

    WHERE t.N <= LEN(@Parameter)

    --ORDER BY t.N

    )

    SELECT STUFF((SELECT '' + Converted

    FROM Converted order by position

    FOR XML PATH('')) ,1,0,'') AS Converted

    drop table #Tally

  • Sergiy (3/30/2016)


    Solution is suprisingly simple.

    1. Create a "translation" table:

    CREATE TABLE #Mapping (

    FromChar NCHAR(1) PRIMARY KEY,

    ToChar NCHAR(1)

    )

    INSERT INTO #Mapping ( FromChar, ToChar )

    SELECT 1, 'A'

    UNION ALL

    SELECT 2, 'B'

    UNION ALL

    SELECT 3, 'C'

    UNION ALL

    SELECT 4, 'D'

    UNION ALL

    SELECT 5, 'E'

    UNION ALL

    SELECT 6, 'F'

    UNION ALL

    SELECT 7, 'G'

    UNION ALL

    SELECT 8, 'H'

    UNION ALL

    SELECT 9, 'I'

    UNION ALL

    SELECT 0, '$'

    And now - do the translation itself:

    DECLARE @String NVARCHAR(2000)

    SET @String = '89083'

    UPDATE #Mapping

    SET @String = REPLACE (@String, FromChar, ToChar)

    SELECT @String

    It's actually MUCH faster to create a table using the VALUES expression, because most of the cost is writing and reading the temp table to tempdb. If you use this frequently, you might want to create an Inline Table-Valued Function to create the translation table.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Drew,

    I like your solution better. It is short and clean.

    Thanks,

    MC

  • mceventphoto (3/31/2016)


    Drew,

    I like your solution better. It is short and clean.

    Thanks,

    MC

    Did you look at the Translate8K solution?

    DECLARE @RandomString varchar(100) = '0123456789000';

    SELECT dbo.Translate8K(@RandomString,'1234567890','ABCDEFGHI$');

    -- Returns: $ABCDEFGHI$$$

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Phil Parkin (3/31/2016)


    Nice solution, Sergiy. You taught me a new trick!

    The execution plan is interesting: 75% on a 'clustered index update'. Of what?

    You welcome.

    Takes a twisted mind to come up with this. 🙂

    The "clustered index update" is actually the operations defined in the "SET" part if the query.

    Actual writing to the records is the last "UPDATE" part, which is on 0%.

    _____________
    Code for TallyGenerator

  • drew.allen (3/31/2016)


    It's actually MUCH faster to create a table using the VALUES expression, because most of the cost is writing and reading the temp table to tempdb. If you use this frequently, you might want to create an Inline Table-Valued Function to create the translation table.

    Drew

    Are you suggesting hardcoding the mapping data?

    Instead of placing it into a static table updateable from an application?

    Ah-tah-tah! Bad boy, bad boy!

    :hehe:

    _____________
    Code for TallyGenerator

  • Alan.B (3/31/2016)


    How would this be applied against a column in a table?

    It depends.

    Simplest way would be to add a new column, create a scalar function out of the script and apply that function to the column.

    I know, I know - hidden cursor.

    But I'd expect the source data and the mapping not to be changed too often, so the translation need to be done only once for every for new/updated record. Probably through a trigger.

    Should be tolerable.

    _____________
    Code for TallyGenerator

  • drew.allen (3/31/2016)


    It's actually MUCH faster to create a table using the VALUES expression, because most of the cost is writing and reading the temp table to tempdb. If you use this frequently, you might want to create an Inline Table-Valued Function to create the translation table.

    Drew

    That's very questionable.

    I have created 2 scalar functions, one reading mapping from a table, another one having it hardcoded:

    USE tempdb

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    -- =============================================

    -- Author:SF

    -- Create date:

    -- Description:

    -- =============================================

    CREATE FUNCTION NameTranslatorValuesMapping

    (

    @String nvarchar(2000)

    )

    RETURNS nvarchar(2000)

    AS

    BEGIN

    DECLARE @translation nvarchar(2000)

    DECLARE @Mapping TABLE (

    FromChar NCHAR(1) PRIMARY KEY,

    ToChar NCHAR(1)

    )

    INSERT INTO @Mapping ( FromChar, ToChar )

    SELECT FromChar, ToChar FROM dbo.Mapping

    UPDATE @Mapping

    SET @translation = REPLACE (ISNULL(@Translation, @String), FromChar, ToChar)

    RETURN @translation

    END

    GO

    CREATE FUNCTION dbo.NameTranslatorValues

    (

    @String nvarchar(2000)

    )

    RETURNS nvarchar(2000)

    AS

    BEGIN

    DECLARE @translation nvarchar(2000)

    DECLARE @Mapping TABLE (

    FromChar NCHAR(1) PRIMARY KEY,

    ToChar NCHAR(1)

    )

    INSERT INTO @Mapping ( FromChar, ToChar )

    VALUES (1, 'A'), ( 2, 'B'),( 3, 'C'),( 4, 'D'),(5, 'E'),(6, 'F'),(7, 'G'),( 8, 'H'),(9, 'I'),(0, '$')

    UPDATE @Mapping

    SET @translation = REPLACE (ISNULL(@Translation, @String), FromChar, ToChar)

    RETURN @translation

    END

    GO

    Then I ran the test:

    SET statistics IO ON

    SET statistics TIME ON

    SELECT ProductID,product

    FROM dbo.Product

    SELECT ProductID,product

    ,dbo.NameTranslatorValuesMapping ([product]) Translation

    FROM dbo.Product

    SELECT ProductID,product

    ,dbo.NameTranslatorValues([product]) Translation

    FROM dbo.Product

    SET statistics IO OFF

    SET statistics TIME OFF

    GO

    Outcome is quite inconclusive.

    The winner is different from run to run:

    (769 row(s) affected)

    Table 'Product'. Scan count 1, logical reads 20, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 181 ms.

    (769 row(s) affected)

    Table 'Product'. Scan count 1, logical reads 20, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 156 ms, elapsed time = 233 ms.

    (769 row(s) affected)

    Table 'Product'. Scan count 1, logical reads 20, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 125 ms, elapsed time = 288 ms.

    (769 row(s) affected)

    Table 'Product'. Scan count 1, logical reads 20, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 96 ms.

    (769 row(s) affected)

    Table 'Product'. Scan count 1, logical reads 20, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 156 ms, elapsed time = 390 ms.

    (769 row(s) affected)

    Table 'Product'. Scan count 1, logical reads 20, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 156 ms, elapsed time = 255 ms.

    But one thing is certain - there is no clear winner.

    _____________
    Code for TallyGenerator

  • mceventphoto (3/31/2016)


    Drew,

    I like your solution better. It is short and clean.

    Thanks,

    MC

    Which one are you talking about?

    I can't find any solution posted by Drew in this thread.

    _____________
    Code for TallyGenerator

  • Sergiy (3/30/2016)


    DECLARE @String NVARCHAR(2000)

    SET @String = '89083'

    UPDATE #Mapping

    SET @String = REPLACE (@String, FromChar, ToChar)

    SELECT @String

    Sergiy (3/31/2016)


    Phil Parkin (3/31/2016)


    Nice solution, Sergiy. You taught me a new trick!

    The execution plan is interesting: 75% on a 'clustered index update'. Of what?

    You welcome.

    Takes a twisted mind to come up with this. 🙂

    The "clustered index update" is actually the operations defined in the "SET" part if the query.

    Actual writing to the records is the last "UPDATE" part, which is on 0%.

    Yeah, that's pretty twisted, but I have to admit that it's one I haven't tried. I've tried several different approaches to the nested replace problem against a table using a table of original and replacement values, but it always ends up doing an unwanted loop and returning the cross join of the two sets. This is precisely what I was trying to avoid.

    You've given me another direction to go that I hadn't considered before. Thank you!

  • Alan.B (3/30/2016)


    This is the kind of thing that you can do with a Translate function (which T-SQL does not have). I generally don't like scalar UDFs but this guy is faster than anything I could do with an inline table valued function.

    CREATE FUNCTION dbo.Translate8K

    (

    @String varchar(8000),

    @SearchPattern varchar(100),

    @ReplacePattern varchar(100)

    )

    /****************************************************************************************

    Purpose:

    Function takes and input string (@string) and replaces all instances of each each

    character in @string with that it exists in @SearchPattern with the corresponding

    character in @ReplacePattern. For exmample, given the string "abc123abc",

    @SearchPattern "ab" and @ReplacePatern of "XZ". Translate8K will replace each letter "a"

    that exists in "abc123abc" with the letter X. Then every "b" that exists with a "Z".

    This: SELECT dbo.Translate8K('ba!!!ab', 'ab', 'XZ'); will return ZX!!!XZ.

    Parameters:

    @String = varchar(8000); The input string to translate

    @SearchPattern = varchar(100); The string that will be searched for in @String

    @ReplacePattern = varchar(100); All characters in the @SearchPattern are replaced with

    their corresponding character in the @SearchPattern

    Returns: varchar(8000)

    Developer notes:

    1. Scalar user defined functions (udf) generally perform badly and "Inline" scalar UDFs

    generally perform much better. The only way to get this logic into an "Inline scalar

    udf" would be to use a recursive CTE which, for this task, performs very badly. For

    more about "in scalar UDFs" see:

    http://www.sqlservercentral.com/Forums/Topic1338291-203-1.aspx

    2. When @SearchPattern is longer than @ReplacePattern then characters in @SearchPattern

    that have no corresponding characters in @ReplacePattern will be removed. Using the

    above example, if we remove the letter "Z" from like this:

    SELECT dbo.Translate8K('ba!!!ab', 'ab', 'X') -- returns: X!!!X

    3. When @ReplacePattern is longer than @SearchPattern the replacement characters that

    the characters in @ReplacePattern past without a corresponding character in

    @SearchPattern are ignored.

    Usage Examples:

    --===== (1) basic replace characters/remove characters

    -- Replace a with A, c with C, b with x and remove $ and #

    DECLARE @string1 varchar(20)='###$$$aaabbbccc$$$###';

    SELECToriginal = @string, translated = dbo.Translate8K(@string1,'acb#$','ACx');

    --===== (2) Format a phone number

    -- format phone (atomic value)

    DECLARE @string varchar(8000) = '(425) 555-1212';

    SELECT original = @string, Translated = dbo.Translate8K(@string,')( ','-');

    --===== (3) hide phone numbers, retain existing format

    WITH phoneNbrs(n,pn) AS

    (

    SELECT 1, '(425) 555-1212' UNION ALL SELECT 2, '425.555.1212' UNION ALL

    SELECT 3, '425-555-1212' UNION ALL SELECT 4, '4255551212'

    )

    SELECT n, pn AS before, [after] = dbo.Translate8K(pn,x,y)

    FROM phoneNbrs

    CROSS APPLY (VALUES('()-.0123456789','()-.**********')) t(x,y);

    --===== (4) Replace accent characters with normal characters (note the "double translate")

    DECLARE

    @string varchar(100) = 'Thë Quìck Greeñ Fox jumpëd over thë lázy dogs back!',

    @special1 varchar(32) = 'áâãäæèïéìëíîçåñòóôöõàøúüûùýÁÃÄ',

    @normal1 varchar(32) = 'aaaaaeieieiicanoooooaouuuuyAAAAA',

    @special2 varchar(32) = 'ÆÈÏÉÌËÍÎÅÑÒÓÔÖÕÀØÚÜÛÙÝ!',

    @normal2 varchar(32) = 'EIEIEIIANOOOOOAOUUUUY.';

    SELECT

    original = @string,

    newstring =

    dbo.Translate8K(dbo.Translate8K(@string,@special1,@normal1),@special2,@normal2);

    ------------------------------------------------------------------------------

    Revision History:

    Rev 00 - 20150518 Initial Development - Alan Burstein

    ****************************************************************************************/

    RETURNS varchar(8000) WITH SCHEMABINDING AS

    BEGIN

    WITH E1(N) AS

    (

    SELECT 1

    FROM (VALUES

    (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),

    (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),

    (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),

    (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),

    (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),

    (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),

    (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),

    (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),

    (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)

    ) t(N)

    ),

    iTally(N) AS

    (

    SELECT TOP(DATALENGTH(@SearchPattern)) CHECKSUM(ROW_NUMBER() OVER (ORDER BY (SELECT 1)))

    FROM E1 a CROSS JOIN E1 b

    )

    SELECT

    @string = REPLACE

    (

    @string COLLATE Latin1_General_BIN,

    SUBSTRING(@SearchPattern,n,1),

    SUBSTRING(@ReplacePattern,n,1)

    )

    FROM iTally;

    RETURN @string;

    END;

    GO

    Now that it the Oracle TRANSLATE function. Very nice, Alan. Thanks.

    Between this and Sergiy's post, now I can't wait to start playing. 😉

  • Sergiy (3/31/2016)


    Alan.B (3/31/2016)


    How would this be applied against a column in a table?

    It depends.

    Simplest way would be to add a new column, create a scalar function out of the script and apply that function to the column.

    I know, I know - hidden cursor.

    But I'd expect the source data and the mapping not to be changed too often, so the translation need to be done only once for every for new/updated record. Probably through a trigger.

    Should be tolerable.

    That's the method that I was talking about that I use at work and haven't posted here yet. I was going to create the typical million row table to demonstrate the speed of using a persisted computed column with a scalar function to populate the column.

    Glad there's another seriously twisted mind.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 16 through 30 (of 43 total)

You must be logged in to reply to this topic. Login to reply