Replacing a CASE statement in an update with table-driven logic

  • Hi Folks

    I have a stored proc that contains an update which utilizes a case statement to populate values in a particular column in a table, based on values found in other columns within the same table. The existing update looks like this (object names and values have been changed to protect the innocent):

    UPDATE dbo.target_table

    set target_column =

    case

    when source_column_1= 'ABC'then 'XYZ'

    when source_column_2= '123'then 'PDQ'

    when source_column_3= 'DEF'then 'BLAH'

    when source_column_3= 'FOO'then 'Fighters'

    when source_column_3= 'DOG'then 'CAT'

    when source_column_3= 'RED'then 'Blue'

    when source_column_3in ('UP','DOWN') then 'Strange'

    when source_column_4= 'ICE'then 'Tea'

    when source_column_4= 'PHX'then 'Phoenix'

    when source_column_4= 'YES'then 'Indeed'

    when source_column_4in ('CAR','BOS') then 'Stuff'

    end

    The powers that be would like to replace this case statement with some sort of table-driven structure, so that the mapping rules defined above can be maintained in the database by the business owner, rather than having it embedded in code and thus requiring developer intervention to perform changes/additions to the rules.

    The rules defined in the case statement are in a pre-defined sequence which reflects the order of precedence in which the rules are to be applied (in other words, if a matching value in source_column_1 is found, this trumps a conflicting matching value in source_column_2, etc). A case statement handles this nicely, of course, because the case statement will stop when it finds the first "hit" amongst the WHEN clauses, testing each in the order in which they are coded in the proc logic.

    What I'm struggling with is how to replicate this using a lookup table of some sort and joins from the target table to the lookup to replace the above case statement. I'm thinking that I would need a lookup table that has column name/value pairings, with a sequence number on each row that designates the row's placement in the precedence hierarchy. I'd then join to the lookup table somehow based on column names and values and return the match with the lowest sequence number, or something to that effect. Has anyone done something like this before? Is it even worth doing? I can feel myself going down a road of multiple joins to the lookup table aliased multiple times, and having that update statement getting awfully klugey. Or should I instead be trying to convince TPTB that the best approach is to keep the logic embedded in the code and maintain it via service desk requests? (The data is not likely to change more than a few times a year.)

    Thanks!

  • Wow, good question.

    If you were to model this in the database as you suggest, I would think that the best-performing solution would be one which generated dynamic SQL to perform the update in much the same way as you have done. That would be fun to code!

    Let's see what others come up with.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Quick "skeleton" suggestion

    😎

    USE tempdb;

    GO

    ;WITH BASE_DATE(BD_ID,BD_FLAG) AS

    ( SELECT * FROM (VALUES

    (1,'ABC' )

    ,(2,'123' )

    ,(3,'DEF' )

    ,(4,'FOO' )

    ,(5,'DOG' )

    ,(6,'RED' )

    ,(7,'UP' )

    ,(8,'DOWN')

    ,(9,'ICE' )

    ,(10,'PHX' )

    ,(11,'YES' )

    ,(12,'CAR' )

    ,(13,'BOS' )

    ) AS X(BD_ID,BD_FLAG)

    )

    ,TRANSLATE_SET(V_BEFORE,V_AFTER) AS

    (SELECT * FROM (VALUES

    ('ABC','XYZ' )

    ,('123','PDQ' )

    ,('DEF','BLAH' )

    ,('FOO','Fighters' )

    ,('DOG','CAT' )

    ,('RED','Blue' )

    ,('UP' ,'Strange' )

    ,('DOWN','Strange')

    ,('ICE','Tea' )

    ,('PHX','Phoenix' )

    ,('YES','Indeed' )

    ,('CAR','Stuff' )

    ,('BOS','Stuff' )

    ) AS X(V_BEFORE,V_AFTER)

    )

    SELECT

    BD.BD_ID

    ,BD.BD_FLAG

    ,'BECOMES' AS TR_ACTION

    ,TS.V_AFTER

    FROM BASE_DATE BD

    LEFT OUTER JOIN TRANSLATE_SET TS

    ON BD.BD_FLAG = TS.V_BEFORE;

    Results

    BD_ID BD_FLAG TR_ACTION V_AFTER

    ----------- ------- --------- --------

    1 ABC BECOMES XYZ

    2 123 BECOMES PDQ

    3 DEF BECOMES BLAH

    4 FOO BECOMES Fighters

    5 DOG BECOMES CAT

    6 RED BECOMES Blue

    7 UP BECOMES Strange

    8 DOWN BECOMES Strange

    9 ICE BECOMES Tea

    10 PHX BECOMES Phoenix

    11 YES BECOMES Indeed

    12 CAR BECOMES Stuff

    13 BOS BECOMES Stuff

  • Thanks for the replies. I forgot to mention one further complication that does occur in the real data: You can see the same value in different source columns, but have that map to a different output value depending on which source column the value is in. For example:

    WHEN source_column_1 = 'ABC' then 'XYZ'

    WHEN source_column_3 = 'ABC' then '999'

    Because of that condition, I don't think that the potential solution from Eirikur will work. (Or I'm just not seeing how to manage that scenario using that methodology.)

  • Quick addition of another key element (SRC_ID)

    😎

    USE tempdb;

    GO

    ;WITH BASE_DATE(BD_ID,SRC_ID,BD_FLAG) AS

    ( SELECT BD_ID,SRC_ID,BD_FLAG FROM (VALUES

    (1 ,1,'ABC' )

    ,(2 ,1,'123' )

    ,(3 ,1,'DEF' )

    ,(4 ,2,'ABC' )

    ,(5 ,2,'123' )

    ,(6 ,2,'DEF' )

    ,(7 ,3,'ABC' )

    ,(8 ,3,'123' )

    ,(9 ,3,'DEF' )

    ,(10,4,'ABC' )

    ,(11,4,'123' )

    ,(12,4,'DEF' )

    ) AS X(BD_ID,SRC_ID,BD_FLAG)

    )

    ,TRANSLATE_SET(SRC_ID,V_BEFORE,V_AFTER) AS

    (SELECT * FROM (VALUES

    (1,'ABC','XYZ' )

    ,(1,'123','PDQ' )

    ,(1,'DEF','BLAH' )

    ,(2,'ABC','Fighters' )

    ,(2,'123','CAT' )

    ,(2,'DEF','Blue' )

    ,(3,'ABC','Strange' )

    ,(3,'123','Strange')

    ,(3,'DEF','Tea' )

    ,(4,'ABC','Phoenix' )

    ,(4,'123','Indeed' )

    ,(4,'DEF','Stuff' )

    ) AS X(SRC_ID,V_BEFORE,V_AFTER)

    )

    SELECT

    BD.BD_ID

    ,BD.SRC_ID

    ,BD.BD_FLAG

    ,'BECOMES' AS TR_ACTION

    ,TS.V_AFTER

    FROM BASE_DATE BD

    INNER JOIN TRANSLATE_SET TS

    ON BD.BD_FLAG = TS.V_BEFORE

    WHERE BD.SRC_ID = TS.SRC_ID ;

    Results

    BD_ID SRC_ID BD_FLAG TR_ACTION V_AFTER

    ----------- ----------- ------- --------- --------

    1 1 ABC BECOMES XYZ

    2 1 123 BECOMES PDQ

    3 1 DEF BECOMES BLAH

    4 2 ABC BECOMES Fighters

    5 2 123 BECOMES CAT

    6 2 DEF BECOMES Blue

    7 3 ABC BECOMES Strange

    8 3 123 BECOMES Strange

    9 3 DEF BECOMES Tea

    10 4 ABC BECOMES Phoenix

    11 4 123 BECOMES Indeed

    12 4 DEF BECOMES Stuff

  • Thanks again, Eirikur. I've been thinking along similar lines as far as how the table structures would look. The fun part (as Phil suggested) is figuring out how to enforce the hierarchy in this setup. A row in the target table might contain data that meets multiple mapping rules, and I want to be sure that I always get the appropriate match based on the hierarchy established in the business rules. I think I'd either have to perform multiple updates (one update statement for each source column value, not the optimal solution in my opinion), or use dynamic SQL to build a case statement-based update that looks like what I started out with, but using this table as a driver. I'd probably add a sequence number to the table to use as an ordering mechanism when building the dynamic sql. Does that make sense or is there a simpler solution that I'm not seeing here?

  • Exactly what I had in mind.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • So here's what I came up with (thanks for the nudge in the right direction, all). I think this will do the trick.

    Lookup table:

    CREATE TABLE [dbo].[Target_Value_Map](

    [map_rule_sequence_no] [int] NOT NULL,

    [target_column_name] [varchar](50) NOT NULL,

    [source_column_name] [varchar](50) NOT NULL,

    [source_column_value] [varchar](50) NOT NULL,

    [target_column_value] [varchar](50) NOT NULL,

    CONSTRAINT [PK_Target_Value_Map] PRIMARY KEY CLUSTERED

    (

    [map_rule_sequence_no] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    Dynamic SQL to generate Update script with CASE statement, driven by lookup table:

    declare @sql varchar(1000)

    declare @source_column_name varchar(50)

    declare @source_column_value varchar(50)

    declare @target_column_value varchar(50)

    set @sql = 'update target_table set target_column = case '

    declare dynamicSQLBuild_CSR cursor for

    select source_column_name,source_column_value,target_column_value

    from dbo.Target_Value_Map

    where target_column_name = 'target_column'

    order by map_rule_sequence_no

    open dynamicSQLBuild_CSR

    fetch next from dynamicSQLBuild_CSR into @source_column_name,@source_column_value,@target_column_value

    WHILE @@FETCH_STATUS = 0

    BEGIN

    set @sql = @sql + 'when ' + @source_column_name + ' = ''' + @source_column_value + ''' then ''' + @target_column_value + '''' + ' '

    fetch next from dynamicSQLBuild_CSR into @source_column_name,@source_column_value,@target_column_value

    END

    close dynamicSQLBuild_CSR

    deallocate dynamicSQLBuild_CSR

    set @sql = @sql + 'end'

    select @sql

    The output of the script looks like this:

    update dbo.target_table

    set NC_Segment =

    case

    when source_column_1 = 'BLA' then 'BLABLABLA'

    when source_column_2 = 'ABC' then '123'

    when source_column_3 = 'XYZ' then '987'

    when source_column_3 = 'DEF' then 'ZZZ'

    when source_column_4 = 'FOO' then 'Fighters'

    when source_column_4 = 'BOO' then 'Ghost'

    when source_column_4 = 'CAT' then 'Dog'

    end

    Everything in the dynamic case statement is in the right sequence, which was the key issue in all of this. My plan is to add the code snippet above to my existing stored proc, then call sp_executesql @sql to execute the update. This way the rules can be maintained in the table and the proc code will always work (as long as whoever owns the data knows that they need to maintain the map_rule_sequence_no values properly).

    Thoughts? If there's an easier way to do this, I'm all for it.

  • If the controlling column(s) are all from the same table, I suggest a computed column. That leaves the defining logic in only a single place, and allows the calc'd value to be used just like any other column: in SELECT list, WHERE clause, ORDER BY clause, etc..

    ALTER dbo.target_table

    ADD target_column AS

    case

    when source_column_1 = 'ABC' then 'XYZ'

    when source_column_2 = '123' then 'PDQ'

    when source_column_3 = 'DEF' then 'BLAH'

    when source_column_3 = 'FOO' then 'Fighters'

    when source_column_3 = 'DOG' then 'CAT'

    when source_column_3 = 'RED' then 'Blue'

    when source_column_3 in ('UP','DOWN') then 'Strange'

    when source_column_4 = 'ICE' then 'Tea'

    when source_column_4 = 'PHX' then 'Phoenix'

    when source_column_4 = 'YES' then 'Indeed'

    when source_column_4 in ('CAR','BOS') then 'Stuff'

    end

    Then, for example:

    SELECT target_column, ...

    FROM ...

    WHERE

    target_column IN ('Tea', 'Indeed')

    ORDER BY

    target_column

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher (10/20/2014)


    If the controlling column(s) are all from the same table, I suggest a computed column. That leaves the defining logic in only a single place, and allows the calc'd value to be used just like any other column: in SELECT list, WHERE clause, ORDER BY clause, etc..

    ALTER dbo.target_table

    ADD target_column AS

    case

    when source_column_1 = 'ABC' then 'XYZ'

    when source_column_2 = '123' then 'PDQ'

    when source_column_3 = 'DEF' then 'BLAH'

    when source_column_3 = 'FOO' then 'Fighters'

    when source_column_3 = 'DOG' then 'CAT'

    when source_column_3 = 'RED' then 'Blue'

    when source_column_3 in ('UP','DOWN') then 'Strange'

    when source_column_4 = 'ICE' then 'Tea'

    when source_column_4 = 'PHX' then 'Phoenix'

    when source_column_4 = 'YES' then 'Indeed'

    when source_column_4 in ('CAR','BOS') then 'Stuff'

    end

    Then, for example:

    SELECT target_column, ...

    FROM ...

    WHERE

    target_column IN ('Tea', 'Indeed')

    ORDER BY

    target_column

    Nice idea, but I can't see how such a solution can reasonably be offered to business users, since it requires an ALTER TABLE every time the user wishes to update the meta-data.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • I strongly belief that dynamic SQL solutions should be avoided if possible, it normally doesn't take much of an additional complexity to render them useless or at least very hard to maintain. Only in some very rare and exceptional cases have I had to use dynamic sql for these purposes, usually around xml schema collections or other items of lesser sql purity;-)

    My suggestion is to accurately define the requirements and the logic, afterwards one can piece together a solution.

    😎

  • Eirikur Eiriksson (10/21/2014)


    I strongly belief that dynamic SQL solutions should be avoided if possible, it normally doesn't take much of an additional complexity to render them useless or at least very hard to maintain. Only in some very rare and exceptional cases have I had to use dynamic sql for these purposes, usually around xml schema collections or other items of lesser sql purity;-)

    My suggestion is to accurately define the requirements and the logic, afterwards one can piece together a solution.

    😎

    Here's[/url] an example. For this case:

    Is dynamic SQL avoidance possible? Yes.

    Does dynamic SQL perform better than the alternatives? Yes.

    So I would say 'use with caution' rather than 'avoid if possible'.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Phil Parkin (10/21/2014)


    Eirikur Eiriksson (10/21/2014)


    I strongly belief that dynamic SQL solutions should be avoided if possible, it normally doesn't take much of an additional complexity to render them useless or at least very hard to maintain. Only in some very rare and exceptional cases have I had to use dynamic sql for these purposes, usually around xml schema collections or other items of lesser sql purity;-)

    My suggestion is to accurately define the requirements and the logic, afterwards one can piece together a solution.

    😎

    Here's[/url] an example. For this case:

    Is dynamic SQL avoidance possible? Yes.

    Does dynamic SQL perform better than the alternatives? Yes.

    So I would say 'use with caution' rather than 'avoid if possible'.

    You got a point there!

    Think I may have come across this article in the past... not certain though....:hehe:

    Handing over the control of logic to "the business" normally saves lots and loads of time, normally justifies a slightly slower execution time.

    Purely table driven logic tends to be more robust although somewhat less flexible.

    It generally forces a more of an simplification/axiomatization type approach which is clearer and hence more maintainable.

    Table driven logic goes hand in hand with a pure set based approach.

    😎

  • Phil Parkin (10/21/2014)


    ScottPletcher (10/20/2014)


    If the controlling column(s) are all from the same table, I suggest a computed column. That leaves the defining logic in only a single place, and allows the calc'd value to be used just like any other column: in SELECT list, WHERE clause, ORDER BY clause, etc..

    ALTER dbo.target_table

    ADD target_column AS

    case

    when source_column_1 = 'ABC' then 'XYZ'

    when source_column_2 = '123' then 'PDQ'

    when source_column_3 = 'DEF' then 'BLAH'

    when source_column_3 = 'FOO' then 'Fighters'

    when source_column_3 = 'DOG' then 'CAT'

    when source_column_3 = 'RED' then 'Blue'

    when source_column_3 in ('UP','DOWN') then 'Strange'

    when source_column_4 = 'ICE' then 'Tea'

    when source_column_4 = 'PHX' then 'Phoenix'

    when source_column_4 = 'YES' then 'Indeed'

    when source_column_4 in ('CAR','BOS') then 'Stuff'

    end

    Then, for example:

    SELECT target_column, ...

    FROM ...

    WHERE

    target_column IN ('Tea', 'Indeed')

    ORDER BY

    target_column

    Nice idea, but I can't see how such a solution can reasonably be offered to business users, since it requires an ALTER TABLE every time the user wishes to update the meta-data.

    You could create a proc that the business users call and pass in the definition. That proc could run with enhanced authority on that table.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Thanks for all of the responses. Lots of interesting ideas. The computed column approach is probably the most elegant, but it was met with a resounding "Nope" when proposed. The business wants a table-driven solution. So I'm going with the table/dynamic SQL approach (keeping all the caveats in mind). Thanks again to everyone for the replies.

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply