October 20, 2014 at 10:06 am
Hi Folks
I have a stored proc that contains an update which utilizes a case statement to populate values in a particular column in a table, based on values found in other columns within the same table. The existing update looks like this (object names and values have been changed to protect the innocent):
set target_column =
case
when source_column_1= 'ABC'then 'XYZ'
when source_column_2= '123'then 'PDQ'
when source_column_3= 'DEF'then 'BLAH'
when source_column_3= 'FOO'then 'Fighters'
when source_column_3= 'DOG'then 'CAT'
when source_column_3= 'RED'then 'Blue'
when source_column_3in ('UP','DOWN') then 'Strange'
when source_column_4= 'ICE'then 'Tea'
when source_column_4= 'PHX'then 'Phoenix'
when source_column_4= 'YES'then 'Indeed'
when source_column_4in ('CAR','BOS') then 'Stuff'
end
The powers that be would like to replace this case statement with some sort of table-driven structure, so that the mapping rules defined above can be maintained in the database by the business owner, rather than having it embedded in code and thus requiring developer intervention to perform changes/additions to the rules.
The rules defined in the case statement are in a pre-defined sequence which reflects the order of precedence in which the rules are to be applied (in other words, if a matching value in source_column_1 is found, this trumps a conflicting matching value in source_column_2, etc). A case statement handles this nicely, of course, because the case statement will stop when it finds the first "hit" amongst the WHEN clauses, testing each in the order in which they are coded in the proc logic.
What I'm struggling with is how to replicate this using a lookup table of some sort and joins from the target table to the lookup to replace the above case statement. I'm thinking that I would need a lookup table that has column name/value pairings, with a sequence number on each row that designates the row's placement in the precedence hierarchy. I'd then join to the lookup table somehow based on column names and values and return the match with the lowest sequence number, or something to that effect. Has anyone done something like this before? Is it even worth doing? I can feel myself going down a road of multiple joins to the lookup table aliased multiple times, and having that update statement getting awfully klugey. Or should I instead be trying to convince TPTB that the best approach is to keep the logic embedded in the code and maintain it via service desk requests? (The data is not likely to change more than a few times a year.)
Thanks!
October 20, 2014 at 10:29 am
Wow, good question.
If you were to model this in the database as you suggest, I would think that the best-performing solution would be one which generated dynamic SQL to perform the update in much the same way as you have done. That would be fun to code!
Let's see what others come up with.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
October 20, 2014 at 10:43 am
Quick "skeleton" suggestion
😎
USE tempdb;
GO
;WITH BASE_DATE(BD_ID,BD_FLAG) AS
( SELECT * FROM (VALUES
(1,'ABC' )
,(2,'123' )
,(3,'DEF' )
,(4,'FOO' )
,(5,'DOG' )
,(6,'RED' )
,(7,'UP' )
,(8,'DOWN')
,(9,'ICE' )
,(10,'PHX' )
,(11,'YES' )
,(12,'CAR' )
,(13,'BOS' )
) AS X(BD_ID,BD_FLAG)
)
,TRANSLATE_SET(V_BEFORE,V_AFTER) AS
(SELECT * FROM (VALUES
('ABC','XYZ' )
,('123','PDQ' )
,('DEF','BLAH' )
,('FOO','Fighters' )
,('DOG','CAT' )
,('RED','Blue' )
,('UP' ,'Strange' )
,('DOWN','Strange')
,('ICE','Tea' )
,('PHX','Phoenix' )
,('YES','Indeed' )
,('CAR','Stuff' )
,('BOS','Stuff' )
) AS X(V_BEFORE,V_AFTER)
)
SELECT
BD.BD_ID
,BD.BD_FLAG
,'BECOMES' AS TR_ACTION
,TS.V_AFTER
FROM BASE_DATE BD
LEFT OUTER JOIN TRANSLATE_SET TS
ON BD.BD_FLAG = TS.V_BEFORE;
Results
BD_ID BD_FLAG TR_ACTION V_AFTER
----------- ------- --------- --------
1 ABC BECOMES XYZ
2 123 BECOMES PDQ
3 DEF BECOMES BLAH
4 FOO BECOMES Fighters
5 DOG BECOMES CAT
6 RED BECOMES Blue
7 UP BECOMES Strange
8 DOWN BECOMES Strange
9 ICE BECOMES Tea
10 PHX BECOMES Phoenix
11 YES BECOMES Indeed
12 CAR BECOMES Stuff
13 BOS BECOMES Stuff
October 20, 2014 at 11:53 am
Thanks for the replies. I forgot to mention one further complication that does occur in the real data: You can see the same value in different source columns, but have that map to a different output value depending on which source column the value is in. For example:
WHEN source_column_1 = 'ABC' then 'XYZ'
WHEN source_column_3 = 'ABC' then '999'
Because of that condition, I don't think that the potential solution from Eirikur will work. (Or I'm just not seeing how to manage that scenario using that methodology.)
October 20, 2014 at 12:18 pm
Quick addition of another key element (SRC_ID)
😎
USE tempdb;
GO
;WITH BASE_DATE(BD_ID,SRC_ID,BD_FLAG) AS
( SELECT BD_ID,SRC_ID,BD_FLAG FROM (VALUES
(1 ,1,'ABC' )
,(2 ,1,'123' )
,(3 ,1,'DEF' )
,(4 ,2,'ABC' )
,(5 ,2,'123' )
,(6 ,2,'DEF' )
,(7 ,3,'ABC' )
,(8 ,3,'123' )
,(9 ,3,'DEF' )
,(10,4,'ABC' )
,(11,4,'123' )
,(12,4,'DEF' )
) AS X(BD_ID,SRC_ID,BD_FLAG)
)
,TRANSLATE_SET(SRC_ID,V_BEFORE,V_AFTER) AS
(SELECT * FROM (VALUES
(1,'ABC','XYZ' )
,(1,'123','PDQ' )
,(1,'DEF','BLAH' )
,(2,'ABC','Fighters' )
,(2,'123','CAT' )
,(2,'DEF','Blue' )
,(3,'ABC','Strange' )
,(3,'123','Strange')
,(3,'DEF','Tea' )
,(4,'ABC','Phoenix' )
,(4,'123','Indeed' )
,(4,'DEF','Stuff' )
) AS X(SRC_ID,V_BEFORE,V_AFTER)
)
SELECT
BD.BD_ID
,BD.SRC_ID
,BD.BD_FLAG
,'BECOMES' AS TR_ACTION
,TS.V_AFTER
FROM BASE_DATE BD
INNER JOIN TRANSLATE_SET TS
ON BD.BD_FLAG = TS.V_BEFORE
WHERE BD.SRC_ID = TS.SRC_ID ;
Results
BD_ID SRC_ID BD_FLAG TR_ACTION V_AFTER
----------- ----------- ------- --------- --------
1 1 ABC BECOMES XYZ
2 1 123 BECOMES PDQ
3 1 DEF BECOMES BLAH
4 2 ABC BECOMES Fighters
5 2 123 BECOMES CAT
6 2 DEF BECOMES Blue
7 3 ABC BECOMES Strange
8 3 123 BECOMES Strange
9 3 DEF BECOMES Tea
10 4 ABC BECOMES Phoenix
11 4 123 BECOMES Indeed
12 4 DEF BECOMES Stuff
October 20, 2014 at 12:54 pm
Thanks again, Eirikur. I've been thinking along similar lines as far as how the table structures would look. The fun part (as Phil suggested) is figuring out how to enforce the hierarchy in this setup. A row in the target table might contain data that meets multiple mapping rules, and I want to be sure that I always get the appropriate match based on the hierarchy established in the business rules. I think I'd either have to perform multiple updates (one update statement for each source column value, not the optimal solution in my opinion), or use dynamic SQL to build a case statement-based update that looks like what I started out with, but using this table as a driver. I'd probably add a sequence number to the table to use as an ordering mechanism when building the dynamic sql. Does that make sense or is there a simpler solution that I'm not seeing here?
October 20, 2014 at 2:32 pm
Exactly what I had in mind.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
October 20, 2014 at 2:34 pm
So here's what I came up with (thanks for the nudge in the right direction, all). I think this will do the trick.
Lookup table:
CREATE TABLE [dbo].[Target_Value_Map](
[map_rule_sequence_no] [int] NOT NULL,
[target_column_name] [varchar](50) NOT NULL,
[source_column_name] [varchar](50) NOT NULL,
[source_column_value] [varchar](50) NOT NULL,
[target_column_value] [varchar](50) NOT NULL,
CONSTRAINT [PK_Target_Value_Map] PRIMARY KEY CLUSTERED
(
[map_rule_sequence_no] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Dynamic SQL to generate Update script with CASE statement, driven by lookup table:
declare @sql varchar(1000)
declare @source_column_name varchar(50)
declare @source_column_value varchar(50)
declare @target_column_value varchar(50)
set @sql = 'update target_table set target_column = case '
declare dynamicSQLBuild_CSR cursor for
select source_column_name,source_column_value,target_column_value
from dbo.Target_Value_Map
where target_column_name = 'target_column'
order by map_rule_sequence_no
open dynamicSQLBuild_CSR
fetch next from dynamicSQLBuild_CSR into @source_column_name,@source_column_value,@target_column_value
WHILE @@FETCH_STATUS = 0
BEGIN
set @sql = @sql + 'when ' + @source_column_name + ' = ''' + @source_column_value + ''' then ''' + @target_column_value + '''' + ' '
fetch next from dynamicSQLBuild_CSR into @source_column_name,@source_column_value,@target_column_value
END
close dynamicSQLBuild_CSR
deallocate dynamicSQLBuild_CSR
select @sql
The output of the script looks like this:
update dbo.target_table
set NC_Segment =
case
when source_column_1 = 'BLA' then 'BLABLABLA'
when source_column_2 = 'ABC' then '123'
when source_column_3 = 'XYZ' then '987'
when source_column_3 = 'DEF' then 'ZZZ'
when source_column_4 = 'FOO' then 'Fighters'
when source_column_4 = 'BOO' then 'Ghost'
when source_column_4 = 'CAT' then 'Dog'
end
Everything in the dynamic case statement is in the right sequence, which was the key issue in all of this. My plan is to add the code snippet above to my existing stored proc, then call sp_executesql @sql to execute the update. This way the rules can be maintained in the table and the proc code will always work (as long as whoever owns the data knows that they need to maintain the map_rule_sequence_no values properly).
Thoughts? If there's an easier way to do this, I'm all for it.
October 20, 2014 at 4:23 pm
If the controlling column(s) are all from the same table, I suggest a computed column. That leaves the defining logic in only a single place, and allows the calc'd value to be used just like any other column: in SELECT list, WHERE clause, ORDER BY clause, etc..
ALTER dbo.target_table
ADD target_column AS
case
when source_column_1 = 'ABC' then 'XYZ'
when source_column_2 = '123' then 'PDQ'
when source_column_3 = 'DEF' then 'BLAH'
when source_column_3 = 'FOO' then 'Fighters'
when source_column_3 = 'DOG' then 'CAT'
when source_column_3 = 'RED' then 'Blue'
when source_column_3 in ('UP','DOWN') then 'Strange'
when source_column_4 = 'ICE' then 'Tea'
when source_column_4 = 'PHX' then 'Phoenix'
when source_column_4 = 'YES' then 'Indeed'
when source_column_4 in ('CAR','BOS') then 'Stuff'
end
Then, for example:
SELECT target_column, ...
FROM ...
WHERE
target_column IN ('Tea', 'Indeed')
ORDER BY
target_column
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
October 21, 2014 at 12:08 am
ScottPletcher (10/20/2014)
If the controlling column(s) are all from the same table, I suggest a computed column. That leaves the defining logic in only a single place, and allows the calc'd value to be used just like any other column: in SELECT list, WHERE clause, ORDER BY clause, etc..
ALTER dbo.target_table
ADD target_column AS
case
when source_column_1 = 'ABC' then 'XYZ'
when source_column_2 = '123' then 'PDQ'
when source_column_3 = 'DEF' then 'BLAH'
when source_column_3 = 'FOO' then 'Fighters'
when source_column_3 = 'DOG' then 'CAT'
when source_column_3 = 'RED' then 'Blue'
when source_column_3 in ('UP','DOWN') then 'Strange'
when source_column_4 = 'ICE' then 'Tea'
when source_column_4 = 'PHX' then 'Phoenix'
when source_column_4 = 'YES' then 'Indeed'
when source_column_4 in ('CAR','BOS') then 'Stuff'
end
Then, for example:
SELECT target_column, ...
FROM ...
WHERE
target_column IN ('Tea', 'Indeed')
ORDER BY
target_column
Nice idea, but I can't see how such a solution can reasonably be offered to business users, since it requires an ALTER TABLE every time the user wishes to update the meta-data.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
October 21, 2014 at 12:25 am
I strongly belief that dynamic SQL solutions should be avoided if possible, it normally doesn't take much of an additional complexity to render them useless or at least very hard to maintain. Only in some very rare and exceptional cases have I had to use dynamic sql for these purposes, usually around xml schema collections or other items of lesser sql purity;-)
My suggestion is to accurately define the requirements and the logic, afterwards one can piece together a solution.
😎
October 21, 2014 at 12:41 am
Eirikur Eiriksson (10/21/2014)
I strongly belief that dynamic SQL solutions should be avoided if possible, it normally doesn't take much of an additional complexity to render them useless or at least very hard to maintain. Only in some very rare and exceptional cases have I had to use dynamic sql for these purposes, usually around xml schema collections or other items of lesser sql purity;-)My suggestion is to accurately define the requirements and the logic, afterwards one can piece together a solution.
😎
Here's[/url] an example. For this case:
Is dynamic SQL avoidance possible? Yes.
Does dynamic SQL perform better than the alternatives? Yes.
So I would say 'use with caution' rather than 'avoid if possible'.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
October 21, 2014 at 1:11 am
Phil Parkin (10/21/2014)
Eirikur Eiriksson (10/21/2014)
I strongly belief that dynamic SQL solutions should be avoided if possible, it normally doesn't take much of an additional complexity to render them useless or at least very hard to maintain. Only in some very rare and exceptional cases have I had to use dynamic sql for these purposes, usually around xml schema collections or other items of lesser sql purity;-)My suggestion is to accurately define the requirements and the logic, afterwards one can piece together a solution.
😎
Here's[/url] an example. For this case:
Is dynamic SQL avoidance possible? Yes.
Does dynamic SQL perform better than the alternatives? Yes.
So I would say 'use with caution' rather than 'avoid if possible'.
You got a point there!
Think I may have come across this article in the past... not certain though....:hehe:
Handing over the control of logic to "the business" normally saves lots and loads of time, normally justifies a slightly slower execution time.
Purely table driven logic tends to be more robust although somewhat less flexible.
It generally forces a more of an simplification/axiomatization type approach which is clearer and hence more maintainable.
Table driven logic goes hand in hand with a pure set based approach.
😎
October 21, 2014 at 8:57 am
Phil Parkin (10/21/2014)
ScottPletcher (10/20/2014)
If the controlling column(s) are all from the same table, I suggest a computed column. That leaves the defining logic in only a single place, and allows the calc'd value to be used just like any other column: in SELECT list, WHERE clause, ORDER BY clause, etc..
ALTER dbo.target_table
ADD target_column AS
case
when source_column_1 = 'ABC' then 'XYZ'
when source_column_2 = '123' then 'PDQ'
when source_column_3 = 'DEF' then 'BLAH'
when source_column_3 = 'FOO' then 'Fighters'
when source_column_3 = 'DOG' then 'CAT'
when source_column_3 = 'RED' then 'Blue'
when source_column_3 in ('UP','DOWN') then 'Strange'
when source_column_4 = 'ICE' then 'Tea'
when source_column_4 = 'PHX' then 'Phoenix'
when source_column_4 = 'YES' then 'Indeed'
when source_column_4 in ('CAR','BOS') then 'Stuff'
end
Then, for example:
SELECT target_column, ...
FROM ...
WHERE
target_column IN ('Tea', 'Indeed')
ORDER BY
target_column
Nice idea, but I can't see how such a solution can reasonably be offered to business users, since it requires an ALTER TABLE every time the user wishes to update the meta-data.
You could create a proc that the business users call and pass in the definition. That proc could run with enhanced authority on that table.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
October 21, 2014 at 9:16 am
Thanks for all of the responses. Lots of interesting ideas. The computed column approach is probably the most elegant, but it was met with a resounding "Nope" when proposed. The business wants a table-driven solution. So I'm going with the table/dynamic SQL approach (keeping all the caveats in mind). Thanks again to everyone for the replies.
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply