January 16, 2009 at 10:18 am
Here is my problem : I want to do a REAL "Case Statement" in my code (not only a single If).
I want to do (it is a theoritically example) :
CASE WHEN "MyValue" = 1 THEN "First"
WHEN "MyValue" = 2 THEN "Second"
WHEN "MyValue" = 3 THEN "Third"
ELSE "Bah"
END AS MyOutput
You get it? A normal SQL "Case Statement" in SSIS tool. However, my source is a FILE, and I don't want to INSERT for that (and then be able to have an OLE_DB source with a SELECT).
I found 2 ways to have my case statement working :
1-Derived column
Using the not trivial at all "Derived column". I really don't like this because of the complexity for the reader (and the one who want to modify the script). I would be something like :
MyValue==1?"First":MyValue==2?"Second":MyValue==3?"Third":"Bah"))))
It is not normal having a fear of getting more "case". Imagine this statement with 10 CASE WHEN.
2-Script transformation
The other way is using the script Transformation. It is the most "clear to read". To achieve my initial CASE requirement, the code would look like:
Select case(row.MyValue)
case 1
row.MyOutput = "First"
case 2
row.MyOutput = "Second"
case 3
row.MyOutput = "Third"
else
row.MyOutput = "Bah"
End Select
Good! This is the less ugly! However, YOU ARE PLAYING IN .NET ! I have to teach SSIS to ETL beasts (that already know Data Stage or Informatika). They laugh when I say that they will be learning .NET ! Worst, nothing are set to help the programmers to find their way out with .NET. When you open the "script transformation", nothing to help you find "input", "Output" and "commonly used statement".
Am I missing something or there is no easy way to do a poor little "SELECT CASE...WHEN...THEN..." with multiple test in SSIS?
Simon L-Deslauriers
January 16, 2009 at 10:56 am
First, to handle your simple example, you could create a table that had "MyValue" and "MyOutput" as columns and simply use a lookup component. Not only would this be easier than either of your solutions, but it would allow you to handle additional cases without changing your package.
Now, assuming you have a far more complicated CASE, another option would be to use a conditional split. Handle the case in the conditionals and use a simple derived column on each output to add the "MyOutput" and then finally use a UNION ALL to put it all back together. Although this is a lot more components, it is not going to add a lot of overhead and will be really easy to follow in the data flow.
Now, the unfortunate reality for your situation is that ETL developers using SSIS will eventually need to dive into .Net at least at a basic level. The script component is too important to simply try to avoid all of the time. Anyone complaining that it is too hard to learn needs to get a reality check. Part of a developer's job is to learn to use new technology. So, you may really want to use this as an opportunity to get your people on-board - bring in a .Net trainer for a basic 1-day class for everyone.
January 16, 2009 at 11:25 am
You might want to use a Lookup transform.
Instead of putting the logic into a case statement you can create a table with MyValue and MyOutput columns. Then run the output from your file into the lookup, join the two data sets on MyValue and return MyOutput from the reference table.
To handle the “ELSE Bah” condition (when there is no match in the reference table) you can set the lookup to ignore errors. Then add a derived column task after the lookup with something like: ISNULL(MyOutput) ? “Bah”: MyOutput.
This gives you the advantage that when you add or change MyValue conditions you don’t have to touch the code you can just insert/update a reference table.
January 16, 2009 at 11:28 am
January 16, 2009 at 11:30 am
Oops, Michael added his post before I refreshed. That's what happens when work gets in the way of forum posting! 😀
January 16, 2009 at 11:31 am
Eric Klovning (1/16/2009)
Oops, Michael added his post before I refreshed. That's what happens when work gets in the way of forum posting! 😀
I beat you and John to it today.
January 16, 2009 at 11:40 am
Funny. I didn't see Eric's post for the same refresh reason. I did see Michael's but I just wanted to second the Conditional Split transform.
January 16, 2009 at 12:07 pm
eheh, who's first who's last to post! 😉
So, for the SSIS community here, should we say to everyone to use the conditionnal split for a CASE Statement?? Is it the best way?
SELECT CASE with conditionnal split:
Can you be more precise on how to to SELECT CASE with conditionnal split (+derived columsn for each output +union all)?
Sincerly, I think is is a little bit too much component for a soo little requirement. Correct me if i am wrong : If i have 10 case statement (for one input columns), I will have 10 "derived columns" re-united again in an Union all?
What if i have 3 sources columns being "select cased" + 1 lookup?
SELECT CASE using Lookup transfo and some reference table
About the lookup table, it is a good idea. If the clients are ok with creating some "reference table". However, if the customer requirement change slightly in time, the lookup table wont be able to adapt to this :
case when MyInput="1" then ...
when MyInput="2" and MyInput02="ok" then...
when MyInput="2" and MyInput02="hm" then...
when MyInput="2" and MyInput02="no" then...
when myInput="3" then ....
.....
Sorry to be so picky (?) but I really want to create some "best practice" for my student.
In good old SQL, it would fit in ONE little script. In SSIS, it is a complex ETL...daah
Anyway, to get to the bottom of this, I was firstly wondering if I was missing something to achieve a case statement in SSIS. For the moment, the "Script Transformation" is the cleanest and flexible way to do it.
.Net it will be!
Thanks folks!
P.S. Does Microsoft is planning to add some usefull and day-to-day used components like this one in future release of SSIS?
January 16, 2009 at 12:21 pm
If you can use a lookup - use a lookup. Remember, you don't need an actual table for a lookup - your query could simply be a bunch of select statements unioned together:
[font="Courier New"]SELECT 1 AS MyVal, 'Test' AS MyDesc
UNION ALL SELECT 2 AS MyVal, 'Test2' AS MyDesc[/font]
Having a table is nice because you would not have to open the package to add a new case value.
Now, your new example is still possible with a lookup because you can join on two fields (MyInput and Myinput02) to get your value. Where you would have trouble with a lookup is when you have something like:
[font="Courier New"]CASE
WHEN MyInput="1" THEN "A"
WHEN MySecondInput="1" THEN "B"
WHEN MyInput="2" AND MyColor="Red" THEN "C"
ELSE "F"
END[/font]
In these situations, you can get into something that needs to be hard-coded into your package.
Typically, I would recommend going to a script component - especially if the number of CASE items is pretty high. I can see some benefit to using a CONDITIONAL SPLIT - mostly because it puts the logic clearly in the data flow so it is really easy to follow.
January 16, 2009 at 12:23 pm
You could also use multiple derived column components to keep your expression simple. Put each case into it's own derived column and put them in the correct order.
I cannot see MS doing anything else for a CASE statement. This is a case in which there are so many ways to do it already that creating a component specifically for it is probably not worthwhile.
February 11, 2013 at 10:43 am
I had to convert this to ssis:
WHEN CAST(T1.[ORIG_DT] as datetime) >= T.[SETTLE_dt] THEN '1'
WHEN CAST(T1.[ISSUE_DT] as datetime)>= T.[settle_dt] THEN '3'
WHEN CAST(T1.[ISSUE_DT] as datetime) >= T.[Output_dt] AND T1.[ISS_STK] = '1' THEN '4'
ELSE NULL
END AS [offering_type_cd]
I did the following:
1) conditional split
2) 4 derived columns
3) union all
The part that was confusing was that in the "union all" all of the other columns in the dataset showed up. I thought only the derived column (OfferingTypeCode) would appear in the union all.
Also, I wasn't sure how to get NULL as one of the conditions. Currently, in the conditional split the last order I have is condition 1==0 and created Output OfferingTypeCode5. The expression in the derived column OfferingTypeCode5 is NULL(DT_I4). Is this correct?
Thanks!
February 12, 2013 at 6:57 am
Do I need to provide more information? haven't received a reply. Thanks.
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply