October 13, 2014 at 9:28 am
From a best practices standpoint, when dealing with an ETL process for Database -> Flat file, where should the data formatting occur - in TSQL or in SSIS? At a high level I would believe the ETL processing software should be handling this, but I've yet to find an efficient yet easily maintainable way to do this other than derived columns (and you only get one line to do what could be complicated string functions). It's actually preferably for me to just use a script task and use String.Format function, but from a supportability standpoint I try to use Script task as little as possible.
Just throwing this question out there to see if there is a middle ground between efficient development by Dev DBAs and efficient support by Support DBAs.
October 13, 2014 at 9:40 am
I guess it depends on what kind of formatting you are talking about.
All data should be pulled from SQL using a query. If the formatting can easily be done in the query then I would normally do it there. If the formatting requires complex logic, or time consuming user defined functions, then doing the work in the SSIS data flow might be a better option. Also, sorting should normally be done in the SQL query.
That's my 3 cents worth.
For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]
October 13, 2014 at 10:24 am
Comment deleted. I answered the wrong question! +1 to Alvin's answer.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
October 14, 2014 at 11:24 am
Alvin, I agree, but that makes it difficult to troubleshoot and debug when you might have some transformation in the source SQL query, but other transformations in the SSIS itself. I'm trying to see if there is a way to modularize the transformation into one location.
In the end the best answer I can come up is what you said - it depends :hehe:
October 15, 2014 at 9:08 am
Well - to your piont about troubleshooting - for any given outbound file, I would not be looking to have BOTH involved heavily in the transformation/formatting. To Alvin's earlier point - the SQL query is usually good enough to handle the bread and butter formatting. So a vanilla CSV would be easy either way, and all things being equal I'd probably do that in the query.
If on the other you need to add in custom headers or go to something outside of what T-SQL can do easily, then set up the query to simply serve up the data with NO formatting, and do all outbound formatting and transforms in SSIS.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
October 15, 2014 at 4:05 pm
So pretty much we're stuck with either Script Task or Derived Columns?
October 15, 2014 at 8:05 pm
have you tried the flat file destination? lots of config options in there before you ever need scripting.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
October 16, 2014 at 12:20 am
Gabriel P (10/15/2014)
So pretty much we're stuck with either Script Task or Derived Columns?
I'm not sure how a script task would help you here.
I think you may mean a script component.
Within SSIS, that is probably true.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply