September 14, 2017 at 3:41 am
Joined this new company which does ETL using c# code instead of SSIS. They said they did this because it's easy to do source control with code instead of SSIS packages (xml files) because the xml file can change easily even you just open and save the package without changing anything. I think they are right from that perspective because they are a bunch of programmers. From my experience, we (BI developers) don't care too much about souce control. Normally just one person work on a package, so there will be no conflict of changes. And I think that's quite rarely for two people working on a same package at the same time?
Anyway, neither approach is better than the other I think. We are two groups of people, programmers vs BI people. We BI people use tools (SSIS), and almost never work on a same piece of code at the same time. Have you experienced something similar?
September 14, 2017 at 4:40 am
Source control is important so you can rollback to a previous version.
c# pro:
no license of SSIS needed
flexibility
c# contra:
requires c# skills
no visual flow control
needs a system to safely store passwords
SSIS pro:
minimal coding needed (most is handled through the gui)
visualized flow control
SSIS catalog where parameters and passwords can be stored safely
Lots of sources
SSIS :
Parallel development hampered by mergability of the xml
SSIS can be quite picky about changed metadata (excel), sometimes failing outright because validation failed
Oddities on the error output (you can add a column on input, output but not on the error flow?)
Backwardscompatiblity between sql server versions
September 14, 2017 at 7:17 am
Given my own lead, I'd avoid both methods and do it all in T-SQL but the C# method doesn't sound bad. I just like to avoid compiled code for such things and I like to avoid SSIS for some of the reasons given.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 14, 2017 at 7:45 am
A free open-source tool called XML Notepad can be used to format or compare XML documents. Also, BIDS Helper's Smart Diff feature. The thing about compiled C# executables is that it becomes more difficult to examine the contents of what's been deployed, and ETL is one of those things that require more operational support and post-deployment configuration than a regular C# application. As a compromise between SSIS and compiled C#, some folks are also coding their ETL in PowerShell scripts. However, as Jeff suggests, ultimately it's the T-SQL steps, the SELECT portion of the Extract and the INSERT of the Load that matters more than what language your team uses to package and duct tape the process together.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
September 14, 2017 at 10:23 am
I've used version control with SSIS packages for many years and never had a problem. There hasn't been too many times I had to go back to a previous version, but it's still a good thing to have.
September 14, 2017 at 11:04 am
I find it is often easier to use C# to define a complex process than having a workflow with many tasks and connections going all over the place. You can actually read the C# code without having to click on all the tasks to examine the properties.
But I will put that C# in an SSIS script task because it then can take advantage of SSIS configuration and logging. I can deploy one copy of the package and easily use it in a SQL Agent job from any other server. I can create an SSIS template with configuration and logging already set up, and pass that around to other developers with a reasonable expectation that they will create packages I can live with.
Granted these are not difficult issues to solve in C# development, especially if you have a disciplined team that can agree on standard methods for logging, configuration, and deployment. Unfortunately I have never been part of that kind of team.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply