December 29, 2010 at 1:16 pm
I dislike Integration Services. Maybe it is just the fact I am a programmer, but the product appears ridiculously over-engineered. It also lacks true programmatic hooks, as far as I can tell.
Basic architectural issues with Integration Services I've had:
* Often we'll have databases with different names for things, but given an ontological mapping, I can tell you that column XYZ in database J is conceptually the same as column ABC from database K. SSIS appears to have no way to express this. Usually the graphics B.S. just makes it harder to view these important details and reason about that.
* We create "software factories" and "software product lines" in order to provide customized software to customers while reusing as much front-end and back-end logic as possible. SSIS seems to fight this goal.
* I perceive the graphical bells and whistles to mainly serve the purpose of organizing bureaucratic goals, since business people don't like looking at code but stuff that looks closer to Gantt charts is less likely to flabberghast them. People like the illusory feeling that they are in control of the developers/DBAs.
Lately, I've revisited this topic when somebody asked me why I don't like SSIS. Traditionally, I'd explain those personal problems and also link to Oren Eini's Why I Hate SSIS rant, and even more recently I just link them to the redneck Google search result for why i hate ssis.
Most of the complaints Oren and I have can be distilled down to the horrible graphical front-end and poor debugging facilities. I guess I am just wondering, does anyone ditch the graphical front-end, and, if so, how do you do it?
December 29, 2010 at 1:44 pm
You can use variables to modify connection options like server, table and column names in some SSIS objects. It really depends on what you are trying to achieve.
I read a bunch of that "I Hate SSIS" list you linked to. I disagree with some of the complaints. For example, he complains constantly (several entries) about how SSIS will ask you to confirm that changes you made in a connection string don't mean you want to redo your queries and such for that connection. This can all be avoided by changing one setting in the SSIS package properties. It takes about 1 second to change it, and this is mentioned in the documentation.
He's also wrong about the security needs. I don't know why he thinks SSIS packages need sysadmin rights to run inside jobs scheduled through SQL Agent. They don't. I can't tell what he's doing wrong from the complaint (he doesn't amplify on that point), but simply put, the statement is just plain wrong.
I don't know what UI he's using for SSIS. BIDS doesn't have half the problems he complains about, and BIDS is the default environment for building these. I can't even figure out some his complaints, because I can't reproduce them in BIDS. For example, the complaint about errors in script tasks being "displayed" in hidden pop-ups (pop-unders?). I've never had that happen, and I've been using SSIS since SQL 2005 was in beta. If the script contains a pop-up/modal dialog of some sort, which it generally shouldn't (unless the package is meant to only be run attended, which is uncommon to the point of being useless), it might do that. I've never tried using dialogs in SSIS packages and can't see why anyone would.
He also complains about column transformations. None of the complaints are valid in any version of BIDS I've ever used.
I've also never had his problems with source control. He probably has that misconfigured in some way. I've always used source control for SSIS, and never had any of the problems he complains about.
He complains that SSIS can only use VB.NET as "the only way to write actual code". SSIS can access and use any .NET DLL. Again, he needs to RTFM. Not a problem with the product.
He obviously didn't bother to look into the ability to handle data errors at all. There are examples in BOL (same on MSDN). It's extremely easy. His complaint about finding "the 1 row in 1,000,000" that has a problem usually takes me a few seconds, mainly depending on what I'm trying to do with the data and the power of the machine I'm working with.
I'm by no means a serious SSIS expert, and even I can tell that this guy really just likes to complain and doesn't want to read the documentation when he has a problem. Complaining is more fun, no question about that. It's nice to vent periodically. But it's doesn't solve the problem.
(Sorry if this goes on a bit. His list is quite long and eggregiously erroneous. SSIS is a nice tool, but it's just a tool, and definitely worthy of neither hate nor love.)
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
December 29, 2010 at 4:01 pm
I don't think he is someone who likes to complain. He is considered a very good .NET developer*, but he is a developer. Ergo, my qualification earlier that maybe it is just because I am a developer that I find SSIS ridiculously over-engineered.
Anyway, the point of my original post was to really try to give SSIS a second chance. We do a ton of ETL stuff here, and we do it reasonably efficiently, but I would be a major hero if we could improve that even further.
Let me put this another way. If I could burn everything I know in my brain about SSIS to dust and start over... how should I do it? Bare in mind I want to touch the GUI as little as possible, since I do not find it useful. This much I am pretty much certain of; the GUI is a waste of time.
We've written our own ETL tool suite here in the past, that would do a topological sort on TSQL statements annotated with
/* Updated: List of < Table_Name.Column_Name >
Predicates: List of < Table_Name.Column_Name >
*/
and detect cycles in the update graph automatically, forcing the ETL engineer to re-think what gets done when. We would even automatically display the dependency graph using graphviz[/url] (open source, free graphing tool).
This really isn't meant to be an adversarial question. I am always looking to learn. But a serious blocking point is that SSIS seems engineered to be a waste of time / full-time job just to maintain. I don't think I've gotten this perception due to being clueless.
What was the ramp-up time for you to use SSIS effectively?
* Oren created NHibernate, the competitor to Entity Framework
December 29, 2010 at 4:45 pm
GSquared (12/29/2010)
I don't know what UI he's using for SSIS. BIDS doesn't have half the problems he complains about, and BIDS is the default environment for building these. I can't even figure out some his complaints, because I can't reproduce them in BIDS.
Just to clarify, he is using "VS" for Visual Studio. "BIDS" (Business Intelligent Development Studio) is Visual Studio! BIDS just uses the Visual Studio shell and doesn't have a C#, C++, etc. compiler by default. So when he says "VS" he is really saying BIDS. He is just saying VS because he is a developer, not a DBA, so he thinks of the product as VS.
SSMS is also basically Visual Studio.
December 30, 2010 at 7:20 am
johnzabroski (12/29/2010)
GSquared (12/29/2010)
I don't know what UI he's using for SSIS. BIDS doesn't have half the problems he complains about, and BIDS is the default environment for building these. I can't even figure out some his complaints, because I can't reproduce them in BIDS.Just to clarify, he is using "VS" for Visual Studio. "BIDS" (Business Intelligent Development Studio) is Visual Studio! BIDS just uses the Visual Studio shell and doesn't have a C#, C++, etc. compiler by default. So when he says "VS" he is really saying BIDS. He is just saying VS because he is a developer, not a DBA, so he thinks of the product as VS.
SSMS is also basically Visual Studio.
Yes, BIDS is a skin on Visual Studio. But this guy is having problems with it that don't exist and never have, in a standard installation. So, either he has third-party add-ins in it and they are causing the problems, or he's using something else, or he's having self-inflicted problems and blaming the tool. No way to know which from his rant, since he doesn't say.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
December 30, 2010 at 7:31 am
In reply to what tool to use other than BIDS, in order to avoid the GUI: SSIS packages are just XML files under the hood. You can edit/build/fold/spindle/mutilate them with whatever XML tool you like, including Notepad if you insist. The GUI just builds the XML for you. However, I don't believe the specifications for the XML files are published (as per common Microsoft practice of obscuring everything till they have to open it up or die), so you'd pretty much be reverse-engineering it on your own, with the usual caveats on that.
In reply to how I built up what little skillset I have on SSIS, I just opened up BIDS and, starting from what I wanted to do, I poked around, read BOL, and through a few minutes of trial and error, built a few SSIS packages. Over the years, I've just continued to do that. I've automated whole businesses with SSIS, and building the packages, debugging them, and deploying them, never took more than a few hours. Most take minutes, the really complex ones take a few hours. I've always been happy with that time-span for things that automate whole departments. It seems reasonable to me that I trippled a company's production bandwidth by spending less than an afternoon of work on it. Refactors since then have added up to about another man-hour, over a 3 year period, and the day-to-day operation is done by someone with script-kiddy DBA skills because it rarely requires any human intervention unless someone has done something seriously stupid in user-land.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
December 30, 2010 at 7:55 am
* Oren created NHibernate, the competitor to Entity Framework
[SARCASM ON]
I don't think anyone involved in the NHibernate development should be allowed to write stuff like "I hate SSIS".
[SARCASM OFF]
Honestly, using HNibernate did more harm than good for some of our 3rd party apps. We even consider redeveloping some of them because of some major performance issues.
But is this strictly because of NHibernate? NO! Would I say that HNibernate per se is crap? NO!
Most of the professional software out there is good in what it is designed for. And sometimes you need to learn how to use.
The problem is, if you don't use it the way it is supposed to be used, then you start to have trouble with it. And if you don't want to learn how to use it, it will only get worse.
Sometimes the problem is not behind the screen, but rather in front of it.
As a side note: I'm sure there are some people out there that can make NHibernate to actually provide some benefit. I just haven't met one yet.... 😎
December 30, 2010 at 10:26 am
Have you seen Varigence's BIML product? You can generate complete SSIS packages using XML specification and you can also use .NET based scripting. If you are .NET -oriented developer , this may be useful.
December 30, 2010 at 11:15 am
LutzM (12/30/2010)
* Oren created NHibernate, the competitor to Entity Framework
NHibernate comments...
I am an expert in object-relational mapping and have written my own proprietary OR Mapper used for financial services analysis, and can tell you exactly what kind of code Hibernate and NHibernate generates, and where in the code I think something is stupid. I can also tell you common implementation flaws across all .NET ORMs.
The design of pretty much all ORMs, in my experience, is stupid. I don't like Entity Framework, either. There are many orthogonal problems in ORM design, such as performance optimization, master metadata management, business requirements modeling, query generation, change tracking, and schema definition, etc. These are all 100% orthogonal, and all frameworks conflate these in subtle ways.
But more often than not, people misuse ORMs and the reason for this is that they don't know how to use the tool. The biggest mistake is probably naively auto-generating .NET classes from the database schema. Most ORMs don't manage that strategy well, since they are designed code-first (some people take this further and use a modeling strategy called Domain-Driven Design). The problem with brute force auto-generation is that then the DAL becomes littered with null objects for cases where the data retrieval doesn't join on a particular relationship (e.g., lazy loading).
But the biggest problem with most ORMs in my books is that they don't allow peep-hole and rule-base optimizations orthogonal to the data fetching API. Entity Framework, NHibernate, LLBLGenPro, WilsonORM, etc. all suffer this same cruel fate. That's because ORMs are not designed by programming language designers and compiler writers. In the case of Entity Framework, they're designed by people who read Terry Halpin's Object-Role Modeling books and the class design reflects that. Entity Framework has better query generation than NHibernate, but both have a limit on how well they can tune queries, since tuning a query before it gets to the server requires building a model for hidden models (e.g., what the query optimizer will do) such that the best query plan is always chosen. Other ORMs like Linq 2 SQL do not even correctly annotate metadata, and so contain ambiguities in how the query is specified for execution.
December 30, 2010 at 11:29 am
CozyRoc (12/30/2010)
Have you seen Varigence's BIML product? You can generate complete SSIS packages using XML specification and you can also use .NET based scripting. If you are .NET -oriented developer , this may be useful.
This looks much more like what I have in mind, but I don't think I can pay for it and don't like the extra level of indirection. In addition, the sales video I watched wasted my time with marketing buzzwords and was clearly not targeted for me.
(Initial) Bottom line:
- How do I debug BimlScript?
- What the hell am I looking at when I see certain text in the script?
- It claims to help ease the creation of cubes, but doesn't explain how that works. I don't use tools I'm not smart enough to understand, or the people who wrote it are not capable/smart enough to explain.
December 30, 2010 at 11:41 am
This looks much more like what I have in mind, but I don't think I can pay for it and don't like the extra level of indirection. In addition, the sales video I watched wasted my time with marketing buzzwords and was clearly not targeted for me.
(Initial) Bottom line:
- How do I debug BimlScript?
- What the hell am I looking at when I see certain text in the script?
- It claims to help ease the creation of cubes, but doesn't explain how that works. I don't use tools I'm not smart enough to understand, or the people who wrote it are not capable/smart enough to explain.
Hmm... you have very strange attitude. First you complain that you don't like the "clunky" UI of BIDS and then you complain that something is too hard for you to understand. Actually I believe you are smart enough to understand the tool, based on the analysis you have just posted about the entity mapping frameworks.
I would recommend you contact Varigence people. They will answer all your questions.
December 30, 2010 at 11:47 am
Sorry, I was fishing for an anecdote from you. Sort of like how somebody might say "I don't get normalization" and a million people on here could explain it to them, I am saying "I don't get most of side-benefits Varigence is claiming to be benefits" and hoping maybe you or somebody else could kind of testify...
Sorry again if my attitude was strange. Looking at it, it definitely wasn't worded well.
Edit: I guess another thing would be if Varigence was an open source company and I could freely edit the code. I tend to buy from those companies, although I know they are rarer in the .NET market place, since most .NET vendors are not services-based and are more traditional software vendors (binary license-based).
December 30, 2010 at 12:20 pm
I'm sorry I hijacked the thread with my NHibernate rant...
Seems like it got slightly off topic because of me.
Would you folks mind ranting about SSIS instead of something else in this thread? ;-):-)
December 30, 2010 at 12:22 pm
There is no forum dedicated to pure ETL questions, and this is the closest forum. I also think my questions are very closely related to SSIS, and as such this is the best forum to ask this question.
Update: I sent varigence an email (analysts@varigence.com), and it bounced back.
December 30, 2010 at 12:32 pm
The forum might be a good choice. But you might want to start a new thread.
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply