I've had this one in my toolbox for a while now, decided to
publish it on CodePlex a while back, and am now getting around to blogging about it. If you've seen
this Connect submission, you've got an idea as to why I've constructed it. Jamie Thomson (
blog|
twitter) and I had a good discussion on the topic there. It helps me solve the issues I regularly face converting batches of columns - usually between Unicode and Non-Unicode. I'd
developed a pattern for converting columns using standard SSIS components, but it still fell short on usability.
If you understand what the Data Conversion component does, you'll have a good start at understanding what the Replacing Data Conversion component does - and perhaps why I had such a hard time coming up with a name for it. I'd like to take this space to explain why I created it - and what kinds of small changes could be made to the stock component in SSIS that could have eliminated that need.
Same-Name, In-Place Conversion
One of the big issues I had with the Data Conversion component was the naming problem. I regularly used the original column - usually named exactly how I wanted it, and placed in the column list exactly where I expected it - even though it was the wrong data type, and I had a converted column available further down the column list, with a different name. That was the primary rationale for my data conversion pattern.
Using the Replacing Data Conversion component fixes that problem. The columns output from the component match those that enter - with respect to column names and positions. The only change in the output flow is the data type of the columns you've converted. You can't possibly make a mistake later on in your flow.
Quick Bulk Conversions
Another problem I frequently had with the Data Conversion component was that I typically had to convert multiple columns in exactly the same fashion. Either I had to convert a bunch of Unicode string (DT_WSTR) columns to Non-unicode (DT_STR), or a bunch of floats (DT_R4) to exact numeric types (DT_NUMERIC). Doing this with the Data Conversion component was a long series of clicks - and even more clicks and typing if I followed my own best practice.
The Replacing Data Conversion component fixes that problem too. It has two "modes" - a simple mode, where you can define a blanket conversion such as "all DT_WSTR to DT_STR", and apply that to your column set. You can filter only columns with specific data types, or leave the bulk conversion wide open to catch all matching types. Quite convenient for those wider tables, I've got to tell you.
The Drawbacks
Of course, like anything I do - rest assured it's not all unicorns and rainbows. Here's what you sacrifice by using this component.
A Little Performance
This component is an asynchronous component. In slightly less technical terms, it doesn't just modify the data as it flies by in the data flow pipeline. It has to copy all of the data in each buffer to a newly formatted buffer. That's going to incur some additional memory I/O and CPU activity that you wouldn't see with a Data Conversion component, as well as change how the optimizer sees the flow. That said, this is not a "worst case" asynchronous component like the Sort. It's not going to hold up all the rows in the flow, just one buffer at a time similar to the Union All component. For myself, the improvement in the design experience is worth it.
An Imperfect Interface
Despite my best intentions, I don't get a lot of time to polish these things off. Despite being somewhat of a perfectionist - especially when exposing my work to the world and building a "brand" - the interface is definitely not as good as I'd want it.
No Quick Installer
For the same reason as above, I don't have the time to construct an installer. And to be honest, even if I did, you'd still have to do some manual work to get it completely installed. (Good news on that front coming up.)
Happy Converting
On the whole, I find it very useful. I hope you do too - and please keep suggestions on improvements coming...