December 29, 2011 at 2:23 pm
I have a need to detect if an excel file is missing an expected column before attempting to load the data from the file.
Presently, if the ssis package attempts to load an excel file where a column is missing, then the package fails. I attempted to use a script task at the beginning to test if an excel file connection is successful (see Jamie Thomson's article: http://consultingblogs.emc.com/jamiethomson/archive/2005/10/10/SSIS-Nugget_3A00_-Verify-a-data-source-before-using-it.aspx), but this technique doesn't yield a failure in my situtation as I had hoped so that the data flow could be redirected rather than continuing on with attempting to load the excel data... and ultimately failing.
My plan was to have two different excel file connections (one connection with all required columns and the other connection accounting for the missing column). Some sort of script task would assess if the excel file has all columns (or not), and then direct data flow to the appropriate sequence container to load the data.
So, far, I've had zero success coming up with a dynamic solution.
Surely, others of you have had to face this ugly situation. What works? (FYI -- scripting is not my strong point)
Many thanks in advance!
--Pete
December 30, 2011 at 8:40 am
This might be something you've already tried, but if not, why don't you try this:
Have two data flow tasks. One uses the excel source that has all the columns, and the other uses the excel source that is missing a column. On the excel source, set DelayValidation to true so that you don't have it fail before it starts.
Connect the data flow which is using all the columns to your next task on success, and on failure, connect it to the other data flow task which is using the excel source that is missing the column. Connect that one to the next task on success. Set the on success connection for both, to be using an OR operator instead of an AND.
December 30, 2011 at 9:22 am
I tried something similar to your suggestion using two different excel file connections, but when the initial excel data connection (with all expected columns) comes across a file that's missing a column, the connection fails, which then triggers the package to fail. Even changing the DFT's OnError event to not propagate the error to higher levels doesn't stop the package from failing as I had hoped according to the following article (http://datachix.com/2011/02/08/messing-with-errors-in-ssis-or-does-the-propagate-property-really-exist/[/url])
As I've done loads more internet searching today, it seems that quite a bit of custom script task coding is required. All in all, it seems like a heck of a lot of work for compensating for a missing column of data in Excel. May be easier to handle the one-offs as they come up from time to time, instead... i.e., send an email to the guy who gets sloppy with supplying the files to us.
December 30, 2011 at 9:33 am
Another perhaps clunky solution:
Have two completely separate DTSX files. One will have your excel source with all columns, and the other has your excel source with the missing column.
Have your SSIS project contain two execute package tasks. The first one links to the second one on failure.
If even that doesn't work, then you could go out one step further to your SQL Server Agent task, and have the first step execute the solution with all the columns, and on failure goes to the second step which executes the package containing the excel connection missing the column.
December 30, 2011 at 9:38 am
kramaswamy
Thanks for the follow-up suggestions. I hadn't yet investigated cascading packages, but may be worth trying.
thanks,
pete
January 8, 2012 at 12:59 pm
Hi Pete,
If you can use third-party solutions, check the commercial CozyRoc Data Flow Task Plus. It is an extension of the standard Data Flow Task , with support for dynamic columns at runtime. The task can handle the situation you have described and programming skills are required.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply