July 23, 2009 at 2:39 am
Hope you can all help,
I use SSIS and am looking for a structured best practice for loading in files (both Excel and flat file sources). Currently I use a package that opens the files, followed by Data Conversion and Derived Column tasks. I am aware that if the structure of the files changes, the package will fail, followed by a lot of hard work to fix it.
What I am looking for is what people have done to get around potential problems. Do they have tables that hold definitions for how to convert the files? Do they use some sort of configuration file to hold file details? Do they use stored procedures because they find them easier to modify?
Many Thanks.
Ells
😎
July 23, 2009 at 2:58 am
Are you asking how to automatically deal with structure changes to source files (e.g. a change in the number of fields, the maximum length of fields, or the data types of fields)?
Or how to be flexible, in the knowledge that these things will change?
The more flexible you want the solution to be, the longer it will take to develop. If your source data isn't changing very often, I would expect it to be uneconomical to try and handle the changes via any method other than updating the package through BIDS.
Not sure quite what sort of answer you are hoping to hear ...
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
July 23, 2009 at 4:38 am
Phil,
thanks for the response. I tried to leave the question as open as possible. I guess I am looking to build in some level of robustness, and also to make it easy to diagnose and correct issues.
My first thought is to load the flat file into a staging table with default column names (Column 0, Column 1, etc.) and default column sizes (nvarchar(50)). The next stage would be a mapping table with columns for the name of the staging table, the name of the column in the staging table, the name of the column in the target table, a flag for whether the column is required, and a column containing the cast required. This would be linked with a stored procedure.
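Something like this is what I have in mind (a rough sketch only; all table and column names are made up for illustration):

```sql
-- Staging table: everything lands as nvarchar so the load itself cannot fail on types.
CREATE TABLE dbo.StgCustomerFile
(
    Column0 nvarchar(50) NULL,
    Column1 nvarchar(50) NULL,
    Column2 nvarchar(50) NULL
);

-- Mapping metadata: one row per staging column, saying where it goes and how to cast it.
CREATE TABLE dbo.ColumnMapping
(
    StagingTable  sysname       NOT NULL,
    StagingColumn sysname       NOT NULL,
    TargetColumn  sysname       NOT NULL,
    IsRequired    bit           NOT NULL,
    CastTo        nvarchar(128) NOT NULL  -- e.g. 'int', 'decimal(10,2)', 'datetime'
);

INSERT INTO dbo.ColumnMapping VALUES
    (N'StgCustomerFile', N'Column0', N'CustomerID',   1, N'int'),
    (N'StgCustomerFile', N'Column1', N'CustomerName', 1, N'nvarchar(100)'),
    (N'StgCustomerFile', N'Column2', N'CreditLimit',  0, N'decimal(10,2)');
```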
Just interested to hear some ideas.
Ells.
😎
July 23, 2009 at 7:55 am
Be careful about trying to develop a package that can handle too many file changes. Such a package might be able to run "successfully" even if the order of the columns gets changed by accident. I say "successfully" meaning that it runs without execution errors. Consider what might happen in this case if the package starts loading data into the wrong columns.
I prefer to develop packages that fail if the structure of the source data changes. This way someone will be alerted about the potential problem and you greatly reduce the risk of producing erroneous results.
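If you want that fail-fast behaviour to be explicit, one simple option (just a sketch; the table and procedure names are made up) is to bulk-load the file's header row into a one-row table and compare it against the layout you expect before loading any data:

```sql
-- One-row table holding the raw header line of the incoming file.
CREATE TABLE dbo.StgFileHeader (HeaderRow nvarchar(4000) NOT NULL);

-- Abort the load (and fail the package) if the header does not match exactly.
CREATE PROCEDURE dbo.CheckFileHeader
    @ExpectedHeader nvarchar(4000)
AS
BEGIN
    IF NOT EXISTS (SELECT 1 FROM dbo.StgFileHeader WHERE HeaderRow = @ExpectedHeader)
        RAISERROR(N'Source file layout has changed; aborting load.', 16, 1);
END;
```

An Execute SQL Task running this procedure at the start of the package will stop the load before any data hits the wrong columns.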
... just my 3 cents worth ...
For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help
July 23, 2009 at 8:30 am
Ok,
so if your best practice is to accept that the packages will fail, do you do anything in your design to make it easier to resolve the problem once you have one?
Sorry about this, I know it's quite a soul-searching question. I may or may not accept that the package failing is good practice, as it enforces some checks. However, should it fail, I want it to be as easy as possible to correct. This is why I am contemplating some stored procedures, as I find diagnosing SQL issues easier in Management Studio than in an SSIS package.
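For example (a rough sketch only, reusing the illustrative mapping table from my earlier post), the move from staging to target could be driven entirely by the metadata, so when a cast fails the exact statement is available to run and debug by hand in Management Studio:

```sql
-- Build and run INSERT ... SELECT from the mapping metadata, casting each staging column.
CREATE PROCEDURE dbo.LoadFromStaging
    @StagingTable sysname,
    @TargetTable  sysname
AS
BEGIN
    DECLARE @cols nvarchar(max), @selects nvarchar(max),
            @sql nvarchar(max), @msg nvarchar(4000);

    -- Assemble the target column list and the cast expressions.
    SELECT @cols    = COALESCE(@cols + N', ', N'') + QUOTENAME(TargetColumn),
           @selects = COALESCE(@selects + N', ', N'')
                    + N'CAST(' + QUOTENAME(StagingColumn) + N' AS ' + CastTo + N')'
    FROM dbo.ColumnMapping
    WHERE StagingTable = @StagingTable;

    SET @sql = N'INSERT INTO ' + QUOTENAME(@TargetTable) + N' (' + @cols + N') '
             + N'SELECT ' + @selects + N' FROM ' + QUOTENAME(@StagingTable) + N';';

    BEGIN TRY
        EXEC sp_executesql @sql;
    END TRY
    BEGIN CATCH
        SELECT @msg = ERROR_MESSAGE();
        PRINT @sql;              -- the failing statement can be copied and run by hand
        RAISERROR(@msg, 16, 1);  -- re-raise so the calling package still fails
    END CATCH
END;
```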
Thanks
Ells.
July 24, 2009 at 9:08 am
I use an error re-direct scheme on all key transforms in a package. In each case, the process directs the offending source data row into an error queue table, along with a description of which transform generated the error row. The error queue is inspected after each run.
I bring all columns in as text, but I do constrain them to the expected input field length. If this fails, then I know that there is some issue with the inbound data.
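For reference, the error queue table is nothing fancy; something along these lines (the names are illustrative):

```sql
-- Error queue: one row per redirected source row, with enough context to trace it back.
CREATE TABLE dbo.LoadErrorQueue
(
    ErrorID       int           IDENTITY(1,1) PRIMARY KEY,
    PackageName   nvarchar(128) NOT NULL,
    TransformName nvarchar(128) NOT NULL,   -- which transform redirected the row
    ErrorCode     int           NULL,       -- SSIS ErrorCode from the error output
    RawRowData    nvarchar(max) NULL,       -- the offending row, kept as text
    LoadedAt      datetime      NOT NULL DEFAULT GETDATE()
);
```

One way to wire this up is to point each transform's error output at this table via an OLE DB Destination, adding the transform name with a Derived Column.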