July 23, 2009 at 2:39 am
Hope you can all help,
I use SSIS and am looking for a structured best practice for loading in files (both Excel and flat file sources). Currently I use a package that opens the files, followed by Data Conversion and Derived Column tasks. I am aware that if the structure of the files changes, the package will fail, followed by a lot of hard work to fix it.
What I am looking for is what people have done to get around potential problems. Do they have tables that hold definitions for how to convert the files? Do they use some sort of configuration file to hold file details? Do they use stored procedures because they find them easier to modify?
Many Thanks.
Ells
😎
July 23, 2009 at 2:58 am
Are you asking how to automatically deal with structure changes to source files (e.g. a change in the number of fields, the maximum length of fields, or the data types of fields)?
Or how to be flexible, in the knowledge that these things will change?
The more flexible you want the solution to be, the longer it will take to develop. If your source data isn't changing very often, I would expect it to be uneconomical to try and handle the changes via any method other than updating the package through BIDS.
Not sure quite what sort of answer you are hoping to hear ...
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
July 23, 2009 at 4:38 am
Phil,
thanks for the response. I tried to leave the question as open as possible. I guess I am looking to build in some level of robustness, and also to make it easy to diagnose and correct issues.
My first thought is to load the flat file into a staging table with default column names (Column 0, Column 1, etc.) and default column sizes (nvarchar(50)). The next stage would be a mapping table with columns for the name of the staging table, the name of the column in the staging table, the name of the column in the target table, a flag for whether the column is required, and a column containing the cast required. This would be linked with a stored procedure.
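Something like this is what I have in mind (a rough sketch only; all table and column names are made up for illustration):

```sql
-- Staging table: everything lands as nvarchar so the load itself cannot fail on types.
CREATE TABLE dbo.StgCustomerFile
(
    Column0 nvarchar(50) NULL,
    Column1 nvarchar(50) NULL,
    Column2 nvarchar(50) NULL
);

-- Mapping metadata: one row per staging column, saying where it goes and how to cast it.
CREATE TABLE dbo.ColumnMapping
(
    StagingTable  sysname       NOT NULL,
    StagingColumn sysname       NOT NULL,
    TargetColumn  sysname       NOT NULL,
    IsRequired    bit           NOT NULL,
    CastTo        nvarchar(128) NOT NULL  -- e.g. 'int', 'decimal(10,2)', 'datetime'
);

INSERT INTO dbo.ColumnMapping VALUES
    (N'StgCustomerFile', N'Column0', N'CustomerID',   1, N'int'),
    (N'StgCustomerFile', N'Column1', N'CustomerName', 1, N'nvarchar(100)'),
    (N'StgCustomerFile', N'Column2', N'CreditLimit',  0, N'decimal(10,2)');
```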
Just interested to hear some ideas.
Ells.
😎
July 23, 2009 at 7:55 am
Be careful about trying to develop a package that can handle too many file changes. Such a package might be able to run "successfully" even if the order of the columns gets changed by accident. I say "successfully" meaning that it runs without execution errors. Consider what might happen in this case if the package starts loading data into the wrong columns.
I prefer to develop packages that fail if the structure of the source data changes. This way someone will be alerted about the potential problem and you greatly reduce the risk of producing erroneous results.
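If you want that fail-fast behaviour to be explicit, one simple option (just a sketch; the table and procedure names are made up) is to bulk-load the file's header row into a one-row table and compare it against the layout you expect before loading any data:

```sql
-- One-row table holding the raw header line of the incoming file.
CREATE TABLE dbo.StgFileHeader (HeaderRow nvarchar(4000) NOT NULL);

-- Abort the load (and fail the package) if the header does not match exactly.
CREATE PROCEDURE dbo.CheckFileHeader
    @ExpectedHeader nvarchar(4000)
AS
BEGIN
    IF NOT EXISTS (SELECT 1 FROM dbo.StgFileHeader WHERE HeaderRow = @ExpectedHeader)
        RAISERROR(N'Source file layout has changed; aborting load.', 16, 1);
END;
```

An Execute SQL Task running this procedure at the start of the package will stop the load before any data hits the wrong columns.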
... just my 3 cents worth ...
For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help
July 23, 2009 at 8:30 am
Ok,
so if your best practice is to accept that the packages will fail, do you do anything in your design to make it easier to resolve the problem once you have one?
Sorry about this, I know it's quite a soul-searching question. I may or may not accept that the package failing is good practice, as it enforces some checks. However, should it fail, I want it to be as easy as possible to correct. This is why I am contemplating some stored procedures, as I find diagnosing SQL issues easier in Management Studio than in an SSIS package.
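For example (a rough sketch only, reusing the illustrative mapping table from my earlier post), the move from staging to target could be driven entirely by the metadata, so when a cast fails the exact statement is available to run and debug by hand in Management Studio:

```sql
-- Build and run INSERT ... SELECT from the mapping metadata, casting each staging column.
CREATE PROCEDURE dbo.LoadFromStaging
    @StagingTable sysname,
    @TargetTable  sysname
AS
BEGIN
    DECLARE @cols nvarchar(max), @selects nvarchar(max),
            @sql nvarchar(max), @msg nvarchar(4000);

    -- Assemble the target column list and the cast expressions.
    SELECT @cols    = COALESCE(@cols + N', ', N'') + QUOTENAME(TargetColumn),
           @selects = COALESCE(@selects + N', ', N'')
                    + N'CAST(' + QUOTENAME(StagingColumn) + N' AS ' + CastTo + N')'
    FROM dbo.ColumnMapping
    WHERE StagingTable = @StagingTable;

    SET @sql = N'INSERT INTO ' + QUOTENAME(@TargetTable) + N' (' + @cols + N') '
             + N'SELECT ' + @selects + N' FROM ' + QUOTENAME(@StagingTable) + N';';

    BEGIN TRY
        EXEC sp_executesql @sql;
    END TRY
    BEGIN CATCH
        SELECT @msg = ERROR_MESSAGE();
        PRINT @sql;              -- the failing statement can be copied and run by hand
        RAISERROR(@msg, 16, 1);  -- re-raise so the calling package still fails
    END CATCH
END;
```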
Thanks
Ells.
July 24, 2009 at 9:08 am
I use an error re-direct scheme on all key transforms in a package. In each case, the process directs the offending source data row into an error queue table, along with a description of which transform generated the error row. The error queue is inspected after each run.
I bring all columns in as text, but I do constrain them to the expected input field length. If this fails, then I know that there is some issue with the inbound data.
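For reference, the error queue table is nothing fancy; something along these lines (the names are illustrative):

```sql
-- Error queue: one row per redirected source row, with enough context to trace it back.
CREATE TABLE dbo.LoadErrorQueue
(
    ErrorID       int           IDENTITY(1,1) PRIMARY KEY,
    PackageName   nvarchar(128) NOT NULL,
    TransformName nvarchar(128) NOT NULL,   -- which transform redirected the row
    ErrorCode     int           NULL,       -- SSIS ErrorCode from the error output
    RawRowData    nvarchar(max) NULL,       -- the offending row, kept as text
    LoadedAt      datetime      NOT NULL DEFAULT GETDATE()
);
```

One way to wire this up is to point each transform's error output at this table via an OLE DB Destination, adding the transform name with a Derived Column.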