January 6, 2010 at 3:14 pm
Here are the details:
- Source is a 5 GB text file. It is in a "ragged right" format with NO character delimiters.
- Destination is 20 tables.
What is the most efficient approach to importing this large text file? Should I attempt a bulk insert task? Will that work considering that I'm going from one source file to 20 destination tables? If I use the bulk insert task, how do I construct the format file?
January 6, 2010 at 9:25 pm
You could dump it to a work table. Then do any parsing or reformatting and send it to the final tables.
Since it is ragged right, I assume this is a fixed-width file. One thing to watch out for: If a row ends before the start of the last field, the next row will be appended to the end of it. I had that situation at work and defined the file as containing one long field. Then derived the individual fields. It was reasonably fast.
January 6, 2010 at 9:46 pm
You say it has no delimiters... but does each row end with something like a carriage return or line feed?
--Jeff Moden
Change is inevitable... Change for the better is not.
January 7, 2010 at 3:07 pm
Yes, I believe each line does end with a CR LF.
January 7, 2010 at 3:19 pm
If it's a fixed field (no delimiters, each column a given width), then I just import into a single column flat table and do a simple substring split. It's one of the easiest and fastest methods.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 7, 2010 at 3:23 pm
Here's the problem: we are doing a proof of concept comparing SSIS and Informatica. In order for there to be a "fair fight," I can't do any interim steps in any type of SQL Server table.
Is there any kind of interim holding place within SSIS, instead of using a table?
January 7, 2010 at 3:24 pm
Jeff Moden (1/7/2010)
If it's a fixed field (no delimiters, each column a given width), then I just import into a single column flat table and do a simple substring split. It's one of the easiest and fastest methods.
i typically do the same as Jeff suggested; only difference is i create a view that parses out the substrings into the columns; also changing datatypes with convert, and joining it to my biz tables so i can insert complete with necessary keys.
that makes it a little more reusable for me in my imports, since the files are the same week after week, but with different data.
Lowell
January 7, 2010 at 3:29 pm
imani_technology (1/7/2010)
Here's the problem: we are doing a proof of concept comparing SSIS and Informatica. In order for there to be a "fair fight," I can't do any interim steps in any type of SQL Server table.Is there any kind of interim holding place within SSIS, instead of using a table?
You should include the method I stated as one of the contenders instead of just a feeder. And, no, it won't be a fair fight.... 😛
--Jeff Moden
Change is inevitable... Change for the better is not.
January 7, 2010 at 3:35 pm
Okay, but what if we weren't using a SQL Server database? What if we were using SSIS to import a text file into Vertica, for example? Or Sybase IQ? My boss and clients want a data-migration solution that could potentially be SQL Server database engine independent.
January 7, 2010 at 6:51 pm
imani_technology (1/7/2010)
Okay, but what if we weren't using a SQL Server database? What if we were using SSIS to import a text file into Vertica, for example? Or Sybase IQ? My boss and clients want a data-migration solution that could potentially be SQL Server database engine independent.
In truth, there's no such thing. At the very least, you'll need a different connection string. Something somewhere will need to change and if not at the connection string level, then at the API and driver levels so an import actually stands a chance of finishing sometime this year.
There are some dandy file readers out there that are already developed and tested. Instead of trying to reinvent the wheel in SSIS (it's not a panacea anymore than Business Objects is), why not just spend a couple of bucks on a shrink wrapped solution?
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply