December 8, 2010 at 3:12 pm
I have a text file containing millions of rows in a CSV-like format. Ordinarily I'd do a bulk insert but about 1% of the rows are bad (because they violate foreign key constraints). Is there any way to do a bulk insert that only inserts the good rows, and somehow logs the bad rows, so I know how to fix the file? Or do I need to go through the file line-by-line which will take forever?
December 8, 2010 at 3:21 pm
I would look at SSIS. I am by no mean a SSIS guru (novice more like it) but you should be able to use you file as a source with your table as destination and redirect errors to file or table.
December 8, 2010 at 3:42 pm
Have you considered importing the data into a "staging table" , then using one or more T-SQL statements to locate the rows where there is data that "violates the constraint(s)". This could
1. Find the problem rows.
2. Allow you to "correct" the faulty data before further process into the final table.
or
3. Insert the faulty rows into a second table for further analysis
4. Delete the faulty 1% of the data from the staging table so that the majority of the data can be further processed in a timely fashion.
This will allow you to continue your already tested bulk import code.
December 8, 2010 at 3:53 pm
Ron's recommendation is probably your best method. However, if you want to do this in pure SSIS, you're going to be looking at using the MERGE JOIN structure to join in your existing table to find used IDs, another MERGE JOIN from the AGGREGATE to make sure it's not duplicated in your actually loading data, and then using the CONDITIONAL SPLIT based on those results.
Massive headache, I recommend the staging table.
Edit: Oh, yeah, use the multi-cast to feed off a copy of the data to your aggregation test.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply