seeing failed rows from bulk insert

  • I have a text file containing millions of rows in a CSV-like format. Ordinarily I'd do a bulk insert but about 1% of the rows are bad (because they violate foreign key constraints). Is there any way to do a bulk insert that only inserts the good rows, and somehow logs the bad rows, so I know how to fix the file? Or do I need to go through the file line-by-line which will take forever?

  • I would look at SSIS. I am by no means an SSIS guru (more of a novice) but you should be able to use your file as a source with your table as the destination, and redirect errors to a file or table.

    Bob
    -----------------------------------------------------------------------------
    How to post to get the best help

  • Have you considered importing the data into a "staging table", then using one or more T-SQL statements to locate the rows with data that violates the constraint(s)? This could:

    1. Find the problem rows.

    2. Allow you to "correct" the faulty data before further processing into the final table.

    or

    3. Insert the faulty rows into a second table for further analysis.

    4. Delete the faulty 1% of the data from the staging table so that the majority of the data can be further processed in a timely fashion.

    This will allow you to keep using your already tested bulk import code.
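    A minimal T-SQL sketch of the steps above. The table and column names here are hypothetical (`dbo.OrdersStaging`, `dbo.Orders`, `dbo.OrdersErrors`, a foreign key on `CustomerID` referencing `dbo.Customers`, and the file path) -- substitute your own:

    ```sql
    -- 1. Bulk load the whole file into a staging table that has no constraints.
    BULK INSERT dbo.OrdersStaging
    FROM 'C:\data\orders.csv'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);

    -- 2. Copy the faulty rows (the ones that would violate the FK)
    --    into a second table for analysis.
    INSERT INTO dbo.OrdersErrors
    SELECT s.*
    FROM dbo.OrdersStaging AS s
    WHERE NOT EXISTS (SELECT 1 FROM dbo.Customers AS c
                      WHERE c.CustomerID = s.CustomerID);

    -- 3. Delete the faulty rows from staging.
    DELETE s
    FROM dbo.OrdersStaging AS s
    WHERE NOT EXISTS (SELECT 1 FROM dbo.Customers AS c
                      WHERE c.CustomerID = s.CustomerID);

    -- 4. The remaining ~99% of rows are now safe to move to the final table.
    INSERT INTO dbo.Orders
    SELECT * FROM dbo.OrdersStaging;
    ```

    If the target table has several foreign keys, repeat steps 2-3 once per constraint (or combine the `NOT EXISTS` tests with `OR`) before the final insert.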

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you - before posting a question please read
    Before posting a performance problem please read

  • Ron's recommendation is probably your best method. However, if you want to do this in pure SSIS, you're going to be looking at using a MERGE JOIN against your existing table to find used IDs, another MERGE JOIN from an AGGREGATE to make sure the ID isn't duplicated within the data you're actually loading, and then a CONDITIONAL SPLIT based on those results.

    Massive headache, I recommend the staging table.

    Edit: Oh, yeah, use a Multicast to feed a copy of the data to your aggregation test.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

