How to capture "bad" record from import flat file?

  • Hi,

    SQL 2005 SP2 on Windows 2003. Tried to import a flat file with 9 records; half of them did not make into staging. Two questions I have:

    1) I could not find any different between "good" and "bad" - there is no missing column, wrong data type, column delimiters look good.... Is there way to find out what went wrong with those "bad" one?

    2) I set source error output for redirect (from flat file) and added Flat File Source Error Output. Although error file was created, only header (Flat File Source Error Output |ErrorCode|Error Column|Filename) without any detail record. Is there way to write bad one out?

    Thanks,

    Chris

  • Did you consider using BULK INSERT with an ERRORFILE specified as an option? The additional control file should help to identify the root cause.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Hi Lutz,

    Thanks. My guess I don't have that option. It is production SSIS and was told to fix the issue.

    Chris

  • Even if you can't use it in production... It might help you to figure out why the data are cosidered "bad". You could use this information to modify the SSIS package. Just a thought...



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Put a data viewer after your source to check that the 'bad' records make it into the pipeline OK. Look for 'special' characters - commas, quotes, empty fields/NULLs, tabs, CRs.

    Maybe you could post an example of a good record and bad record here and let others suggest possibilities ...

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • I agree with Phill. Post your good and bad rows from the data Viewer here. so, some one can lead you in a right direction...

  • Lutz,

    Since # of column in file does not match with # of column in staging, BULK INSERT failed.

    Phil/Divyanth,

    Here are first 4 columns (20 actual) of row in flat file:

    USER|STATUS|CLASSID|CLASSNAME

    CGDXERA |1|WC |CA ACCOUNT INFORMATION

    DGGSGAS |1|WS |XA INFORMATION

    It has header and pipe column delimiter. It skip every other row. I added data view at Flar File Source Error Output; but it did not show any row.

    Chris

  • Chris can you at least identify the missing rows from your flat file and post them.

  • Sounds like your row delimiter may be set up incorrectly.

    What appears at the end of every row of your source data?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin (5/5/2010)


    Sounds like your row delimiter may be set up incorrectly.

    What appears at the end of every row of your source data?

    That's right Phil. I completely forgot about it, i had a same situation few weeks back..

    Chris, Try this

    Open it with excel and then delete the last two empty columns and then add a value something Like NULL to all the rows without any header after your last column..

    Let me know If that works

  • You can hook the fail arrow of the destination to a text file or a table and set the failure to redirect row. This will allow the good rows to go into the database and allow you to look at the bad ones. You can even add a row counter to the redirect arrow and test the resulting variable, which would allow you to send an email with the data.

    You can't use fast-load if you set this up. It's also not a good idea if all the records must go in. I use this technique with my data warehouse uploads because I don't want single records holding up the nightly load, but I want to know what those records are the next day so that they can be actioned appropriately.

  • Phil/Divyanth,

    Yes, delimiter was the issue. The package has:

    Header row delimiter: {LF}.

    Row delmiter: {CR}{LF}.

    Column delimiter: {|}.

    I was able to upload all rows by adding extra {|} at the end of each rows. I modified package to accomodate new different formats (users refused to correct the format): extra {|} at the end, space between column, one extra space before the date column,.....

    Thing bothers me is that there is no different when I compared this file (missing row one) with many other files which didn't any issue. At least, I could not tell why it works on others but not this one.

    In any case, thanks for the help.

    Chris

  • I would suggest that you look at the source file in a hex editor and compare what you see with another file which works. There will be a difference, I am fairly certain - you just can't see it when you use 'standard' editors.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply