Using new GetIdentificationStringByID to get Column Name in SSIS

  • I'm using SSIS to load tab-delimited text files into a SQL staging db. Naturally, I'm having to deal with truncation issues. I'm using the newly-introduced GetIdentificationStringByID method to get the column name and the default "Flat File Source Error Output Column" that you get from the error output that has the entire row of data. Does anybody know, though, how I can get the actual line number that caused the error?

  • lduvall (8/3/2016)


    I'm using SSIS to load tab-delimited text files into a SQL staging db. Naturally, I'm having to deal with truncation issues. I'm using the newly-introduced GetIdentificationStringByID method to get the column name and the default "Flat File Source Error Output Column" that you get from the error output that has the entire row of data. Does anybody know, though, how I can get the actual line number that caused the error?

    Other than by creating an INT variable within the package and incrementing it manually in your data flow task, I know of no way to do this.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • A while back I did an import system that used a staging table with x number of varchar(xxx) column and a error flow to insert to a table with varchar(max) columns if there was a truncation error, could such thing work in your case?

    😎

  • Eirikur Eiriksson (8/3/2016)


    A while back I did an import system that used a staging table with x number of varchar(xxx) column and a error flow to insert to a table with varchar(max) columns if there was a truncation error, could such thing work in your case?

    😎

    How does this help get the original line number?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin (8/3/2016)


    Eirikur Eiriksson (8/3/2016)


    A while back I did an import system that used a staging table with x number of varchar(xxx) column and a error flow to insert to a table with varchar(max) columns if there was a truncation error, could such thing work in your case?

    😎

    How does this help get the original line number?

    It contains the full line content, ergo easy to locate.

    😎

  • Eirikur Eiriksson (8/3/2016)


    Phil Parkin (8/3/2016)


    Eirikur Eiriksson (8/3/2016)


    A while back I did an import system that used a staging table with x number of varchar(xxx) column and a error flow to insert to a table with varchar(max) columns if there was a truncation error, could such thing work in your case?

    😎

    How does this help get the original line number?

    It contains the full line content, ergo easy to locate.

    😎

    Indeed ... did you manage to capture the error rows split into columns, or as a single varchar(max) column (which, as you probably know, is the default for SSIS)?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin (8/3/2016)


    Eirikur Eiriksson (8/3/2016)


    Phil Parkin (8/3/2016)


    Eirikur Eiriksson (8/3/2016)


    A while back I did an import system that used a staging table with x number of varchar(xxx) column and a error flow to insert to a table with varchar(max) columns if there was a truncation error, could such thing work in your case?

    😎

    How does this help get the original line number?

    It contains the full line content, ergo easy to locate.

    😎

    Indeed ... did you manage to capture the error rows split into columns, or as a single varchar(max) column (which, as you probably know, is the default for SSIS)?

    Split in a Script transformation so yes, split into columns.

    😎

  • Would you share your code? I'd love any help. What I have now will already return me the entire line that failed but I have not parsed it into the individual columns. I'm still not sure I understand how that will get me the line number, though. I need to know the line number from the raw data file that presents the problem, not just the actual data in that line. I can get that now. I need the line number, and I don't know how to iterate to get that value without reading line-by-line which I don't think is feasible given the size of some of the raw files.

  • lduvall (8/3/2016)


    Would you share your code? I'd love any help. What I have now will already return me the entire line that failed but I have not parsed it into the individual columns. I'm still not sure I understand how that will get me the line number, though. I need to know the line number from the raw data file that presents the problem, not just the actual data in that line. I can get that now. I need the line number, and I don't know how to iterate to get that value without reading line-by-line which I don't think is feasible given the size of some of the raw files.

    I suggested how to do it already. Are you unsure how to create and increment package-scoped variables?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin (8/3/2016)


    lduvall (8/3/2016)


    Would you share your code? I'd love any help. What I have now will already return me the entire line that failed but I have not parsed it into the individual columns. I'm still not sure I understand how that will get me the line number, though. I need to know the line number from the raw data file that presents the problem, not just the actual data in that line. I can get that now. I need the line number, and I don't know how to iterate to get that value without reading line-by-line which I don't think is feasible given the size of some of the raw files.

    I suggested how to do it already. Are you unsure how to create and increment package-scoped variables?

    Your suggestion was "Other than by creating an INT variable within the package and incrementing it manually in your data flow task, I know of no way to do this."

    Can you please explain where in the data flow task to create and set a variable that will read the row number from the original file as it gets imported? If the row fails and gets written to an error table, would the row number not be reset?

  • Where are the truncation errors happening?

    Are they at the flat file source, or the SQL Server destination?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin (8/3/2016)


    Where are the truncation errors happening?

    Are they at the flat file source, or the SQL Server destination?

    I applied the scrip to the error output from the flat file source and I want to get the original line number causing the error into my error table destination. This is a screenshot of my Data Flow in SSIS.

  • OK, good stuff.

    Create two package-scoped INT variables: RowCount and ErrorRowCount (say). Initial value should be 0 for both.

    Add two rowcount transformations in your data flow: one on the 'good' path and one on the 'error' path. Assign the row count on the good path to RowCount and that on the error path to ErrorRowCount.

    Now you know that, when there is an error, the line number of the error is (RowCount + ErrorRowCount). which you can create as a derived column in the error path & fire into your 'Error Records' destination.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • I think I may actually understand! I will give that a shot and check back in. Thanks so much!

  • Phil, I added the @RowCount and @ErrorRowCount variables and set a derived column to add the two together and mapped it to my error table destination, but it's not working. It's saying 0. I thought maybe it is because my first test file had the error record on line 1, but I flipped it and put it on the final line and it's still not capturing anything. Could it be because it processes the errors first and it hasn't assigned the value to the completed row yet? I'm attaching photos of the data flow and the derived column. I have the default for the variables set to 0 so I don't think it's a matter of trying to add NULL + line number.

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply