SSIS is truncating data but NOT telling me. How do I get it to tell me?

  • I am running a Data Flow task with an OLE DB Source based on a VIEW and an OLE DB Destination table. The column in the VIEW is varchar(50), while the table's is varchar(30). SSIS runs the task, inserts rows into the table, and truncates the column data without telling me. I want the Data Flow to FAIL when actual data truncation occurs so I can report these rows.

    In the past, SSIS would warn me about the length mismatch and possible truncation, but it would run with success (green) when there was no data truncation and fail (red) if any of the data was too large for the destination column. Why did it change its behavior?

    I'm a relative novice in SSIS, but I've been doing this same type of thing for about 18 months. I did notice that in the past I was probably running SQL 2005 SSIS, while now I am running 2008 R2.

    My hope was that I could change a property to tell SSIS to fail when an instance of truncation occurs, instead of running "successfully" and silently truncating. I searched SQLServerCentral and other sites but did not see what I was after.

    I did notice my DataFlow’s Component Properties in the Advanced Editor showed “OpenRowset Using FastLoad”. If I change that to be “OpenRowset” the task does FAIL on an actual data truncation, however, it does not use the bulk load and runs very slowly.

    I don’t want to query each mapped column for possible truncation because I have over a thousand columns and the real data truncation is somewhat rare. A number of tables are involved and this really impacts a dozen or more DataFlow tasks. I want SQL Server to tell me data IS TRUNCATING not that it MIGHT truncate due to the column sizes.

    Is there a way to continue to get the benefit of the faster bulk load but also get the failure when actual truncation occurs?

  • smorgan-607537 (7/17/2012)


    I did notice my DataFlow’s Component Properties in the Advanced Editor showed “OpenRowset Using FastLoad”. If I change that to be “OpenRowset” the task does FAIL on an actual data truncation, however, it does not use the bulk load and runs very slowly.

    I don’t want to query each mapped column for possible truncation because I have over a thousand columns and the real data truncation is somewhat rare. A number of tables are involved and this really impacts a dozen or more DataFlow tasks. I want SQL Server to tell me data IS TRUNCATING not that it MIGHT truncate due to the column sizes.

    Is there a way to continue to get the benefit of the faster bulk load but also get the failure when actual truncation occurs?

    The only way I know to continue to use fast-load yet get truncation failures is to include a Derived Column component in the data flow, changing the data type of the column to the smaller size and then adjusting that component's error output to fail on truncation.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • All of that is certainly nice, but it reminds me of throwing spaghetti at the proverbial wall to see what sticks. What's wrong with prevalidating the data instead? It just wouldn't take that long to write a tiny bit of code to check for items over 30 characters BEFORE you tried the transfer. Certainly it would save on the rollback that will occur when the job fails.

    [EDIT] I'll also say that this will continue to be a problem. Is there a reason you can't modify the target column to be 50 characters in length???
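    A prevalidation check like Jeff describes could be sketched like this. The view and column names here (dbo.SourceView, CustomerName) are placeholders, not the poster's actual objects:

    ```sql
    -- Placeholder names: dbo.SourceView and CustomerName stand in for the real view/column.
    -- List the source rows that a varchar(30) destination column would truncate.
    SELECT CustomerName,
           LEN(CustomerName) AS ActualLength
    FROM   dbo.SourceView
    WHERE  LEN(CustomerName) > 30;
    ```

    Run once per suspect column before the load; any rows returned are the ones that would be silently truncated.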

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
        Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks, Jeff. There are a few reasons for what I am doing. Time - that is, my time on this vs. other project deadlines. Knowledge - I want to know how SSIS does this and what settings are available (I don't understand why the bulk insert behaves differently in this aspect from the non-bulk). Disposability - this is a one-off process, not a reusable package (it is a one-time customer load of their data into our database; it will run one time in production and never be used again). Size - I have around 1400 columns of data with this issue (I'd prefer changing one property over coding a SELECT for each of 1400 columns, each with a different length, not all varchar(30)).

  • smorgan-607537 (7/18/2012)


    Thanks, Jeff. There are a few reasons for what I am doing. Time - that is, my time on this vs. other project deadlines. Knowledge - I want to know how SSIS does this and what settings are available (I don't understand why the bulk insert behaves differently in this aspect from the non-bulk). Disposability - this is a one-off process, not a reusable package (it is a one-time customer load of their data into our database; it will run one time in production and never be used again). Size - I have around 1400 columns of data with this issue (I'd prefer changing one property over coding a SELECT for each of 1400 columns, each with a different length, not all varchar(30)).

    Ah, a one-off? Dump to an oversized staging table and test it there with a quick LEN(field) > 30, then pump in all your non-failure records and review the failures. One-offs should never go through that much work. Get it to where you can work with it and then do it the easy way.
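    The staging approach might look something like this. The table and column names (dbo.Staging, dbo.Target, CustomerName) are illustrative placeholders:

    ```sql
    -- Sketch of the staging approach; dbo.Staging, dbo.Target, and CustomerName
    -- are placeholder names. dbo.Staging uses oversized columns (varchar(50)).

    -- 1) Review the rows that would truncate in the real varchar(30) column.
    SELECT *
    FROM   dbo.Staging
    WHERE  LEN(CustomerName) > 30;

    -- 2) Load only the rows that fit; the failures stay behind for reporting.
    INSERT INTO dbo.Target (CustomerName)
    SELECT CustomerName
    FROM   dbo.Staging
    WHERE  LEN(CustomerName) <= 30;
    ```

    With 1400 columns, the WHERE clause would need a predicate per varchar column, but it could be generated from INFORMATION_SCHEMA.COLUMNS rather than written by hand.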


    - Craig Farrell


  • smorgan-607537 (7/18/2012)


    Thanks, Jeff. There are a few reasons for what I am doing. Time - that is, my time on this vs. other project deadlines. Knowledge - I want to know how SSIS does this and what settings are available (I don't understand why the bulk insert behaves differently in this aspect from the non-bulk). Disposability - this is a one-off process, not a reusable package (it is a one-time customer load of their data into our database; it will run one time in production and never be used again). Size - I have around 1400 columns of data with this issue (I'd prefer changing one property over coding a SELECT for each of 1400 columns, each with a different length, not all varchar(30)).

    I don't know what your data file looks like but it sounds like a Bulk Insert with an output to an errata file to capture the oversized lines would do the trick. Almost no coding. Can't help you on the SSIS side of the house because I don't use SSIS for such things... especially one-offs.
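    A BULK INSERT with an error file, as Jeff suggests, could be sketched like this. The file paths and table name are hypothetical, and the terminators would need to match the actual data file:

    ```sql
    -- Sketch only; dbo.Target and the file paths are placeholders.
    -- Rows that fail (including oversized fields) are written to the error
    -- file for review instead of aborting the whole load.
    BULK INSERT dbo.Target
    FROM 'C:\loads\customer_data.txt'
    WITH (
        FIELDTERMINATOR = ',',
        ROWTERMINATOR   = '\n',
        ERRORFILE       = 'C:\loads\customer_data_errors.txt',
        MAXERRORS       = 100000   -- keep loading past the bad rows
    );
    ```

    Afterward, the errata file holds exactly the rows that actually truncated, which is the "tell me when it IS truncating" behavior the original post asked for.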

    --Jeff Moden


