Best place to change column width and type?

  • This is best practices question.  When importing to a database table from aflat file, it's often necessary to change the data type and width assigned by theconnection manager, say from DT_STR (50) to DT_WSTR (5).
    It appears that this can be done in two places:

    The Advanced page of the Flat File Connection Manager

    In the Advanced Editor of the Flat File Source that uses theconnection manager, on the Input and Output Properties tab: Flat File SourceOutput > Output Columns.

    What is the intended use of each of these places where thechange can be made?

    Which should be used to make the change according to bestpractices?

    Thanks,

    Mike
    .

  • You're looking for the wrong thing.  The "Best Practice" is to know what the data from the flat file is and to properly set the target staging table to the correct width.  Anything that comes in larger is probably a mistake and the rows should automatically be sequestered as "bad data".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I don't understand your response, I didn't ask the right question, or I didn't create the flat file connection properly.

    I do know what the flat file data is, the column in question is a two alpha chars, e.g, ....,KG,...

    When I create a flat file connection manager, that column is assigned DT_STR (50).  

    The target column in the destination table is nvarchar(5), so the input column will come in larger on every row.

    In situations like this, I've been changing the type and length in the connection manager to DT_WSTR(5), which works, but I'd like to know the "correct" way to accomplish this.

    I am still curious about the original question.  When there's a need to do so, is it better change the type and width in the connection manager or the data source?

  • palandri - Wednesday, October 17, 2018 12:53 PM

    I don't understand your response, I didn't ask the right question, or I didn't create the flat file connection properly.

    I do know what the flat file data is, the column in question is a two alpha chars, e.g, ....,KG,...

    When I create a flat file connection manager, that column is assigned DT_STR (50).  

    The target column in the destination table is nvarchar(5), so the input column will come in larger on every row.

    In situations like this, I've been changing the type and length in the connection manager to DT_WSTR(5), which works, but I'd like to know the "correct" way to accomplish this.

    I am still curious about the original question.  When there's a need to do so, is it better change the type and width in the connection manager or the data source?

    Think of it as a hierarchical structure, with connection manager first, followed by zero or more flat file sources.

    You can vary the column properties for a flat file source without changing them in their parent connection manager.

    So, in my opinion, changing the connection manager is the way to go, unless you have some unusual requirement where you want to set up multiple data flows using the same connection manager, but with differing column properties.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • palandri - Wednesday, October 17, 2018 12:53 PM

    I don't understand your response, I didn't ask the right question, or I didn't create the flat file connection properly.

    I do know what the flat file data is, the column in question is a two alpha chars, e.g, ....,KG,...

    When I create a flat file connection manager, that column is assigned DT_STR (50).  

    The target column in the destination table is nvarchar(5), so the input column will come in larger on every row.

    In situations like this, I've been changing the type and length in the connection manager to DT_WSTR(5), which works, but I'd like to know the "correct" way to accomplish this.

    I am still curious about the original question.  When there's a need to do so, is it better change the type and width in the connection manager or the data source?

    With apologies, I'll have to say that I can't help with the original question because I don't use SSIS for importing data because of some of the very things you mention (Phil will tell you I'm a bit of an SSIS "troll" :D).  I do all my ETL with stored procedures and the occasional bit of help with a trip through a Powershell combination of Import-CSV and Export-CSV to normal junky data produced by spreadsheets or just hit the spreadsheets directly with a trip through the ACE drivers.  I also take some good care in right-sizing my target tables (almost always staging tables) to ensure that variable width data causes and error and auto-magic sequestration of unexpected values as a first form of data validation.  With that thought in mind, it really doesn't matter what the "DT_WSTR()" size is for delimited files.  Of course, for fixed-field formats, it would have to be hardcoded (one way or another) and spot on no matter what tool was used.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks guys.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply