October 17, 2018 at 10:18 am
This is best practices question. When importing to a database table from aflat file, it's often necessary to change the data type and width assigned by theconnection manager, say from DT_STR (50) to DT_WSTR (5).
It appears that this can be done in two places:
The Advanced page of the Flat File Connection Manager
In the Advanced Editor of the Flat File Source that uses theconnection manager, on the Input and Output Properties tab: Flat File SourceOutput > Output Columns.
What is the intended use of each of these places where thechange can be made?
Which should be used to make the change according to bestpractices?
Thanks,
Mike
.
October 17, 2018 at 11:56 am
You're looking for the wrong thing. The "Best Practice" is to know what the data from the flat file is and to properly set the target staging table to the correct width. Anything that comes in larger is probably a mistake and the rows should automatically be sequestered as "bad data".
--Jeff Moden
Change is inevitable... Change for the better is not.
October 17, 2018 at 12:53 pm
I don't understand your response, I didn't ask the right question, or I didn't create the flat file connection properly.
I do know what the flat file data is, the column in question is a two alpha chars, e.g, ....,KG,...
When I create a flat file connection manager, that column is assigned DT_STR (50).
The target column in the destination table is nvarchar(5), so the input column will come in larger on every row.
In situations like this, I've been changing the type and length in the connection manager to DT_WSTR(5), which works, but I'd like to know the "correct" way to accomplish this.
I am still curious about the original question. When there's a need to do so, is it better change the type and width in the connection manager or the data source?
October 17, 2018 at 1:27 pm
palandri - Wednesday, October 17, 2018 12:53 PMI don't understand your response, I didn't ask the right question, or I didn't create the flat file connection properly.I do know what the flat file data is, the column in question is a two alpha chars, e.g, ....,KG,...
When I create a flat file connection manager, that column is assigned DT_STR (50).
The target column in the destination table is nvarchar(5), so the input column will come in larger on every row.
In situations like this, I've been changing the type and length in the connection manager to DT_WSTR(5), which works, but I'd like to know the "correct" way to accomplish this.
I am still curious about the original question. When there's a need to do so, is it better change the type and width in the connection manager or the data source?
Think of it as a hierarchical structure, with connection manager first, followed by zero or more flat file sources.
You can vary the column properties for a flat file source without changing them in their parent connection manager.
So, in my opinion, changing the connection manager is the way to go, unless you have some unusual requirement where you want to set up multiple data flows using the same connection manager, but with differing column properties.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
October 17, 2018 at 10:51 pm
palandri - Wednesday, October 17, 2018 12:53 PMI don't understand your response, I didn't ask the right question, or I didn't create the flat file connection properly.I do know what the flat file data is, the column in question is a two alpha chars, e.g, ....,KG,...
When I create a flat file connection manager, that column is assigned DT_STR (50).
The target column in the destination table is nvarchar(5), so the input column will come in larger on every row.
In situations like this, I've been changing the type and length in the connection manager to DT_WSTR(5), which works, but I'd like to know the "correct" way to accomplish this.
I am still curious about the original question. When there's a need to do so, is it better change the type and width in the connection manager or the data source?
With apologies, I'll have to say that I can't help with the original question because I don't use SSIS for importing data because of some of the very things you mention (Phil will tell you I'm a bit of an SSIS "troll" :D). I do all my ETL with stored procedures and the occasional bit of help with a trip through a Powershell combination of Import-CSV and Export-CSV to normal junky data produced by spreadsheets or just hit the spreadsheets directly with a trip through the ACE drivers. I also take some good care in right-sizing my target tables (almost always staging tables) to ensure that variable width data causes and error and auto-magic sequestration of unexpected values as a first form of data validation. With that thought in mind, it really doesn't matter what the "DT_WSTR()" size is for delimited files. Of course, for fixed-field formats, it would have to be hardcoded (one way or another) and spot on no matter what tool was used.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 18, 2018 at 8:34 am
Thanks guys.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply