May 12, 2021 at 11:44 pm
Hi,
I am getting CSV files with hundreds of columns, many of which have irregular characters in the column names, such as |, " ", -, /, etc. I've created the table by manually importing the file, which renamed those columns nicely by removing the characters and replacing some of them with underscores. When I try to build a data flow to import the file into the existing table, I have to manually map all those hundreds of columns because the names do not exactly match. If I create a new table, the junk symbols go into the column names. Is there a way to tell the SSIS designer to map the columns based on their order?
Also, my table has nvarchar columns, but the connection manager treats the source file columns as varchar, so I have to change every column in the connection manager to Unicode string [DT_WSTR]. Is there a way to do this for all columns at once?
Thanks.
May 13, 2021 at 2:26 pm
Unfortunately, the short answer is 'no and no', to the best of my knowledge.
Regarding your second point, the 'pure' SSIS way of handling this is to send all of the columns through a Data Conversion component. Alternatively, you can in theory edit the DTSX file in a text editor such as Notepad++ and do a FIND/REPLACE to modify the source data types quickly. Make sure you have a backup copy of the package before doing this!
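If you do go the find/replace route, it can also be scripted rather than done by hand. Below is a rough Python sketch, under the assumption that the flat file connection manager in the .dtsx stores column types as numeric codes in a DTS:DataType attribute (129 for DT_STR, 130 for DT_WSTR); the attribute names, codes and file path here are assumptions to verify against your own package, not a guaranteed recipe.

```python
# Rough sketch only: bulk-change flat file columns in a .dtsx package from
# DT_STR (code 129) to DT_WSTR (code 130) by editing the XML as text.
# The attribute name DTS:DataType and the numeric codes are assumptions --
# open your own package in a text editor first and confirm what it contains.
import shutil

package_path = r"C:\SSIS\ImportPackage.dtsx"          # hypothetical path
shutil.copyfile(package_path, package_path + ".bak")  # back up first!

with open(package_path, "r", encoding="utf-8") as f:
    xml = f.read()

# Flat file connection manager columns (verify these values in your file)
xml = xml.replace('DTS:DataType="129"', 'DTS:DataType="130"')

# The data flow's source output columns may also carry a string type,
# e.g. dataType="str" -> dataType="wstr"; again, check your own package.
xml = xml.replace('dataType="str"', 'dataType="wstr"')

with open(package_path, "w", encoding="utf-8") as f:
    f.write(xml)
```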
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
May 13, 2021 at 8:06 pm
You might want to review this article: https://www.sqlservercentral.com/articles/auto-assign-column-data-types-for-flat-file-source-import
It outlines a method of using the destination table to define a destination and then 'swapping' it for a source, while retaining the metadata for the connection. I would be interested to see whether this works for you.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
May 14, 2021 at 2:20 am
Phil Parkin wrote: Unfortunately, the short answer is 'no and no', to the best of my knowledge.
Regarding your second point, the 'pure' SSIS way of handling this is to send all of the columns through a Data Conversion component. Alternatively, you can in theory edit the DTSX file in a text editor such as Notepad++ and do a FIND/REPLACE to modify the source data types quickly. Make sure you have a backup copy of the package before doing this!
This is what I ended up doing (opening the package in Notepad and doing a find and replace). It felt a bit risky, so I wanted to see whether there is a different or more legitimate way to do this.
I did not quite understand your suggestion about the Data Conversion component for the columns. I am getting CSV files with hundreds of columns. Are you suggesting that I somehow update the first row in the CSV file containing the column names to bring them in line with the table column names?
May 14, 2021 at 2:50 am
Jeffrey Williams wrote: You might want to review this article: https://www.sqlservercentral.com/articles/auto-assign-column-data-types-for-flat-file-source-import
It outlines a method of using the destination table to define a destination and then 'swapping' it for a source, while retaining the metadata for the connection. I would be interested to see whether this works for you.
I think manually creating the table and entering data into it will take even more time. The length of the columns is not the problem for me; it is varchar vs nvarchar. My table uses nvarchar, but the import tool treats the CSV data as varchar, so I had to open the package file in Notepad and do a find and replace.
May 14, 2021 at 6:06 am
I did not quite understand your suggestion about the Data Conversion component for the columns. I am getting CSV files with hundreds of columns. Are you suggesting that I somehow update the first row in the CSV file containing the column names to bring them in line with the table column names?
No, I was referring to this: https://docs.microsoft.com/en-us/sql/integration-services/data-flow/transformations/data-conversion-transformation?view=sql-server-ver15
I don't think it will particularly speed things up for you in this case, but I mentioned it in passing just in case it's of interest.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
May 14, 2021 at 6:12 am
Regarding your question about column names, have you considered creating a text file that contains only your desired column names as a header row (matching the names in your target table) and then appending the source file to it before importing?
This would be worthwhile only if the source files have consistent formats (in terms of column order and number of columns).
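For what it's worth, here is a minimal Python sketch of that idea, assuming the incoming file's first row is the messy header you want to discard and clean_header.txt holds a single line of column names matching your table (all file names here are made up):

```python
# Sketch: swap the messy header row of an incoming CSV for a clean one whose
# column names match the destination table. File names are hypothetical;
# adjust paths, encoding and delimiter to suit your files.
clean_header_path = "clean_header.txt"     # single line: Col_1,Col_2,...
source_path = "incoming_file.csv"
output_path = "incoming_file_clean.csv"

with open(clean_header_path, "r", encoding="utf-8") as f:
    clean_header = f.readline().rstrip("\r\n")

with open(source_path, "r", encoding="utf-8") as src, \
     open(output_path, "w", encoding="utf-8", newline="") as out:
    src.readline()                         # discard the original header row
    out.write(clean_header + "\n")
    for line in src:                       # stream the data rows through unchanged
        out.write(line)
```

The package would then point at the cleaned file, whose header matches the table's column names, so the designer can map everything by name.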
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
May 14, 2021 at 4:56 pm
Jeffrey Williams wrote: You might want to review this article: https://www.sqlservercentral.com/articles/auto-assign-column-data-types-for-flat-file-source-import
It outlines a method of using the destination table to define a destination and then 'swapping' it for a source, while retaining the metadata for the connection. I would be interested to see whether this works for you.
I think manually creating the table and entering data into it will take even more time. The length of the columns is not the problem for me; it is varchar vs nvarchar. My table uses nvarchar, but the import tool treats the CSV data as varchar, so I had to open the package file in Notepad and do a find and replace.
That is not what that article shows - the basics are simple.
The other way around this kind of issue is to use a staging table where the columns are defined as varchar, then a simple truncate/insert procedure that moves the data from the staging table into the final table, where SQL Server will implicitly convert the data to your nvarchar columns.
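As a rough illustration of that staging pattern (the server, table and column names below are made up, and it assumes a full reload of the final table), the move can be as simple as this, with the varchar-to-nvarchar conversion happening implicitly in the INSERT ... SELECT:

```python
# Sketch of the staging-table approach: the SSIS package loads the file into a
# varchar staging table, then this step moves the rows into the nvarchar final
# table. SQL Server converts varchar to nvarchar implicitly on the INSERT.
# Server, database, table and column names are hypothetical.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=MyServer;DATABASE=MyDb;Trusted_Connection=yes;"
)

move_sql = """
TRUNCATE TABLE dbo.FinalTable;                      -- assumes a full reload

INSERT INTO dbo.FinalTable (Col_1, Col_2, Col_3)    -- nvarchar columns
SELECT Col_1, Col_2, Col_3                          -- varchar columns in staging
FROM dbo.StagingTable;
"""

with conn:                  # commits on clean exit
    conn.execute(move_sql)
```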
With all that said - if the data coming into the system from these files is varchar, then why do you need them defined as nvarchar?
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs