May 24, 2011 at 11:57 am
I need to use SSIS to automatically import a group of CSV files. The problem is that not all of the files are consistent. Please let me show you an example.
CSV FILE #1's Columns
FISCAL_WEEK_NUM
MANUFACTURER
MANUFACTURER_PART_NO
SKU_NUM
SKU_DESC
SHIP_QTY
TOTAL_FPC_DOLLARS
SHIP_TO_CITY
SHIP_TO_STATE
SHIP_TO_ZIPCODE
SEGMENT_DESC
SUB_CHANNEL_DESC
OPTION_CLASS
SHIP_DATE_ACTUAL_DATE
CSV FILE #2's Columns
FISCAL WEEK
MANUFACTURER
MANUFACTURER PART NUMBER
SKU NUMBER
SKU DESCRIPTION
SHIPPING QUANTITY
SHIPPING DOLLARS?
SHIP TO CITY
SHIP TO STATE
SHIP TO ZIP CODE
SEGMENT DESCRIPTION
CHANNEL DESCRIPTION
OPTION CLASS
ORDER NUMBER
CUSTOMER NUMBER
BILL TO COMPANY
LINK NUMBER
LOCAL CHANNEL
INVOICE DATE
SUB CLASS
SHIP TO COMPANY
As you can see, there are some common columns, but not all. So I modified the destination SQL Server 2008 table to contain columns from both source files. However, I noticed that the Flat File Connection Manager assumes the columns will be "set in stone." Is there a way around this? I don't think I'm allowed to go back to the organization that provides the source files and tell them to fix the problem.
May 24, 2011 at 12:39 pm
If you can distinguish between the files by their name or something obvious. Create parallel datastreams that join in a union.
File1Source >> DerivedColumn to add missing columns >> Union
File2Source >> Union
Then move everything through all other needed transforms and to your destination.
May 24, 2011 at 1:19 pm
This is a good idea, except there will be many files in one format and many files in another format. I might be able to figure out what format based on the file name (newer files tend to be in the second format) but I can't hard code the file name into the connection.
May 24, 2011 at 2:50 pm
I wasn't suggesting that you hard code the file name, merely that their might be file naming conventions that you can exploit. If all of one type of file have a naming convention that is different from the naming convention of the other type of file you should be able to define one loop to look for the first file format, and the other to look for the second file format.
May 24, 2011 at 7:16 pm
Here is what I ended up doing. I had the Foreach Loop put the file name in a variable. Then I had a script do the basically the follwoing:
Dts.Variables("intFormat").Value = 0
If lintFileDate < 200934 Then
Dts.Variables("intFormat").Value = 1
ElseIf lintFileDate >= 200934 And lintFileDate < 201101 Then
Dts.Variables("intFormat").Value = 2
ElseIf lintFileDate >= 201101 And lintFileDate < 201210 Then
Dts.Variables("intFormat").Value = 3
ElseIf lintFileDate >= 201210 Then
Dts.Variables("intFormat").Value = 4
End If
And then I had precidence constraints for each of the four intFormat variable values. This way, I can branch out and deal with each of the formats based on the file names.
May 24, 2011 at 7:16 pm
Oh, and you were absolutely right, there were naming conventions for the file names. Thank you so much!
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply