November 28, 2011 at 4:02 pm
Hi,
My SSIS package is behaving very strangely, which to me seems like a potential microsoft bug.
I wanted to see if anyone else was having a similar problem.
Basically I get a raw CSV file that needs to be loaded into the base database table. The CSV file has an extra comma at the end of each row. Since I use "comma" as my delimiter, I get an extra empty column when I create a "Flat File Connection Manager" in SSIS. I leave that column in there.
I then add an OLE DB Destination, and load the flat file source into a database table. The database table DOES NOT have this column, and when I am doing the mapping I am not mapping this column in the source file to any column in the destination.
I then run the SSIS package.
SSIS package runs with no problems. I check the row count and all matches.
I then open the original raw CSV file and remove the extra comma from the end of each row (I open it in textpad and replace ',' with '').
I truncate the base table and then re-run the SSIS package.
Once again the package runs fine, but this time the row count in the table is less than in the raw file. And this is the most interesting part: It is exactly half of the original raw file.
I was able to repeat this multiple times, with different files, and each time I get the same result - at half point SSIS completes the data load - as if there are no more rows in the raw file. It is the strangest thing.
Has anyone else encountered anything like that?
Thanks!
November 28, 2011 at 4:08 pm
sql_er (11/28/2011)
I then open the original raw CSV file and remove the extra comma from the end of each row (I open it in textpad and replace ',' with '').....
Once again the package runs fine, but this time the row count in the table is less than in the raw file. And this is the most interesting part: It is exactly half of the original raw file.
...
Has anyone else encountered anything like that?
Truncated to the important components. Here's what's happening:
a,b,c,d,<CR>
e,f,g,h,<CR>
Easy enough, right?
so your layout is looking for this:
Start of Row-> comma - 1st column
End of last comma -> comma - 2nd col
end of last comma -> comma - 3rd col
end of last comma -> comma - 4th col
end of last comma -> <CR> - 5th col
So, first run, it splits out correctly. Next run, your data looks like this:
a,b,c,d<CR>
e,f,g,h<CR>
So your columns:
a| b| c| d<CR>e|f,g,h
i|j|k|l<CR>m|n,o,p
So something like this is occurring, it's got to do with row and column ending characters.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
November 29, 2011 at 7:29 am
Evil Kraig F,
Thank you for your reply.
It looks like when I made a post certain special characters got removed from my post which possibly made it misleading.
From my post it looks like I replaced a comma with nothing in my raw file (i.e. ',' with ''), while in reality I removed comma + carriage return with just carriage return [i.e. ',CR' with 'CR'], thereby removing the last comma before the carriage return in every row of the raw file.
Do you have additional comments now?
Thanks!
November 29, 2011 at 8:33 am
Did you update your data source in your SSIS package? If not, it's still expecting a column there, and you're probably getting lost data because of that.
Look at a preview of the source data. You can either do that in the data source object, in a the Data Flow, right-click the arrow from the data source to the data destination, and put a review step in there. Compare what you're seeing in either place with the original data file.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 29, 2011 at 11:12 am
sql_er (11/29/2011)
Do you have additional comments now?
As mentioned above, it's the same problem. , + CR to CR or , to blank and leaving the CR alone leaves the same set of special characters.
You're not finding the fifth column, so it's going to the CR. Try the 'Ragged Right' Format which works around that problem.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply