SSIS and dealing with embedded double quotes

  • Is there a way for SSIS to overlook embedded double quotes in the data so when I run my package and the "for each loop" gets executed and imports all the files in a folder into a table the columns wont shift because of the embedded double quotes?

    I've read about the Undouble add-on that you download from the microsoft website but I have not had any luck with the installation so I couldn't test to see if it would work. I also tried using OPENROWSET command but anything after the column with a double quote would be NULL.

    So basically, here's an example from the csv I'm working with:

    "ID","Name","City","Age","Title"

    1,"John "The Man, II" Doe","Gotham",30,"Villian"

    There are no double quotes around 1 and 30 because those are numeric fields when they were extracted into a csv file. The city, age, and title are shift one or two columns over and the name would only be "John".

    Does anyone have any ideas on a work around for this?

  • Unfortunately SSIS in SQL 2005 and 2008 does not support embedded text qualifiers in CSV files.

    Vote on it at Microsoft Connect.

    https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=312164

    One workaround is to read in a single line at a time, then parse each line in a Script component before going to the destination.

    [Source] -> [Script Component] -> [Destination]

  • <retracted>

    J Pratt

  • Yuk, nasty. Easiest solution: change the field delimiter character to something else in the source data.

    Otherwise (and this may not work perfectly), you could consider running a replace routine on the source data before reading it using SSIS. Replace ," with ,| (or whatever your new delimiter is) and replace ", with |, and then use | as your delimiter.

    I said that it may not work perfectly because

    1) There may be some additional spaces in there - but you could code around that, and

    2) Your embedded text could conceivably contain this combination of characters too.

    Phil

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Hello

    i am running into a similar problem with the embedded double quotes.

    I tried using the redirect row on error to just skip those rows with the double quotes.

    I have not gotten it to work. Does anyone know of a way to just skip over that row and keep processing?

  • Other than the ideas put forward by myself and Todd, I know of no way forward for you.

    There is nowhere in the system where you can set a flag: "Skip rows that cannot be parsed" or whatever.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • natarshia.peck (3/8/2011)


    Hello

    i am running into a similar problem with the embedded double quotes.

    I tried using the redirect row on error to just skip those rows with the double quotes.

    I have not gotten it to work. Does anyone know of a way to just skip over that row and keep processing?

    How exactly is this causing troubles?

    For testing purposes, I loaded a flat file with embedded double quotes into SSIS 2008 and it imports just fine.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Koen Verbeeck (3/9/2011)


    natarshia.peck (3/8/2011)


    Hello

    i am running into a similar problem with the embedded double quotes.

    I tried using the redirect row on error to just skip those rows with the double quotes.

    I have not gotten it to work. Does anyone know of a way to just skip over that row and keep processing?

    How exactly is this causing troubles?

    For testing purposes, I loaded a flat file with embedded double quotes into SSIS 2008 and it imports just fine.

    Really - even though you specified " as the text-field delimiter?

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Phil Parkin (3/9/2011)


    Koen Verbeeck (3/9/2011)


    natarshia.peck (3/8/2011)


    Hello

    i am running into a similar problem with the embedded double quotes.

    I tried using the redirect row on error to just skip those rows with the double quotes.

    I have not gotten it to work. Does anyone know of a way to just skip over that row and keep processing?

    How exactly is this causing troubles?

    For testing purposes, I loaded a flat file with embedded double quotes into SSIS 2008 and it imports just fine.

    Really - even though you specified " as the text-field delimiter?

    Yes. But then I did some more testing, and it starts to screw up when there are double quotes AND commas in one string.

    Conclusion: use another delimiter for god's sake. The comma is one of the most used punctuation marks in strings. Use a pipe | as a delimiter, because who uses that symbol in a normal sentence? 99% of your problems will go away.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • That's a message I always try to get across. The fact that the files are usually called 'Comma Separated' is not helping the cause one bit and I think we have an uphill struggle.

    Good work with the testing by the way. Not got much work to do today? 😀

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Phil Parkin (3/9/2011)


    That's a message I always try to get across. The fact that the files are usually called 'Comma Separated' is not helping the cause one bit and I think we have an uphill struggle.

    Good work with the testing by the way. Not got much work to do today? 😀

    Nope. I'm actually between projects for the moment, so I'm enhancing my SSIS skills by performing such little tests.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Impressive commitment! When I am 'between projects' I tend to work on improving my guitar skills:smooooth:

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Well a. i'm a new employee and the file we get are text qualified with "" and comma delimited. so a row looks like this

    "some text","some text2","some name "BookTitle" to buy ", "end"

    when the package sees a " it thinks that its the end of the field, and complains that it can't find the , after the "

    I talked my supervisor about saying it isn't text qualified, but that would mean the "" would show up in our rows.

    I guess based on all the responses we can ask for a different delimiter or for them not to include the "" in the title names, but i need some solution to load these things because a change is probably not going to happen super fast.

    Right now all i can do is open the file and manually remove the lines or fix them.

  • natarshia.peck (3/9/2011)


    Well a. i'm a new employee and the file we get are text qualified with "" and comma delimited. so a row looks like this

    "some text","some text2","some name "BookTitle" to buy ", "end"

    when the package sees a " it thinks that its the end of the field, and complains that it can't find the , after the "

    I talked my supervisor about saying it isn't text qualified, but that would mean the "" would show up in our rows.

    I guess based on all the responses we can ask for a different delimiter or for them not to include the "" in the title names, but i need some solution to load these things because a change is probably not going to happen super fast.

    Right now all i can do is open the file and manually remove the lines or fix them.

    Your idea about importing as not text qualified has some merit. You could bring the data in 'as is' and then run some UPDATE commands afterwards that would tidy most of it up (by removing the first and last character of the offending fields, where field length > 2, perhaps).

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Thanks. I have to move on this fast because I inherited this package and I have other packages to develop/test. So I need to report up what the deal is with the pack failing and some 'resolution'.

    The thing that perplexes me most, and I just had another dev look at this, is why when the parse fails, the redirect on error doesn't just skip the row.

    I guess like the post says, there is nothing in ssis that will do this.

Viewing 15 posts - 1 through 15 (of 32 total)

You must be logged in to reply to this topic. Login to reply