Importing Excel file where text length changes

  • I am trying to import an Excel file with a comment column in it. Most of the time the comments are large and SSIS makes the external column a unicode text stream - DT_NTEXT - and that works fine.

    However, if I run it with an Excel file where the largest column is say maybe 100 characters, it gives me an error and wants to change it to DT_WSTR. That will then, of course, cause problems when I later run it with another file that does have the large comments.

    I can understand why it would complain if I had a column listed as a type that could not hold the string, but I really don't understand why it wants to 'down-size' the column just because a particular spreadsheet doesn't have a large amount of data in the column.

    I run into similar problems if a column is empty for a particular spreadsheet.

    Is there anyway to avoid this so it will use the definition of the column even if it's larger than what it needs for that particular spreadsheet?

    I think I read somewhere that there's a way to do it with a registry hack, but I'd like to avoid that.

    I tried setting ValidateExternalMetadata to false, but that just seemed to delay the problem rather than avoid it altogether.

    I might give up and save the spreadsheets as tab-delimited text files because I don't seem to have the same issues with that.

  • marcia.j.wilson (12/17/2014)


    I am trying to import an Excel file with a comment column in it. Most of the time the comments are large and SSIS makes the external column a unicode text stream - DT_NTEXT - and that works fine.

    However, if I run it with an Excel file where the largest column is say maybe 100 characters, it gives me an error and wants to change it to DT_WSTR. That will then, of course, cause problems when I later run it with another file that does have the large comments.

    I can understand why it would complain if I had a column listed as a type that could not hold the string, but I really don't understand why it wants to 'down-size' the column just because a particular spreadsheet doesn't have a large amount of data in the column.

    I run into similar problems if a column is empty for a particular spreadsheet.

    Is there anyway to avoid this so it will use the definition of the column even if it's larger than what it needs for that particular spreadsheet?

    I think I read somewhere that there's a way to do it with a registry hack, but I'd like to avoid that.

    I tried setting ValidateExternalMetadata to false, but that just seemed to delay the problem rather than avoid it altogether.

    I might give up and save the spreadsheets as tab-delimited text files because I don't seem to have the same issues with that.

    If you have the option, I do recommend using delimited text files as you'll have far less trouble with them. However, I would not use a tab as the delimiter, as this character is also likely to be embedded in a comment somewhere. Choose something less likely – a pipe (|) perhaps.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Since you don't want to change the registry setting (it is not exactly a hack), you can add a first row with dummy data that forces Excel to take the data types you want.

    When you read the data, you simply discard the first row.

    I'm not 100% sure, but can't you also force the data type of the column in the advanced editor of the Excel source component?

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Phil Parkin (12/18/2014)


    If you have the option, I do recommend using delimited text files as you'll have far less trouble with them. However, I would not use a tab as the delimiter, as this character is also likely to be embedded in a comment somewhere. Choose something less likely – a pipe (|) perhaps.

    We're doing some screen-scraping of an application that allows us to output to an Excel spreadsheet. So, I figured I would have the screen-scraping do a Save As and output as text. I don't know of a way in Excel to output to a pipe-delimited file.

    That is the character I tend to use when directly building delimited files.

  • Koen Verbeeck (12/18/2014)


    Since you don't want to change the registry setting (it is not exactly a hack)...

    You're right. I shouldn't have used that terminology. Part of my concern with changing the registry is that if it somehow gets reset down the road (maybe through a re-install or an upgrade), the application will start acting up again and no one will know why.

    I'm not 100% sure, but can't you also force the data type of the column in the advanced editor of the Excel source component?

    I don't think you can force the external column definition. However, I am far from being an expert on this, so I might have overlooked something.

  • marcia.j.wilson (12/18/2014)


    I don't think you can force the external column definition. However, I am far from being an expert on this, so I might have overlooked something.

    You cannot change the external columns, but you can change the output columns data type.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Koen Verbeeck (12/18/2014)


    marcia.j.wilson (12/18/2014)


    I don't think you can force the external column definition. However, I am far from being an expert on this, so I might have overlooked something.

    You cannot change the external columns, but you can change the output columns data type.

    Correct. The external columns is what I'm having problems with.

  • Sorry, it took me so long to get back on this, especially after the helpful comments. I checked with the people that were doing the screen-scraping and found that they could add a row into the Excel sheet that was created. So, I am going with adding a 'dummy' line that will have a value with the max length possible for each column, either all x's if text or something like 1234567.12 if numeric.

    After other trials and errors, I have got that working.

    I do wish SSIS would give more helpful error messages. For instance, I got the error

    SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80040E21.

    An OLE DB record is available. Source: "Microsoft SQL Server Native Client 10.0" Hresult: 0x80040E21 Description: "Multiple-step OLE DB operation generated errors. Check each OLE DB status value, if available. No work was done.".

    That was because I had one column (out of over 100) that needed converted. It would have been really nice to get a hint as to which column was correct.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply