Type Mismatch on Excel Input

  • I'm working in SSIS 2012 and I'm trying to use a query in an Excel Source to bring in only the data I want to use. My query is

    SELECT F1, F2, F3, F4, F5, F6, F7, F8 From [SheetName$] WHERE F1 > '1/1/2014' and F4 > 0

    In this case, F1 is data type date[DT_DATE] and F4 is currency [DT_CY].

    I getting a type mismatch error that seems to be in the date column because is I remove it I am able to preview data. Is it possible to use an Excel file as a source and query by date range?

  • dates in excel are actually numeric. the integer is the days and the decimal is the time.

    If you have access to the data in any format other than Excel you will save yourself a lot of heartache.

  • dan-572483 (3/21/2014)


    I'm working in SSIS 2012 and I'm trying to use a query in an Excel Source to bring in only the data I want to use. My query is

    SELECT F1, F2, F3, F4, F5, F6, F7, F8 From [SheetName$] WHERE F1 > '1/1/2014' and F4 > 0

    In this case, F1 is data type date[DT_DATE] and F4 is currency [DT_CY].

    I getting a type mismatch error that seems to be in the date column because is I remove it I am able to preview data. Is it possible to use an Excel file as a source and query by date range?

    You can save yourself from a lot of pain by just reading in all of the data (maybe keep F4 > 0) and filter out unwanted rows with a conditional split. It will be easier to implement.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • The Excel file contains data going back several years and I'm only looking for data entered in the past few weeks. (Honestly it is a personal project I'm using to teach myself SSIS using a spreadsheet of gasoline purchases I've been keeping since the 1990s. I find myself more motivated to practice with data that has personal meaning to me than generic leaning guide projects on AdventureWorks )

    My goal is to learn to design projects efficiantly, which means reading in only data that is needed (the past week or two using a SELECT MAX() variable assigned from the target DB) rather than all 10+ years of data. If I connect the Conditional Split to the Source connector I can do that? I've also found messages about exporting Excel to CSV before importing to SQL, but that seems more complex that it should be.

  • dan-572483 (3/25/2014)


    The Excel file contains data going back several years and I'm only looking for data entered in the past few weeks. (Honestly it is a personal project I'm using to teach myself SSIS using a spreadsheet of gasoline purchases I've been keeping since the 1990s. I find myself more motivated to practice with data that has personal meaning to me than generic leaning guide projects on AdventureWorks )

    My goal is to learn to design projects efficiantly, which means reading in only data that is needed (the past week or two using a SELECT MAX() variable assigned from the target DB) rather than all 10+ years of data. If I connect the Conditional Split to the Source connector I can do that? I've also found messages about exporting Excel to CSV before importing to SQL, but that seems more complex that it should be.

    If you connect the conditional split to the source, you can filter out rows but only right after they were read.

    So you are still reading everything in. In my opinion, that's OK. Some types of input are not easy to filter (such as flat files) and it might be easier to just read everything. Excel files can contain only so much rows - depending on the version - so you have to ask yourself if it is worth investing a lot of development time in gaining just a few seconds of performance.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • dan-572483 (3/25/2014)


    The Excel file contains data going back several years and I'm only looking for data entered in the past few weeks. (Honestly it is a personal project I'm using to teach myself SSIS using a spreadsheet of gasoline purchases I've been keeping since the 1990s. I find myself more motivated to practice with data that has personal meaning to me than generic leaning guide projects on AdventureWorks )

    My goal is to learn to design projects efficiantly, which means reading in only data that is needed (the past week or two using a SELECT MAX() variable assigned from the target DB) rather than all 10+ years of data. If I connect the Conditional Split to the Source connector I can do that? I've also found messages about exporting Excel to CSV before importing to SQL, but that seems more complex that it should be.

    I haven't done exactly what you are doing, but I do have some comments.

    1) As Aaron has already pointed out, Excel doesn't do data-typing. Regardless of how Excel displays data, it is either text or numeric. Dates (I think) are stored as the number of days since 1/1/1900 (or maybe 1/1/1899). Decimals represent the time component. So I would try WHERE XLDate > nnnn (you'll have to work out what this number is).

    2) If you want to be efficient, don't use Excel files as data sources. Use CSV files instead.

    3) I understand your wanting to filter your source data, but using a conditional split to direct unwanted rows to oblivion is fast - you should try it. Remember, once again, that Excel is not a database engine, so queries against it are not likely to be optimised.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin (3/26/2014)


    1) As Aaron has already pointed out, Excel doesn't do data-typing. Regardless of how Excel displays data, it is either text or numeric. Dates (I think) are stored as the number of days since 1/1/1900 (or maybe 1/1/1899). Decimals represent the time component. So I would try WHERE XLDate > nnnn (you'll have to work out what this number is).

    I believe it is 1/1/1900, but I think they forgot the leap day in 1900, so the count is one day off. Just to show how complicated it can get with Excel. It's just easier to read everything as-is and deal with it at a later point.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • I have noticed that Excel stores dates in a numberic format. For example if you enter 3/26/2014 in Excel and then format it for numeric you get 41724. I suppose you could covert the MAX(date) value from the target DB to the numeric value Excel uses (if more recent dates translate to greater numberic values).

    Again, for this project it doesn't matter if I read all the data, but I'm training myself to use best practices, and neither reading 10 years of data to get records from the past week nor converting 10 years to CSV sounds "best" to me.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply