February 22, 2011 at 12:12 pm
I am trying to create an SSIS package (2008) and I am using the Excel Source. The Excel file (2003) has three charts and some other text before it gets to the data I need. If I import the Excel as it is, some of the columns that should have data import as nulls. If I modify the Excel file and remove the charts and about 50 rows (information not needed), then it imports the columns as expected.
Any idea of what is going on? I know that I can write VBA code to delete the unnecessary rows and the charts, but I want to be able one process for what I need. :unsure:
February 23, 2011 at 12:58 am
The idea of importing an Excel file is to look at it as it is a database.
The sheets are the tables of that database. And of course, the tables contain the actual data.
The Microsoft Jet Provider will read those "tables" with some sort of SQL statement (you can write ordinary SQL statements to your Excel file just as it is a SQL Server database, just with less built-in functions). In this process, the Jet Provider will "guess" the datatypes of the columns. This is done by scanning the rows to determine the datatype. If your connectionstring contains the property IMEX=1, the JET provider will take the string datatype if multiple datatypes are found in one column. If you don't have IMEX=1, JET will take one datatype of the column and it will set the other values as NULL, what has happened with you.
Example: a column contains both text and numbers
IMEX=1: string datatype is chosen
IMEX is missing: whatever comes first is chosen (I will explain this later). If integer is chosen, all text fields will be NULL.
By default, the JET provider determines the datatype by scanning the first 8 rows (while skipping the header). This number is determined by the registry setting TypeGuessRows. If you set this to 0 (recommended), JET will scan the entire column to determine the datatype. Why is this important? Say that the first 8 rows of a column all have numbers, while later on there are rows with text. JET will take the integer datatype, so again you will have NULL values, even if IMEX=1 is in the connectionstring.
So, to conclude:
1. Get rid of the charts and the unnecessary text. They do not belong in an Excel file used for import.
2. Set IMEX=1 in the connectionstring (take a look at www.connectionstrings.com)
3. Set the TypeGuessRows registry setting for the JET provider to 0.
4. While you are at it, write a decent SQL query in the Excel source instead of just selecting the sheet.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
February 23, 2011 at 11:40 pm
Just a thought, and I'm not sure that it will work, but what about importing "Named Ranges" from the Excel sheet, will sort out the problem if you could import the named range...
February 23, 2011 at 11:53 pm
chris.stuart (2/23/2011)
Just a thought, and I'm not sure that it will work, but what about importing "Named Ranges" from the Excel sheet, will sort out the problem if you could import the named range...
Is that possible with the SSIS Excel source using the JET provider?
Or are you suggesting a script task?
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
February 24, 2011 at 12:05 am
Koen, I'm thinking Jet, really not sure if you could, but previously have worked a lot with Excel & Excel VBA, and Named Ranged solved a lot of these types of problems, would be nice if it could work..
February 24, 2011 at 12:38 am
chris.stuart (2/24/2011)
Koen, I'm thinking Jet, really not sure if you could, but previously have worked a lot with Excel & Excel VBA, and Named Ranged solved a lot of these types of problems, would be nice if it could work..
According to MS support, it seems possible.
http://support.microsoft.com/kb/321686
Quoting directly:
You can use the SQL Server Data Transformation Services (DTS) Import Wizard or the SQL Server Import and Export Wizard to import Excel data into SQL Server tables. When you are stepping through the wizard and selecting the Excel source tables, remember that Excel object names that are appended with a dollar sign ($) represent worksheets (for example, Sheet1$), and that plain object names without the dollar sign represent Excel named ranges.
I should test it sometime, and if it works, that would improve matters a lot!
Thanks for the suggestion!
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
February 24, 2011 at 12:47 am
Well, well, well.
I thought it might be nice, but if that is direct from M$ then it can be done.
Next time Im importing a Excel book, I'm going to give it a try, but if you do get it right, please let me know.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply