January 8, 2020 at 1:23 am
Actually, skip all of that. Most of it just doesn't matter. I'll be back soon with a demo from the presentation I give on the subject of "Automating "Excel Hell"".
--Jeff Moden
Change is inevitable... Change for the better is not.
January 8, 2020 at 2:33 am
Ok... Demo time.
The first thing is to unzip and copy the spreadsheets from the attached SpreadSheets.zip file to the directory of your choice. If the directory is something other than C:\JBMTest, you'll need to change the demo code to match.
Also understand that I give this demo on my laptop. That essentially means that the files are "on the database server". I've setup to do this to pull from other machines using UNCs but not for this demo.
I also have to assume that you've installed the ACE drivers. I use an older version of Excel because I think the new versions make some sucking sounds and so I refer to Microsoft.ACE.OLEDB.12.0 and Excel 12.0 in the code. You may have to change those before you run the demonstration.
I wrote this to get some work done for one of the companies I work for and to make my life easier in the future. The spreadsheets from the zip file only have 3 things in common...
The reason why we don't care if people add columns to the data is because I wrote the code to handle that automatically and without having to putz around with code at the end of every month or when some adds another column in a grouping or whatever. You also notice that I also don't care where the top left corner of the active area starts. The ACE drivers actually take care of that for us.
The results are stored in an EAV table. Then columns describing the rows will automatically vary in number according to the common area in the spreadsheets. Truly, PFM.
I've also attached the code for the stored short stored procedure that susses all of this out and a script that demonstrates importing all of the spreadsheets with only a change in the file name. I don't actually created the imports in permanent tables so you don't have to worry about all that. Obviously, you want to ensure that you've created the stored procedure before you run the example usage demo.
Have fun and understand why I say I don't actually care when people add or delete columns or rows. I didn't make it totally "AI" but it does a whole lot auto-magically. A tweak or two could make it handle a whole lot more without have to adjust a package, etc, etc, every time someone sneezes on a spreadsheet. It does use a couple of Global Temp tables but I'm working on that, as well.
And now you know one of the many reasons I say I don't need SSIS for most things. 😀
Ah... almost forgot. If you really feel the urge to validate column names and all, that what one of the reasons why I dump this into a temp table is. I never shoot data into the final resting place during an import. It's just not safe to do so. It's not like it's a Sunkist orange where you know what's inside. 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
January 8, 2020 at 2:57 am
p.s. If the OP has a spreadsheet with just one header row, we'll need to do a little work but not much. We'll make it so it's self-healing just like in the demo sheets/code.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 8, 2020 at 4:33 am
p.p.s. Almost forgot. Sheet names that are weird or begin with a numeric digit must be encapsulated in single quotes along with the normal square brackets. For example the sheet for 2020 would be named ['2020$'].
There's also some pretty cool stuff you can do if you open the sheet as a linked server or if you use ADODB from SQL server including create new spreadsheets on the fly. Add in a hint of DirTree and maybe xp_CmdShell and it's amazing what you can do with spreadsheets from SQL Server without SSIS. I've gotta save a little of it for a short series on the subject, though. 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
January 8, 2020 at 2:15 pm
Jeff, thank you for taking the time to write all of this stuff out.
Importing from Excel using SSIS is almost always a massive pain, so I am going to check this method out.
The biggest problems I have tend to be related to data types, or the lack of them. If what you are doing gets round those issues, I'll be converted.
Thanks again!
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
January 8, 2020 at 2:18 pm
Importing from Excel using SSIS is almost always a massive pain, so I am going to check this method out.
On this, I couldn't agree more. I honestly despise the "ACE" drivers; they are certainly not an "ace" at anything other than inferring the wrong data types and then errorring when it does. If I can ever direct an external provider to giving us a CSV instead of a xls(x) file, I almost always will.
Will certainly have a look at this later too, Jeff, and appreciate the time you've taken. Might be a little bit before I do though.
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
January 8, 2020 at 3:13 pm
This is why I moved away from SSIS and ACE for spreadsheets with mixed data (ie numeric and alphanumeric) in the same column, unless I can guarantee the data. Even a single column with numeric data with leading zeros will be changed by the drivers.
To get round this, and I will not put office on my SQL server, I wrote a program(run on an application server) to export the first sheet to tab delimited text file without changing the data, then I could import the data by any method I choose.
Far away is close at hand in the images of elsewhere.
Anon.
January 8, 2020 at 3:21 pm
This is why I moved away from SSIS and ACE for spreadsheets with mixed data (ie numeric and alphanumeric) in the same column, unless I can guarantee the data. Even a single column with numeric data with leading zeros will be changed by the drivers.
To get round this, and I will not put office on my SQL server, I wrote a program(run on an application server) to export the first sheet to tab delimited text file without changing the data, then I could import the data by any method I choose.
That's interesting. How does your program access the XLSX data, if not via the awful 'ACE' driver?
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
January 8, 2020 at 3:46 pm
David Burrows wrote:This is why I moved away from SSIS and ACE for spreadsheets with mixed data (ie numeric and alphanumeric) in the same column, unless I can guarantee the data. Even a single column with numeric data with leading zeros will be changed by the drivers.
To get round this, and I will not put office on my SQL server, I wrote a program(run on an application server) to export the first sheet to tab delimited text file without changing the data, then I could import the data by any method I choose.
That's interesting. How does your program access the XLSX data, if not via the awful 'ACE' driver?
XLSX is in the OOXML format, not too hard to work with and there are several other spreadsheet programs using the same format. The good thing about OOXML is that one only has to have some understanding of XML to digest the files into any RDBMS, no ACE drivers or other such needed.
😎
January 8, 2020 at 4:04 pm
Thanks, Eirikur, I've had a look into that in the past, now that you mention it.
If anyone has a handy chunk of code, working something like this (with suitable param defaults), I would love to hear from them!
ExtractDataFromXLSX
(SourceFilePath = ,
SheetName = ,
StartRow = ,
EndRow = ,
StartCol = ,
EndCol = ,
ExtractFilePath = ,
ColSeparator = ,
ColQualifier = )
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
January 8, 2020 at 4:10 pm
Hi Thom, I am trying to understand your question . What do you mean by Validation mechanism?
I declared a variable with some default value 'the value is generated by script task' . This means that I use this variable in the send mail task . I thought that it shows the default value (during run time this will be replaced the file path) which will be useful for other colleagues in future when they want to know how does the file is attached to the send mail task. But Send mail task does not accept providing this comment as a default value but some valid path or no value. I don't know what is happening in here. But for now, I have given a path as default value.
Thanks,
Charmer
January 8, 2020 at 4:16 pm
David Burrows wrote:This is why I moved away from SSIS and ACE for spreadsheets with mixed data (ie numeric and alphanumeric) in the same column, unless I can guarantee the data. Even a single column with numeric data with leading zeros will be changed by the drivers.
To get round this, and I will not put office on my SQL server, I wrote a program(run on an application server) to export the first sheet to tab delimited text file without changing the data, then I could import the data by any method I choose.
That's interesting. How does your program access the XLSX data, if not via the awful 'ACE' driver?
Not as clever as you think we had an app server that had Office installed for other reasons, so the program used Excel COM to access the file, it was an easy solution at the time.
Far away is close at hand in the images of elsewhere.
Anon.
January 8, 2020 at 4:22 pm
Thanks, Eirikur, I've had a look into that in the past, now that you mention it.
If anyone has a handy chunk of code, working something like this (with suitable param defaults), I would love to hear from them!
ExtractDataFromXLSX
(SourceFilePath = ,
SheetName = ,
StartRow = ,
EndRow = ,
StartCol = ,
EndCol = ,
ExtractFilePath = ,
ColSeparator = ,
ColQualifier = )
I have a simple framework for this, prepping a talk and some articles on the subject, will keep you in the loop 😉
😎
January 8, 2020 at 4:25 pm
Phil Parkin wrote:David Burrows wrote:This is why I moved away from SSIS and ACE for spreadsheets with mixed data (ie numeric and alphanumeric) in the same column, unless I can guarantee the data. Even a single column with numeric data with leading zeros will be changed by the drivers.
To get round this, and I will not put office on my SQL server, I wrote a program(run on an application server) to export the first sheet to tab delimited text file without changing the data, then I could import the data by any method I choose.
That's interesting. How does your program access the XLSX data, if not via the awful 'ACE' driver?
XLSX is in the OOXML format, not too hard to work with and there are several other spreadsheet programs using the same format. The good thing about OOXML is that one only has to have some understanding of XML to digest the files into any RDBMS, no ACE drivers or other such needed.
😎
You will still need to unzip the file first.
Far away is close at hand in the images of elsewhere.
Anon.
January 8, 2020 at 4:36 pm
Eirikur Eiriksson wrote:Phil Parkin wrote:David Burrows wrote:This is why I moved away from SSIS and ACE for spreadsheets with mixed data (ie numeric and alphanumeric) in the same column, unless I can guarantee the data. Even a single column with numeric data with leading zeros will be changed by the drivers.
To get round this, and I will not put office on my SQL server, I wrote a program(run on an application server) to export the first sheet to tab delimited text file without changing the data, then I could import the data by any method I choose.
That's interesting. How does your program access the XLSX data, if not via the awful 'ACE' driver?
XLSX is in the OOXML format, not too hard to work with and there are several other spreadsheet programs using the same format. The good thing about OOXML is that one only has to have some understanding of XML to digest the files into any RDBMS, no ACE drivers or other such needed.
😎
You will still need to unzip the file first.
Unzipping is the simple part, only needs a path to a command line program in the path variable, much simpler than installing a driver. I normally use 7zip, works like a charm every time.
😎
Viewing 15 posts - 16 through 30 (of 58 total)
You must be logged in to reply to this topic. Login to reply