August 25, 2020 at 5:37 pm
Hi,
I am fairly new to SSDT and am looking for some help with loading excel files that I have been provided for a project. The files will all have the consistent file names and consistent tab names but that is where the consistency ends. The tabs can have from 1 to 150 columns with different names, columns named the same but in a different location and on and on. The problem is there are different iterations of the files submitted by the businesses and can change at any time.
I am looking for way that I can load each file into it's own SQL table using SSDT then once I get the source data loaded I will be able to normalize the data make updates to the metric names and then produce some reports.
If anyone has encountered this before and a solution for doing this that would be great.
Thanks in advance for any help you can give on this matter.
If needed I can provide examples of the excel files.
BWP
August 25, 2020 at 5:52 pm
You mention SSDT – are you hoping to create an SSIS package to do this, or something else?
This is a big task, even for an experienced developer.
SSIS packages are not very dynamic and usually rely on there being a fixed data structure (ie, columns, column names, column order and data types (such as they are in Excel)) in place and unchanged from one import to the next.
I will be interested to hear what others have to say. It is theoretically possible, for example, to create a package which analyses the structure of a given Excel file and then programmatically creates and executes another package at run time to import it. You'd probably have to import everything to text columns and work from there.
By far the better solution to this problem, in my opinion, is to force the data providers to agree to a defined and specific format and to then throw anything back at them which violates this format, to be fixed.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
August 25, 2020 at 6:11 pm
Phil,
Thank you for the quick response. I wish I had the luxury of locking down the format but since I have inherited this, I am having to play the cards I am handed.
Yes you are correct I am looking to perform the action in a SSIS package.
August 25, 2020 at 7:28 pm
If needed I can provide examples of the excel files.
This can be done but I need to have a look-see... Attaching a couple would be a big help.
Of course, do NOT attach any spreadsheets that have either PII or company-sensitive information, please.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 26, 2020 at 1:08 pm
Jeff, Thank you for the help on this I have attached 3 files with what I have been seeing in the actual data files. You will notice some have all of the columns of data, some have most of the columns, some have new columns, some have renamed columns. It is all across the board as to what the person can send us.
August 26, 2020 at 1:50 pm
Awesome. I have a totally automatic solution for that that will also auto-magically adjust for new columns, deleted columns, new spreadsheets, etc. The key is that it auto-magically maps and unpivots the columns. I have a pretty full dance card so it might take me a couple of days to respond but I believe you'll be amazed.
Just to be sure (this is incredibly important), it looks like the ID column is the "PK" even between spreadsheets and the name stuff comes along for the ride on that... is that correct? The reason I ask is because I need a key to unpivot the data on so that we can successfully create an EAV. Once the data is in such a table, we can do whatever you need to the data including but not limited to repivoting the data back into a common table that has all the columns you want to keep.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 26, 2020 at 1:51 pm
p.s. And, no... this process doesn't use "SSDT", "SSIS", or any of the other 4 letter words in SQL Server. It's all done in T-SQL and it runs very quickly. Are you allowed to use the "ACE" drivers to do the spreadsheet imports with?
--Jeff Moden
Change is inevitable... Change for the better is not.
August 26, 2020 at 2:13 pm
Hi Jeff,
Thanks for all the help on this, on this instance of SQL I do have the ability to use the ADO connection to load the files and yes the ID column is the PK. I look forward to seeing this, I am sure that I will be using this several times in the future.
August 26, 2020 at 2:33 pm
August 26, 2020 at 7:06 pm
Phil thanks for the reminder, I will make sure I get the install completed.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply