October 24, 2019 at 7:27 pm
Ok, guys and gals, complete noob here trying to use SSIS.
Here's the scenario; I have 4 databases that are being consolidated into a distributed database. Let's call these Database1, Database2, Database3, Database4 and DistDatabase. The non-distributed databases are frequent schema changes including stored procedure changes. My question(s) are:
October 24, 2019 at 7:31 pm
Is SSIS able to handle multiple databases in a package?
No problem.
Is SSIS able to handle dynamic schema changes (such as creating a SQL Agent Job on a schedule and running the SSIS package)?
Please clarify. Creating an Agent job does not qualify as a schema change. SSIS does not deal well with database schema changes.
Email if there's a failure such as a stored procedure changed which requires manual fixing?
Sending an e-mail on error (any sort of error) is easy enough to set up.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
October 24, 2019 at 7:35 pm
The databases have some frequent schema changes such as adding a new schema, or modifying a stored procedure, function or trigger. I forgot to mention the databases also use custom Assemblies. I'd like to schedule a job to run every so often, that job would be to run the SSIS package(s).
My boss is wanting us to write out our own application to do the ETL and well, I'd rather not go down that path if we don't have to. I'm hoping that SSIS can be used.
October 24, 2019 at 7:50 pm
The databases have some frequent schema changes such as adding a new schema, or modifying a stored procedure, function or trigger. I forgot to mention the databases also use custom Assemblies. I'd like to schedule a job to run every so often, that job would be to run the SSIS package(s).
My boss is wanting us to write out our own application to do the ETL and well, I'd rather not go down that path if we don't have to. I'm hoping that SSIS can be used.
If your database changes happen to objects which are not referenced by SSIS packages, there should be no issue with these frequent changes.
But if your SSIS package is referencing something which has changed, there is a good chance that you will have to tweak the package before it will work. This, however, depends on the nature of the change. Not all changes will create issues. I would expect this to be true of an in-house system too.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
October 24, 2019 at 7:53 pm
Thank you. So for instance, if I have an SSIS package that is looking at Table1 which has 5 columns in it. Then, when the package runs via scheduling agent, will it detect if a 6th column has been added and then add that new column to the target?
-Trent
October 24, 2019 at 8:04 pm
Thank you. So for instance, if I have an SSIS package that is looking at Table1 which has 5 columns in it. Then, when the package runs via scheduling agent, will it detect if a 6th column has been added and then add that new column to the target?
-Trent
No.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
October 24, 2019 at 8:07 pm
Hmm, I suppose that would be a situation where an email should be sent out from SSIS. However, if there's been a new schema added, is that something that SSIS could handle, or would that be a manual operation as well? I really, really hope we don't have to do this in code. Each database is several GB and several hundred tables. I can't utilize Red Gate for any of this either, which really stinks.
October 24, 2019 at 8:09 pm
A new schema? You mean this?
CREATE SCHEMA abc AUTHORIZATION dbo;
That would have no effect whatsoever on existing SSIS packages.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
October 24, 2019 at 8:14 pm
Correct. If they went' in and added new schema such as the one described above, would SSiS be able to detect that? Or, if they changed a stored procedure etc. I'm just trying to figure out the best solution. I have 2 weeks to get the ETL phase done. Lucky for me I have until the 3rd week in November before we start. I suppose we could utilize WMI Alerts such as the reference here: https://www.sqlservercentral.com/forums/topic/can-ssis-detect-changes-on-500-tables-and-its-added-new-rows
October 24, 2019 at 8:23 pm
SSIS detects nothing.
If a stored proc changes, SSIS will attempt to use the modified version.
If the inputs and outputs to the proc are unchanged, but the internal logic has been updated, all is well. But if the inputs or outputs change, you will need to edit the package to accommodate the changes. I can't imagine a situation where this would ever be handled automatically.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
October 24, 2019 at 9:00 pm
This sounds like it could be handled by linked servers and some dynamic SQL and some fun stored procedures to gather all of the schema level data and compare it on the DistDatabase side. But in the same breath, I'm imagining somebody making a table called "DROP DATABASE Database1" and an improperly written dynamic SQL statement just dropping the database...
Do the stored procedures need to be moved over into DistDatabase from Database1 through Database4 as well? If so, SSIS will not be good at that either.
If I am understanding your requirements right, you are updating stuff on DistDatabase to match what is in one of the other 4 databases. So if Database1 gets a new table called [dbo].[table1], you want that table to automagically show up on DistDatabase, correct? Then if a new column gets added to table1, it should automagically show up on DistDatabase as well, right?
If so, how do you handle it when Database2 gets a table called [dbo].[table1] that has a changed set of columns (more or less or differently named or different data types)?
My last thought on this at the moment is are the databases really required to change schema level data that often and make that many new schemas? It might be better to look at how you can design it for table reuse instead of creating new ones frequently (if possible). If that is not possible, the only way I can think to do this (since it is multiple databases all dumping things to a single database) is going to be with linked servers and some fun dynamic SQL and possibly even some cursors... my 3 least favorite features.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
October 24, 2019 at 9:05 pm
This sounds like it could be handled by linked servers and some dynamic SQL and some fun stored procedures to gather all of the schema level data and compare it on the DistDatabase side. But in the same breath, I'm imagining somebody making a table called "DROP DATABASE Database1" and an improperly written dynamic SQL statement just dropping the database...
Do the stored procedures need to be moved over into DistDatabase from Database1 through Database4 as well? If so, SSIS will not be good at that either.
If I am understanding your requirements right, you are updating stuff on DistDatabase to match what is in one of the other 4 databases. So if Database1 gets a new table called [dbo].[table1], you want that table to automagically show up on DistDatabase, correct? Then if a new column gets added to table1, it should automagically show up on DistDatabase as well, right?
If so, how do you handle it when Database2 gets a table called [dbo].[table1] that has a changed set of columns (more or less or differently named or different data types)?
My last thought on this at the moment is are the databases really required to change schema level data that often and make that many new schemas? It might be better to look at how you can design it for table reuse instead of creating new ones frequently (if possible). If that is not possible, the only way I can think to do this (since it is multiple databases all dumping things to a single database) is going to be with linked servers and some fun dynamic SQL and possibly even some cursors... my 3 least favorite features.
Interestingly, none of what you have said relates to SSIS 🙂
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
October 24, 2019 at 9:33 pm
Correct. If they went' in and added new schema such as the one described above, would SSiS be able to detect that? Or, if they changed a stored procedure etc. I'm just trying to figure out the best solution. I have 2 weeks to get the ETL phase done. Lucky for me I have until the 3rd week in November before we start. I suppose we could utilize WMI Alerts such as the reference here: https://www.sqlservercentral.com/forums/topic/can-ssis-detect-changes-on-500-tables-and-its-added-new-rows
What are you actually trying to do? You could certainly dump all the meta data from each database and do a delta to see what changed. But SSIS won't automatically pick up new things up and move them over.
Now if you're trying to pull specific data and actually do ETL on it to get it into the target then it shouldn't really matter what they add new as long as they aren't changing anything that SSIS directly references. Even View/SP/Function changes won't matter unless they change the output/intput.
October 24, 2019 at 9:35 pm
That is true that it doesn't relate to SSIS directly. My intent with the post was to determine what the end result was and if SSIS could handle it well.
You could use SSIS to run the dynamic SQL but that feels like you are just making more work for yourself.
The answers to the original question, you answered those; I was just looking to dive in a different direction and figure out what is actually needing to be done. I can't imagine a pretty way of using SSIS to automatically move all of the data from a newly created table [dbo].[table1] on Database1 to DistDatabase1 or even creating the new schema using only SSIS. I mean it is possible as you can run TSQL in SSIS and that would get around having to use a linked server, but is it really the right solution to the problem? That being said, I am not sure that a linked server is the right solution to the problem either.
I just wanted to point out the potential problems with picking the tool before fully identifying the problem. When you get a new hammer, every problem looks like a nail.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
October 24, 2019 at 9:50 pm
That is true that it doesn't relate to SSIS directly. My intent with the post was to determine what the end result was and if SSIS could handle it well.
...
The (admittedly obscure!) intention of my post was not to criticize your response, but to highlight the fact that in making things dynamic, SSIS slides out of the equation. I could have been a lot clearer, apologies.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply