Bring back data from multiple servers and multiple databases into one location

  • So we currently have 18 instances across 16 servers with approximately 350 databases per instance. I have to run multiple stored procedures against each database, then combine the data and bring it all together into one database. Each database has the same schema.

    The stored procedures can take anywhere from 1 second to 45 seconds depending on the size of the database.

    I am trying to come up with the best way to handle this. On top of this I am trying to create some flat tables to speed up the stored procs, which will also have to be updated regularly (twice a week).

    I am new to this environment so I am a little out of sorts with how to handle the scale. Does anyone have any ideas? I can certainly create a job that would loop through each database, run the stored proc, and then insert the results into multiple tables, but I wasn't sure if there was something else I could do that would be less invasive.
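    A minimal T-SQL sketch of that looping job, run once per instance. The proc name usp_GatherStats, the central table CentralDB.dbo.Results, and the system-database filter are all assumptions to swap for your real names:

    ```sql
    -- Sketch only: loop every online user database on this instance,
    -- run the (hypothetical) proc in that database's context, and
    -- capture its result set into a central table.
    DECLARE @db sysname, @sql nvarchar(max);

    DECLARE db_cur CURSOR LOCAL FAST_FORWARD FOR
        SELECT name
        FROM sys.databases
        WHERE database_id > 4            -- skip system databases
          AND state_desc = N'ONLINE';

    OPEN db_cur;
    FETCH NEXT FROM db_cur INTO @db;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- Three-part name runs the proc in each database's context
        SET @sql = N'INSERT INTO CentralDB.dbo.Results
                     EXEC ' + QUOTENAME(@db) + N'.dbo.usp_GatherStats;';
        EXEC sys.sp_executesql @sql;

        FETCH NEXT FROM db_cur INTO @db;
    END

    CLOSE db_cur;
    DEALLOCATE db_cur;
    ```

    Wrapped in an Agent job, this covers one instance; you would still need something (SSIS, linked servers, or per-server jobs) to span the 16 servers.
    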

  • pamozer (4/30/2012)


    So we currently have 18 instances across 16 servers with approximately 350 databases per instance. I have to run multiple stored procedures against each database, then combine the data and bring it all together into one database. Each database has the same schema.

    This falls under my usual header of "Oh god that hurts".

    The stored procedures can take anywhere from 1 second to 45 seconds depending on the size of the database.

    I assume this stored procedure already exists and is deployed?

    I am trying to come up with the best way to handle this. On top of this I am trying to create some flat tables to speed up the stored procs, which will also have to be updated regularly (twice a week).

    One way or another, you're going to end up looping. The question is, what gives you the best options for looping (... wait for it).

    I am new to this environment so I am a little out of sorts with how to handle the scale. Does anyone have any ideas? I can certainly create a job that would loop through each database, run the stored proc, and then insert the results into multiple tables, but I wasn't sure if there was something else I could do that would be less invasive.

    For starters, there's always good ol' sp_MSForEachDB. That's a per server call. However, that doesn't necessarily deal with your multiple servers and you'll need robust wrappers to deal with any databases that DON'T fall into your schema... like system databases.
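    For reference, a hedged sketch of the sp_MSForEachDB route (undocumented but long-standing). The marker-table check that skips databases without the expected schema, and the names usp_GatherStats, YourMarkerTable, and CentralDB.dbo.Results, are all hypothetical:

    ```sql
    -- sp_MSforeachdb substitutes each database name for [?].
    -- The IF EXISTS guard is the "robust wrapper": it skips system
    -- databases and anything else missing the expected schema.
    EXEC sys.sp_MSforeachdb N'
        IF EXISTS (SELECT 1 FROM [?].sys.tables
                   WHERE name = N''YourMarkerTable'')
        BEGIN
            INSERT INTO CentralDB.dbo.Results
            EXEC [?].dbo.usp_GatherStats;
        END';
    ```

    Being undocumented, sp_MSforeachdb has known quirks (it can skip databases under load), which is another reason to prefer a driven list.
    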

    So, what's the option? My personal favorite: create a list of the server/database pairs you need to run this against. Create a ForEach loop in SSIS and feed it that list, mapping each row to SSIS variables that drive expressions on your OLE DB connection, in particular the ServerName and InitialCatalog properties. This lets you create a custom list of exactly the servers and databases you want to deal with.
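    As a sketch of what those connection-manager expressions might look like; the variable names User::ServerName and User::DatabaseName, and the provider string, are assumptions to adapt:

    ```
    Property expressions on the OLE DB connection manager:

    ServerName     : @[User::ServerName]
    InitialCatalog : @[User::DatabaseName]

    Or a single ConnectionString expression:

    "Data Source=" + @[User::ServerName]
      + ";Initial Catalog=" + @[User::DatabaseName]
      + ";Provider=SQLNCLI10.1;Integrated Security=SSPI;"
    ```

    Either form re-points the same connection manager at a different server/database on each pass of the loop, so one data flow serves all 6,000-odd databases.
    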

    From there, you can do a simple dataflow calling the proc (or putting the code right there) and dumping it to whatever storage table you need.

    This task requires some maintenance in that any time you bring in new databases of this schema you have to update the source table(s) for your loop, but then it's just a matter of a data change, not a code change.

    After that, 45 seconds is a LOT of time for a proc, but I have no idea of your data volume and scenario. I'd also look into optimizing that, unless you've already wandered up that path a few times.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • The procedure is actually a modification of another report procedure that was taking over 2 minutes to run, so I have got it down quite a bit. The problem is aggregating millions of rows of data. I haven't been able to get it any faster when I am dealing with millions of rows. The smaller sets of data are really fast, some faster than 1 second.

    I did create a table on each server already that gathers all the databases that we would want to run these procedures on. I wanted to see if anyone had any alternatives, but I guess I am heading in the right direction.
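    For illustration, one possible shape for that driver table behind the loop; the names dbo.DatabaseList, IsActive, and the key choice are hypothetical, not what the poster necessarily built:

    ```sql
    -- One row per database the collection should visit; flip IsActive
    -- off instead of deleting rows when a database is retired.
    CREATE TABLE dbo.DatabaseList
    (
        ServerName   sysname NOT NULL,
        DatabaseName sysname NOT NULL,
        IsActive     bit     NOT NULL DEFAULT (1),
        CONSTRAINT PK_DatabaseList
            PRIMARY KEY (ServerName, DatabaseName)
    );
    ```
    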

    Thank you for your suggestions.

  • In my SSIS package, how do I change the database that the procedure runs against inside the ForEach container? I have an Execute SQL Task that populates a variable with all of the databases, but how do I use that variable in the ForEach container to change the database?

  • pamozer (4/30/2012)


    In my SSIS package, how do I change the database that the procedure runs against inside the ForEach container? I have an Execute SQL Task that populates a variable with all of the databases, but how do I use that variable in the ForEach container to change the database?

    Capture your Execute SQL Task output in a variable of type Object; this will represent an ADO.NET DataSet under the covers. You'll then use the Foreach ADO Enumerator in your ForEach Loop Container to iterate over the variable containing the DataSet, and assign the values from the DataSet to another variable on the Variable Mappings page. Then you can use that second variable, which receives the values from the DataTable, to build the EXEC statement used in your Data Flow Task to execute said proc.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • and assign the values from the DataSet to another variable on the Variable Mappings page. Then you can use that second variable, which receives the values from the DataTable, to build the EXEC statement used in your Data Flow Task to execute said proc.

    I know I must be missing something simple. I am not sure I understand this piece of it, what I should be assigning to the variables, or how I can use them to change which database I am working with. Do I still need to create dynamic SQL and use the variable there? I would think I wouldn't have to do it that way here, but maybe that is the case.

  • pamozer (5/1/2012)


    and assign the values from the DataSet to another variable on the Variable Mappings page. Then you can use that second variable, which receives the values from the DataTable, to build the EXEC statement used in your Data Flow Task to execute said proc.

    I know I must be missing something simple. I am not sure I understand this piece of it, what I should be assigning to the variables, or how I can use them to change which database I am working with. Do I still need to create dynamic SQL and use the variable there? I would think I wouldn't have to do it that way here, but maybe that is the case.

    Say we have a variable named DataSet of type Object that catches the output from your first Execute SQL Task. In your Foreach Container, choose the Foreach ADO Enumerator and point it at the DataSet variable as its source.

    And say you have a variable named RetrievedColumn: on the Variable Mappings page of the Foreach Container, map the column in DataTable 1 of your DataSet at the proper ordinal position (0-based) to that variable.

    Now you can use RetrievedColumn in an Expression to build SqlStatementSource in the Execute SQL Task in the Foreach Container.
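    Sketching that last step: with RetrievedColumn holding the current database name, the expression on the Execute SQL Task's SqlStatementSource property might look like the following (usp_GatherStats is a hypothetical proc name):

    ```
    "EXEC [" + @[User::RetrievedColumn] + "].dbo.usp_GatherStats;"
    ```

    The brackets guard against database names with unusual characters; each loop iteration rebuilds the statement against the next database.
    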

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • pamozer (4/30/2012)


    In my SSIS package, how do I change the database that the procedure runs against inside the ForEach container? I have an Execute SQL Task that populates a variable with all of the databases, but how do I use that variable in the ForEach container to change the database?

    Well, as Orlando described, you'll want to return your list to a recordset object in the package and then loop over the recordset results. The way to change server/database, though, is to use expressions on the connection object for the procedure databases, changing the InitialCatalog and ServerName settings on each pass.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • Thank you both. I did finally figure out how to change the InitialCatalog to use the variable, and that seems to be working.

