Running same SQL code against a number of tables

  • I have a number of tables (around 40) containing snapshot data about 40 million plus vehicles. Each snapshot table is at a specific point in time (the end of the quarter) and is more or less identical in terms of structure (over the years extra variables have been added, but on the whole, most of the tables are identical).

    Whilst most of our analysis is against single snapshots, on occasion we need to run some analysis against all the snapshots. For instance, we may need to build a new table containing all the Ford Focus cars from every single snapshot.

    To achieve this we currently have two options:

    a) write a long, long, long batch file repeating the same code over and over again, just changing the FROM clause [drawbacks - it takes a long time to write and changing a single line of code in one of blocks requires fiddly changes in all the other blocks]

    b) use a view to union all the tables together and query that instead [drawbacks - our tables are stored in separate database instances and cannot be indexed, plus the resulting view is something like 600 million records long by 125 columns wide, so is incredibly slow]

    So, what I would like to find out is whether I can either use dynamic sql or put the SQL into a loop to spool through all tables. This would be something like:

    for each table in TableList

    INSERT INTO output_table

    SELECT table as OriginTableName, Make, Model

    FROM table

    next table in TableList

    Is this possible? This would mean that updating the original SQL when our client changes what they need (a very regular occurrence!) would be very simple and we would benefit from all the indexes we already have on the original tables.

    Any pointers, suggestions or help will be much appreciated.

    Many thanks in advance,

    Daryl

  • Dynamic SQL should do the job given required columns exists in all tables with the same name. Even if the columns names are different in some tables, you may be able to include the column name in your list table and build the query from there.

  • Another way, using sp_msforeachtable:

    create table dbo.OutputTable (OriginTableName nvarchar(500), RecordCount INT)

    create table dbo.TableList (Name nvarchar (500))

    insert dbo.TableList

    select '[dbo].[swap]'

    union select '[dbo].[products]'

    union select '[dbo].[structures]'

    union select '[dbo].[stagingdata]'

    exec sp_msforeachtable @command1 = 'INSERT INTO dbo.OutputTable SELECT ''?'', COUNT(*) from ?'

    ,@whereand = 'and syso.object_id in (select object_id(Name) from dbo.TableList)'

    select * from dbo.OutputTable

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • I've not tried it yet, but your suggestion looks perfect, Eugene. It is exactly what I hoped that someone would come up with - I can build a list of tables and it looks like sp_msforeachtable basically gives me an inbuilt looping function.

    I'll try it out later this week and report back, but I am very hopeful that this will fit the bill.

    EDIT

    I have just built my table list and tested it for some simply queries. It works perfectly and gives me precisely what I needed. Many thanks for your help.

  • I'm glad it did help you.

    But it worth to mention that I would just use normal cursor and dynamic sql. That is exactly the case where using cursor is totally justified.

    Undocumented sp_msforeachtable does exactly the same.

    If you want to see how it works just execute:

    sp_helptext sp_msforeachtable

    Your cursor should be for SELECT TableName FROM dbo.TableList, then inside of cursor you can build required SQL string and execute it.

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • One thing that I have noticed is that I can not use a GROUP BY clause with sp_msforeachtable. This would be very useful for other, related, queries, so perhaps I do need to look at cursors to achieve that. Now that you've given me a pointer to this, hopefully it will not take me long to learn how to achieve the same with a cursor and expand that number of things I can do with the query. (plus write some very basic code that my less-SQL Server savvy colleagues can follow)

    Thanks again.

  • Just for the record, I used Dynamic SQL with the TableList table of table names. As a very simple example, this looked like:

    DECLARE @TableName varchar(500)

    DECLARE @curTable CURSOR

    DECLARE @sql NVARCHAR(1000)

    SET @curTable = CURSOR FOR

    SELECT [Name] FROM Vehicles_LookupTables.dbo.AllStockTableList

    OPEN @curTable

    FETCH NEXT

    FROM @curTable INTO @TableName

    WHILE @@FETCH_STATUS = 0

    BEGIN

    SET @sql = 'SELECT ''' +@TableName + ''', Make, sum(1) as Total FROM ' + @TableName + ' GROUP BY Make'

    PRINT @sql

    EXEC sp_executesql @sql

    FETCH NEXT

    FROM @curTable INTO @TableName

    END

    CLOSE @curTable

    DEALLOCATE @curTable

    GO

  • Daryl Lloyd (4/16/2012)


    ...

    b) use a view to union all the tables together and query that instead [drawbacks - our tables are stored in separate database instances and cannot be indexed, plus the resulting view is something like 600 million records long by 125 columns wide, so is incredibly slow]

    ...

    Don't outright dismiss the partitioned view option without giving it a try. In this case it's called a Distributed Partitioned View, because it unionizes multiple tables across multiple databases, and it's been proven to work in many scenarios.

    Distributed Partitioned Views / Federated Databases: Lessons Learned

    http://blogs.msdn.com/b/sqlcat/archive/2007/06/20/distributed-partitioned-views-federated-databases-lessons-learned.aspx

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho


  • Eric M Russell (4/18/2012)

    Don't outright dismiss the partitioned view option without giving it a try. In this case it's called a Distributed Partitioned View, because it unionizes multiple tables across multiple databases, and it's been proven to work in many scenarios.

    Distributed Partitioned Views / Federated Databases: Lessons Learned

    http://blogs.msdn.com/b/sqlcat/archive/2007/06/20/distributed-partitioned-views-federated-databases-lessons-learned.aspx

    We make limited use of this type of partition, and, as you say, it certainly does work. However, I think the problem in our situation is that we have to union 20-odd tables each containing 30 million plus records and (for some reason that I cannot recall at the moment) are unable to make use of any indexes. So it produces the right answer but at a terribly slow speed. I estimate that it takes roughly 5 to 10 times longer to run a unionised view query in comparison with a set of sequential queries on separate tables.

    So I suspect that this type is view is very useful, but only the in the right context. Like most things in life, I suppose ;-). I will have a look at the link above, though, to learn a it more about Distributed Partitioned Views (ands perhaps I'll find something key that we've be doing wrong!).

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply