January 6, 2009 at 1:15 pm
Good Day
I have used the Execute SQL Task to query a database with the Full Result Set option.
I now want to use this in my dataflow task for my OLE Source using Data Access Mode : Source from Variable
Is this possible
E.g
I want to run the query below once and obtain a result set.
select x from table1 where change = today
union
select x from table2 where change = today
...
select x from table10 where change = today
Then I want to be able to extract only records that are in the result set ( for my OLE DB Source)
So my query would be
select * from table1 where x in result set
Thus if a record changes in any of the tables (e.g table5), I want to extract the related data
from all other tables (ie table1 - table10)
I am not sure how to implement this or if this is the way to do this.
Thanks in advance
LoxyGold
January 9, 2009 at 12:45 pm
I'll say upfront, I haven't looked in to the Change Data Capture that's in SQL 2K8 but if you're running that version, you may well want to look in to that functionality for capturing row level change information.
The way I understand your question/issue, what you really need is to identify if there is a changed record in any table (say tables 1 thru 10), and then, based on the fact that *something* in one of those tables (not necc all) changed, you want/need to extract all related rows from tables 0 thru 10.
Assuming that's the case, and then assuming (possibly a big assuption) that you have say a key that is common across all tables, then you simply need to write a query (much like your union) such as
select distinct ID from
(
select ID from table_1 where change = today
union all
select ID from table_2 where change - today
union all .....
) thedata
by performing the distinct outer query, you get only one instance of ID even if it was changed in multiple tables.
Personally, i'd push the result of this to a temp table, but theoretically you could push it in to a recordset type object in (memory in) SSIS.
Then, for the extraction query, you simply need to either have 1 big query that spans all of your tables, or (say) 10 queries, one for each table.
Now, assuming that my assumption is incorrect (ie you *don't* have a key thats common across all tables) then you'll be doing basically the same thing, but in your 'did something change' queries, you'll need to start in the 'target' table and then join back to a base/common/source table to retrieve the ID. To explain this a little better, using a simple example of Customers, Orders and LineItems as tables. Customer and Order share a common key (Customer_ID) and Order and LineItems share a common key (Order_ID).
Querying Customers for changes is easy, simply select Customer_ID from Customers where change = today.
Querying orders is also easy - select distinct Customer_ID from Orders where change = today
Querying LineItems is a little longer but still easy - select distinct o.Customer_ID from Orders o inner join LineItem li on o.Order_ID = li.Order_ID WHERE li.change = today
The queries to get the changed table data then become (assuming i inserted the 'changed' Customer_ID's to a table 'tmp')
Customers - Select * from Customers where Customer_ID in (select Customer_ID from tmp)
Orders - select * from Orders where Customer_ID in (select Customer_ID from tmp)
LineItems - select * from LineItem where Order_ID IN (select distinct order_id from Orders where Customer_ID IN (select Customer_ID from tmp))
Obviusly, with 3 tables, it's pretty simple, but with a real world schema, it gets complex 🙂 Also, you could possibly/probably see performance improvements by utilising other methods for the where clauses (e.g. use joins instead of sub queries etc).
HTH,
Steve.
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply