SSIS Design : Pulling data from a set of tables

  • hi

    we are migrating a datawarehouse application from asp - sql 2000 to dotnet2.0 - sql 2005

    currently we are in the process of coming up with an optimum solution for backend design using sql server 2005.

    This application pulls data from 3 different source systems (oracle,sql server and mainframe db2).

    we have 3 different staging databases corresponding to each of the source application database.

    For eg : RetailGarments application is an oltp application with backend oracle database .

    we have a staging SQL server database called Retailgarments_stg for our datawarehouse application.

    similarly OnlineTravelBooking is an OLTP application with DB2 database .we have corresponding

    OnlineTravelBooking_Staging database for staging this data.

    Each of these source systems have more than 100 tables and we are currently pulling most of the

    tables to our side from the source .

    These staging data from different datasources are accumulated in to a cleansed Schema database

    which is used for Reporting and other OLAP requirements.

    Our question is related to data pull from the source system.

    Current SQL Server database pulls these data from sorce system using Stored proc dynamic queries containing

    link server openquery.

    We would like to improve the performance as part of sql server 2000 to sql 2005 migration

    and would like to use SSIS features for this .

    Stored Proc approch that we currently use for SQL Server 2000 helps us to dynamically frame the query

    through driver tables .(looping through the list of tables for each source system and framing corresponding

    table query using appropriate columns)

    As part of new SQL Server 2005 migration , we were planning to use Source - Destination data transformation methods,

    Can we use foreach loop or for loop for this ?

    Is there any better method for this ..?

    AIM : If we have 90 tables for a source system, we would like to iterate and pull data for 30 tables each

    (3 data pull each having 30 tables) as 3 seperate parallel data pulls.

  • If you're upgrading to 2005, not 2008, I recommend using your Procs as the Source in SSIS. There's better performance and less overhead than doing a table to table pull.

    What happens is that in 2005, a table pull actually "pulls" all the data from all columns before it filters it based on the choices made in the Source editor.

    Now, if you want to pull all columns and all rows, this performance hit doesn't matter because you're pulling it all anyway. But if you're filtering columns and rows, use a stored procedure method. Not only is it more secure than putting the code in the Source for just anyone to see, but the proc can take advantage of stored execution plans and improves performance by only pulling the requested data down to SSIS.

    This issue may have been fixed in 2008 SSIS. Or at least improved upon. But essentially, in SSIS T-SQL is still the highest performing option compared to using most of the tasks and transformations.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply