August 25, 2009 at 9:32 am
Rowan (8/23/2009)
Please note that you don't necessarily need to use staging tables because of complex queries. Moving a package that uses a staging table to not needing one should not change the complexity of your source queries, and any comparison that you would do in T-SQL using staging tables can be done inside SSIS memory without using the database files (much).
Hi John
I don't really understand how I would use SSIS memory instead of staging tables (database files).
Is there a certain way of doing this?
Regards
Elliot has answered correctly. That is exactly what I meant. I see many folks using staging tables in the database as part of their SSIS ETL processes when, most of the time, they are not needed. SSIS is a full ETL (Extract-Transform-Load) tool, where you have the ability to perform all of the data transformations before loading the data into the database.
August 25, 2009 at 10:18 am
Thanks.
CEWII
August 25, 2009 at 4:18 pm
Thanks Guys for all the replies.
I am currently changing one of my more complex existing packages that uses staging tables over to mainly using transformations. There are times when staging tables are needed, but these are few and far between, and by using the transformations I can now reduce the use of staging tables substantially.
Thanks again.
Regards
September 4, 2009 at 7:34 am
Hi
I have just one more question, I hope
I have now implemented a complex package without using staging tables and it works perfectly.
I am now busy with another package which is less complex but uses a large volume of data.
I draw data from a DB2 database (24 million records).
Is it best to pull this large amount of data into a staging table first, or is it best to use the DataReader Source in the pipeline from the get-go?
Thanks
September 4, 2009 at 7:42 am
Probably depends on what you are doing with the data after you suck it out of DB2. If you are just shoving it into a final resting spot, perhaps with some simple transforms, I wouldn't use a stage. I might also look for options that allow for effective parallelization of the stream.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
September 4, 2009 at 7:45 am
I have a couple of questions myself..
1. How often do you run this process (or will you)?
2. Do you need to pull all 24M records EVERY time?
3. If you need all the records, can you build a delta process to only get the new or updated ones?
I just can't see pulling 24M rows each time this runs unless it is 24M new or updated records. I have to question the underlying plan.
CEWII
September 4, 2009 at 8:04 am
The problem with using delta operations is that I am drawing from a DB2 database, and SSIS only allows a Lookup transformation against a SQL Server source to do an incremental load.
I will only need to draw the data once a month.
September 4, 2009 at 8:11 am
Is there some field in the data that reliably shows an update date? If so, then keep track of the last date and use it in the query. You don't have to use a lookup in this case, at least not at the beginning; if you use one there, then you have to suck all 24M rows down, and at that point you have already lost the battle.
We can also take things we know into account: say there are 5-15K records added each day, and they have a process date that shows the day they were processed. Also, we know that records in this table are NEVER changed. That gives us a basis to look at the data we already have, get the max process date, and then ONLY get the stuff after that. We have just limited our pipeline to 5-15K records using nothing but what we know. Clear?
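As a sketch of what I mean (the table and column names here are made up, so adjust to your schema):

-- High-water mark from the data we already have on the SQL Server side.
DECLARE @LastProcessDate datetime;

SELECT @LastProcessDate = MAX(ProcessDate)
FROM dbo.MyLocalCopy;

-- The DB2 source query then only asks for rows past that mark,
-- so only the 5-15K new rows ever cross the wire:
-- SELECT * FROM MYSCHEMA.MYTABLE WHERE PROCESS_DATE > <last process date>

In SSIS you can feed that date into the source by building the query string in a package variable with an expression and pointing the source component at the variable.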
I really think your goal should be to cut down the amount of data over the wire. I've dealt with about as much data as you are talking about, in DB2 and pulling it down to SQL Server. I had to find a way to tell what had changed to get any kind of reasonable performance.
CEWII
September 9, 2009 at 1:48 pm
Is your database in simple recovery mode?
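You can check with something like this:

SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'YourDatabaseName';  -- substitute your database name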
__________________________________________________________________________________
SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables
Persisting SQL Server Index-Usage Statistics with MERGE
Turbocharge Your Database Maintenance With Service Broker: Part 2
September 9, 2009 at 4:15 pm
No, it's in full recovery mode.
September 9, 2009 at 5:15 pm
Rowan (9/9/2009)
No, it's in full recovery mode.
Best if you switch it to simple recovery then; it doesn't look like you are taking transaction-log backups (are you?), and switching to SIMPLE recovery will help keep your log-file size in check, while also alleviating some of the performance issues you are having.
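For example (substitute your actual database name):

ALTER DATABASE YourDatabaseName SET RECOVERY SIMPLE;

After the switch you may want to shrink the log file once to reclaim the space it has already grabbed.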
__________________________________________________________________________________
SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables
Persisting SQL Server Index-Usage Statistics with MERGE
Turbocharge Your Database Maintenance With Service Broker: Part 2