July 17, 2013 at 1:24 am
Question :1
Server : USING sql server 2008 R2
a) I have "prestaging" table as source table.
b) I kept working table and History table as other set.
c) The data from "prestaging" tables will move to both "working tables" and "History tables".
d) The working table record have few validation is there after validated then all the successful record will move to the "TARGET" table.
d) There is possibility of the old data can get amended from source system and then they will send the amended record in the "Prestaging tables " with the UNIQUE no column as primary key.
d) Before inserting in to target table I need to compare the history table if the same record is appearing again based on the unique column then I need to Update "Modeflag" column as update record else the record will be send as "Inserted"
d) There is possibilities of comparison of history table will have more than 500000 of records.
d) Currently we are comparing with inner join with UNIQUE no as clustered index column and then updating the mode-flag .
we required best possibilities way to avoid performance issue to handle this situation .
Kindly provide the best solution to compare the history record .
Note :THE BOVE ALL LOGIC WRITTEN ON ONE STROED PROCEDURE AND THEN SQL JOB WILL EXECUTE ON EVERYDAY.
July 17, 2013 at 1:37 am
1. If everything needs to be in a stored procedure, why ask your question in an SSIS forum?
2. No caps. There's no need to shout.
3. Don't use lakhs. This is an international forum. More people will understand your question if you say 500000.
4. I would use a LEFT OUTER JOIN with the history table to find out which rows are updates or not.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
July 17, 2013 at 2:44 am
Over the period the no of rows can increase more than 500000 , whether LEFT outer join will increase the performance or any other option are available?
Is it Partition of table will work out ? Please advice.
July 17, 2013 at 3:27 am
If you properly index the tables according to the join predicates, 500000 rows shouldn't be an issue.
On what would you partition the table? On the unique key? That wouldn't make much sense.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
July 17, 2013 at 7:16 am
Hi,
The record is not 500000 only it can grow like up to 10000000 etc..
Please suggest the best practice to avoid performance issue for the below which one I can prefer.
1) Partitioning based on the Duration (Year or month etc..)
2) Is that instead of Left outer join can I use Inner join for comparison? I feel inner join is the best practice isn't it?
3) Merge statement will it be useful for Comparison and update the record.
I appreciate if you suggest the best possible method
Regards
Anand j
July 17, 2013 at 2:03 pm
1. If you are already joining on the Unique No column, which is the clustered index, I don't see why partitioning the table on another column would improve performance (if it does, please enlighten me, always ready to learn)
2. Why would inner join be more best practice than a left join? A join is a join and you use the right type of join to get the results you want. A left join tells you which rows are updates (join matched) and which rows are inserts (join did not match, so columns of right table are NULL). An inner join only tells you which rows of the left table are present in the right table, so it gives you only the updates.
3. MERGE will take your source table and merge it into your destination table. Your scenario however is that you have a source table, a destination table and a lookup table (the history table). The lookup table tells you which rows are updates and inserts, so you kind of have a different scenario.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
July 17, 2013 at 11:44 pm
HI,
Thanks .
Still i have few concern on your clarification .
1) Your assumption unique no column does not have clustered index . The unique column is common column between both (Working table) and (History Table).We are using "unique no" column as common connector for both the tables .
2) My Requirement is to compare the UNIQUE column exist in working table is available in History table or not . Working table data will have always fresh set of data. If unique no exists I will update the "TARGET TABLE " modeflag column as updated record.
In this scenario still you prefer left outer join or inner join . Your valuable suggestion are welcome.
I appreciate the quick response on the above query .
Thanks once again.
July 18, 2013 at 12:58 am
In your first post you said Unique No is the primary key. By default in SQL Server this is a clustered index, unless you specify otherwise at creation time.
Can you give some table definitions and sample data, as described in the first link of my signature. That way it will be easier to find the correct SQL statements to use.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply