April 23, 2009 at 7:12 am
Hi, I'm very new to SSIS.
Table A (Source)

Sid | Mid | Fname | Age | Address
10  | 22  | Sam   | 23  | 34 Sector
52  | 24  | John  | 23  | 20 Sector
42  | 35  | David | 23  | 22 Sector
10  | 21  | Sam   | 23  | 34 Sector
10  | 22  | Sam   | 23  | 54 Sector

Table B (Destination)

Sid | Mid | Fname | Address

In Table B, Sid and Mid together form the primary key.
I want to insert values into "Table B" from "Table A" based on the primary key columns.
If a row does not exist in "Table B", I want to insert it from "Table A".
If a row already exists in "Table B", I want to update it with the values from "Table A".
I want to implement this in SSIS without using a cursor, because a cursor would hurt performance.
Which component can I use in SSIS, and how?
Please explain it step by step; I am very new to SSIS.
April 23, 2009 at 7:33 am
If both of your tables are in SQL Server, just issue the relevant INSERT and UPDATE queries using Execute SQL tasks ...
Or avoid SSIS completely and use a standard SQL job.
If your source is not on SQL Server, I would suggest loading all of the data into a SQL Server staging table and then doing the INSERTs/UPDATEs from there.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
April 23, 2009 at 7:39 am
Thanks for the reply.
Both tables are in SQL Server itself. How do I use an Execute SQL Task for this?
Please explain a bit more.
April 23, 2009 at 7:50 am
You really are new!
I suggest that you search around on the Internet for something like "Creating your first SSIS package" to find the sort of step-by-step detail that you are looking for.
Once you get an idea of the development environment and how things fit together, you will probably be able to work it out for yourself: you just drag an Execute SQL Task onto the Control Flow window and configure a few properties (the connection details and the SQL statement itself, at a minimum, IIRC).
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
April 23, 2009 at 9:56 am
hi,
i created Execute sql task and i retrieve value from source table.
i also tried with insert values into destination table foreach loop container
but i can t do that.
so,how can i completed this by Execute sql task
April 23, 2009 at 5:21 pm
You do not need to extract data into a holding area first - just do it all in one statement, eg (untested):
INSERT into [dbname].[schema].target(Field1, Field2, ... Fieldn)
SELECT Field1, Field2, ... Fieldn from [dbname].[schema].source Source
WHERE Source.PK not in (SELECT PK from [dbname].[schema].target)
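Since your key has two columns (Sid and Mid), a NOT EXISTS version of the same idea, using the table and column names from your first post, would look something like this (also untested):
INSERT INTO TableB (Sid, Mid, Fname, Address)
SELECT a.Sid, a.Mid, a.Fname, a.Address
FROM TableA a
WHERE NOT EXISTS
    (SELECT 1 FROM TableB b
     WHERE b.Sid = a.Sid AND b.Mid = a.Mid)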
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
April 23, 2009 at 9:47 pm
Hi,
Thanks for your reply.
I also tried the query below, because I have to check two primary key columns.
insert into destination (batch_id, First_Name, Last_Name, Sid, Mid)
select s.batch_id, s.First_Name, s.Last_Name, s.Sid, s.Mid
from Source s
where not exists
    (select 1 from destination f
     where f.Sid = s.Sid and f.Mid = s.Mid)
But there is a problem:
1) Suppose that, the first time, the destination table doesn't have any data, and the source table has both a new record and an updated record for the same person. Then it inserts all of those rows into the destination.
So I think I have to insert row by row.
How can I implement this?
April 23, 2009 at 9:58 pm
No need for row by row, wash your mouth out! 🙂 Just use an INNER JOIN on both fields for the update (run this first).
UPDATE T
SET Field1 = S.Field1, Field2 = S.Field2
FROM SourceTable S
JOIN TargetTable T ON S.Key1 = T.Key1 AND S.Key2 = T.Key2
Then use a LEFT JOIN on both fields to check for records that do not yet exist in the target table (the target PK field will be NULL) so that you know which records to INSERT.
INSERT TargetTable (Field1, Field2)
SELECT S.Field1, S.Field2
FROM SourceTable S
LEFT JOIN TargetTable ON S.Key1 = TargetTable.Key1 AND S.Key2 = TargetTable.Key2
WHERE TargetTable.Key1 IS NULL
All this code is untested, but hopefully you get the idea.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
April 24, 2009 at 12:23 am
Hi,
Your requirement is easily fulfilled in SQL Server 2008.
SQL Server 2008 introduces the MERGE statement, which handles the insert and the update in a single statement, so performance is also improved.
Try the following query.
MERGE INTO TableB AS Target
USING TableA AS Source
ON Target.Sid =Source.Sid AND Target.Mid =Source.Mid
WHEN MATCHED THEN
UPDATE SET Target.Fname = Source.Fname ,
Target.Age = Source.Age ,
Target.Address = Source.Address
WHEN NOT MATCHED BY TARGET THEN
INSERT (Sid,Mid,Fname,Age,Address)
VALUES (Source.Sid,Source.Mid,Source.Fname,
Source.Age,Source.Address);
-- PS: Don't forget the ';' (semicolon) at the end.
To execute the above query you need SQL Server 2008 or later; otherwise it will give a syntax error.
April 14, 2010 at 1:43 am
Hi,
A solution using only SSIS data flow components would look like this:
Add a Data Flow Task.
In the data flow, add an OLE DB Source for your Table A.
Connect it to a Lookup transformation.
Configure the Lookup to match against your Table B on your key columns (Sid and Mid).
The Lookup has up to three possible outputs:
match, no match, and error, if you have configured it that way.
Connect the no-match output to an OLE DB Destination (Table B).
Connect the match output to an OLE DB Command transformation for your update (see the parameterized statement sketched below).
That's all.
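For the update side, the OLE DB Command transformation runs a parameterized statement once per matched row; here is an untested sketch using the column names from the original post, with the ? placeholders mapped to input columns on the Column Mappings tab:
UPDATE TableB
SET Fname = ?, Address = ?
WHERE Sid = ? AND Mid = ?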
April 23, 2010 at 6:24 am
I agree with ashokdasari, the MERGE is what I would do in this particular situation. I actually use the MERGE to handle TYPE II SCD in my data warehouse scenario. Very quick and powerful.
April 23, 2010 at 6:35 am
Brandon Carl Goodman (4/23/2010)
I agree with ashokdasari, the MERGE is what I would do in this particular situation. I actually use the MERGE to handle TYPE II SCD in my data warehouse scenario. Very quick and powerful.
Unfortunately not everyone has SQL Server 2008 or later 🙁
Those people, including me, are stuck with creating the SCD ourselves.
@ lbouaziz: you can achieve your proposed solution more easily with the SCD wizard. (Even better, use the SCD Kimball wizard from CodePlex.) The only problem with your solution is that every update row gets issued against the database in a separate query. Not really smart performance-wise. What if you have to update one million records? One million different transactions against your database?
(and for the critics: yeah yeah, one million updates against a dimension is very unlikely. But what if it is an initial load and for some reason you have to run the package again? All updates...)
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
April 23, 2010 at 6:42 am
I apologize for my oversight, da-zero. I agree with you 100% about the functionality of the SCD components. How about using a checksum or even a HASHBYTES value to determine whether a record has changed? Similar to an incremental load.
April 23, 2010 at 6:55 am
Brandon Carl Goodman (4/23/2010)
I apologize for my oversight, da-zero. I agree with you 100% about the functionality of the SCD components. How about using a checksum or even a HASHBYTES value to determine whether a record has changed? Similar to an incremental load.
No need for apologies 🙂
Checksum or hashbytes? Hmm, it depends 🙂
It depends on what you are trying to do. If you only want to update the field that has changed, then a checksum is useless because you won't know which column has changed. If you update the entire row, then why not? But it comes with the overhead of calculating the checksums every time.
What I usually do is a left outer join between the staging table and the DWH table on the business key. If the surrogate key of the DWH row is NULL, then it is an insert; otherwise it is an update. I write the inserts to an OLE DB Destination with fast load and the updates to a temp table. Then I do one massive single update against the DWH table. Since you can use the surrogate key to find the records to update, it goes really fast (especially if you have an index on that surrogate key). This set-up works quite well for me 🙂
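The final set-based update then looks roughly like this (an untested sketch with made-up table and column names, where #upd_DimCustomer is the temp table holding the changed rows together with the surrogate key):
UPDATE d
SET d.Fname = u.Fname,
    d.Address = u.Address
FROM dbo.DimCustomer d
JOIN #upd_DimCustomer u
    ON d.CustomerSK = u.CustomerSK;  -- single update keyed on the (ideally indexed) surrogate key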
However, in SQL 2008 it is better and simpler to use the MERGE statement, as you already indicated.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
April 23, 2010 at 7:06 am
Very nice. I have had to implement the checksum before. What I did was build a checksum value over the values of the columns (in this case all of them) and performed an insert, update, or delete based upon that value. I like the setup in your last post. However, I believe that we may be drifting away from helping the fellow with his issue. 🙂 Pleasure discussing strategy with you.
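For reference, that kind of hash-based change detection can be sketched like this (untested, using the column names from the original post and assuming a hypothetical RowHash column has been added to Table B to store the hash; note that before SQL Server 2016, HASHBYTES accepts at most 8000 bytes of input):
UPDATE b
SET b.Fname = s.Fname,
    b.Address = s.Address,
    b.RowHash = s.RowHash
FROM TableB b
JOIN (SELECT Sid, Mid, Fname, Address,
             HASHBYTES('SHA1', ISNULL(Fname, '') + '|' + ISNULL(Address, '')) AS RowHash
      FROM TableA) s
    ON b.Sid = s.Sid AND b.Mid = s.Mid
WHERE b.RowHash <> s.RowHash;  -- only rows whose content actually changed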