January 19, 2009 at 2:08 am
Hi all,
i am new to ssis and have to build a package which could update a table from another.
We have one database for cleaning the new data and one database to store all oute data. So we need to import the data from the cleaning db to the store db. The problem here is that we could only import data which is not in the destination db. We can not use some date or so. The only chance we have is to use the primary key.
Currently i use the lookup transformation an insert only rows which cant be found.
Is this the only way to do such an import, or exists a better way.
mfg
flo
January 19, 2009 at 6:35 am
If you are using SSIS then the lookup component is the current method to do this. If the databases are on the same server you could get better performance by doing a set-based insert using either a LEFT JOIN or WHERE NOT EXISTS query like this:
Insert Into destination
Select
S.columns
From
source S Left Join
destination D On
S.primary_key = D.primary_key
Where
D.primary_key Is Null
Or
Insert Into destination
Select
S.columns
From
source S
Where
Not Exists (Select 1 From destination D
Where S.primary_key = D.primary_key)
There is some evidence that option 2 performs a little better.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
January 19, 2009 at 2:48 pm
Thanks for your replay. i need a very hight performance solution because i have to handle a destination database with ca. 7tb. so my tables are very large.
Could i use the lookup solution as well?
January 19, 2009 at 3:09 pm
The SSIS lookup solution will not perform as well as the set-based T-SQL solution.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
January 20, 2009 at 5:41 am
How can i build such a query in ssis?
January 20, 2009 at 6:35 am
You would use the Execute SQL Task in SSIS. You can either write the query there or create a stored procedure that you call from the Execute SQL Task (my preference).
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
January 20, 2009 at 6:51 am
Place tsql in a stored procedure; makes it easy for updating. Would also use a left join over and if exists approach.
select id from toTable t
left join fromTable f on
t.id = f.id
where f.id is null
January 20, 2009 at 7:01 am
if i say that i have to transform some data types you will killing me, right? 😉
January 20, 2009 at 7:08 am
florianb86 (1/20/2009)
if i say that i have to transform some data types you will killing me, right? 😉
I would say welcome to SSIS. Read up and use a data transform task.
January 20, 2009 at 7:13 am
yes i would, but then i can't use a stored procedure because my transformation output can't be an input for the stored procedure, right?
January 20, 2009 at 7:17 am
I would import into a staging table. Your SP will reference the staging table and final distination table. This will also help if the package fails; although the left join will prevent duplicated data.
January 20, 2009 at 8:39 am
i have build my stored procedure but i get always this error:
Msg 2627, Level 14, State 1, Procedure import_DB_PARAM_HIST, Line 14
Violation of PRIMARY KEY constraint 'PK__DB_PARAM_HIST__29ADC38E'. Cannot insert duplicate key in object 'dbo.DB_PARAM_HIST'.
here is my statement:
INSERT INTO [SAP_PERF_HIST].[dbo].[DB_PARAM_HIST]
SELECT.[PARTYPE] ,
.[HOSTNAME] ,
.[SYSTEMID] ,
.[PARNAME] ,
.[PARDATE] ,
.[OLDVALUE] ,
.[NEWVALUE] ,
@customerid AS customerid
FROM [PERF_SAP_CHECK_1].[dbo].[DB_PARAM_HIST] AS
WHERE NOT EXISTS(SELECT[D].[HOSTNAME] ,
[D].[SYSTEMID] ,
[D].[PARNAME] ,
[D].[PARDATE] ,
[D].[CUSTOMER_PERFORMID]
FROM [SAP_PERF_HIST].[dbo].[DB_PARAM_HIST] AS [D]
WHERE[D].[HOSTNAME] = .[HOSTNAME] AND
[D].[SYSTEMID] = .[SYSTEMID] AND
[D].[PARNAME] = .[PARNAME] AND
[D].[PARDATE] = .[PARDATE] AND
[D].[CUSTOMER_PERFORMID] = @customerid)
this statement will fail although the destination table is empty.
January 20, 2009 at 8:56 am
You can't have duplicate PKs. I think you want something like this.
INSERT INTO [SAP_PERF_HIST].[dbo].[DB_PARAM_HIST]
([PARTYPE] ,
[HOSTNAME] ,
[SYSTEMID] ,
[PARNAME] ,
[PARDATE] ,
[OLDVALUE] ,
[NEWVALUE],[CUSTOMER_PERFORMID])
select
[D].[HOSTNAME] ,
[D].[SYSTEMID] ,
[D].[PARNAME] ,
[D].[PARDATE] ,
[D].[CUSTOMER_PERFORMID]
FROM [PERF_SAP_CHECK_1].[dbo].DB_PARAM_HIST] AS [D]
left join [SAP_PERF_HIST].[dbo].[DB_PARAM_HIST] AS on
[D].[CUSTOMER_PERFORMID] = .[CUSTOMER_PERFORMID]
where
.[CUSTOMER_PERFORMID] is null
January 20, 2009 at 9:06 am
i found the problem. some one broke my source table and so i had duplicates. after cleaning this table my stored procedure had no errors.
January 20, 2009 at 9:13 am
I would compare the not exists query with a left join query. Exists should be used to start a process not compare data; imo.
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply