August 19, 2010 at 3:44 pm
Wanted to use lookup transformation for the following:
- Pull records from oledb source
- Lookup in oledb destination first to see if record exists
- if record does not exist, send to destination as insert.
The result just dumps everything into the destination when the Destination is Empty(In my case it is a blank table in the starting)
I have tried no cache option also but thats not working too...as there are duplicates...
I don't want any duplicates in my loookup column....
PLEASE HELP...THIS IS KILLING ME
August 19, 2010 at 6:32 pm
The only solution I have found till now is this
I am sending this data to a temp table instead of sending it to a Destination table and then I have created one more data flow task for doing SELECT DISTINCT FROM TEMP TABLE.
Let me know if there is a better way to do this.
August 20, 2010 at 12:34 am
Where exactly are there duplicates?
In your source or at the destination?
If there are duplicates at the source, performing a lookup while you are writing to the table destination will not work perfectly, as you are inside a transaction.
I think the best method is making sure you write a good SQL SELECT statement to get the data in the OLE DB Destination so that there are no duplicates to begin with.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
August 20, 2010 at 9:43 am
This is simple..
You have to use a SORT transformation and de-duplicate on LOOKUP columns.
You can remove duplicates either before using lookup Tx or after that before inserting into your destination.
Thanks [/font]
August 20, 2010 at 11:10 am
I am sorry ssis learner..But, I didn't understand how Ican achieve this by sort transformation
August 20, 2010 at 11:13 am
da-zero (8/20/2010)
Where exactly are there duplicates?In your source or at the destination?
If there are duplicates at the source, performing a lookup while you are writing to the table destination will not work perfectly, as you are inside a transaction.
I think the best method is making sure you write a good SQL SELECT statement to get the data in the OLE DB Destination so that there are no duplicates to begin with.
That is actually one of the worst things you can do (it works though), as the SORT component is a blocking component. If your source contains 1 million rows, you'll have to load them all in memory and then sort them. There are much better alternatives that don't hit performance so hard.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
August 20, 2010 at 11:15 am
SSIS.COM (8/20/2010)
I am sorry ssis learner..But, I didn't understand how Ican achieve this by sort transformation
There is an option at the SORT transformation to remove duplicates. But see my previous post to see why you should avoid this transformation.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
August 24, 2010 at 12:58 pm
I have done this before with SSIS 2005, it is easier with SSIS 2008..
In the Lookup comopnent for 2005 you have 2 outputs, found and error. In 2005 I have generally used the error to indicate NOT found and use that for inserts, where I use the found as updates. You can get it to do this by setting the error as redirect.
Is that clear?
In SSIS 2008 there is three outputs, found, not found, and error.. much better.
CEWII
August 24, 2010 at 2:57 pm
Thanks to Elliott and everyone who reponded.
Here is a brief description of my problem
Initially my destination(reference) table is empty... I am trying to load data from my staging table into a Customer Dimension table(dimension table so, each customer will have only one record with their customer no, name etc )...
Here is my sample input data from staging
custName CustNumber Custemail CustLocation CustLocNumber
A 1 XX H 1
A 1 XX D 2
B 2 YY M 3
B 2 YY C 4
C 3 ZZ B 5
And Here is my customer Dimension Table(My destination&Refernce Table)(and this is empty)
CustName CustNumber Custemail
So, when I look up CustNumber in both these table since there is no data in the reference table everything goes to the NO MATCH OUTPUT. Since these are new records I want to insert these records into the Destination(Ref) table. But, when I do this all my records are getting inserted into the Dim table like
custName CustNumber Custemail
A 1 XX
A 1 XX
B 2 YY
B 2 YY
C 3 ZZ
IS there a way I can get a result in Dimension table like
custName CustNumber Custemail
A 1 XX
B 2 YY
C 3 ZZ
Let me know if I am confusing you again.
August 24, 2010 at 3:51 pm
I'd probably change my select query from the staging and do a distinct and only include the fields I needed. This should weed out dupes and make the database engine to the work for you without making SSIS do it.
CEWII
August 25, 2010 at 7:22 am
Thanks Elliott...I think that is the best solution.....
August 25, 2010 at 9:31 am
you are very welcome, glad I could help.
CEWII
December 4, 2012 at 2:46 am
SSIS.COM (8/24/2010)
Thanks to Elliott and everyone who reponded.Here is a brief description of my problem
Initially my destination(reference) table is empty... I am trying to load data from my staging table into a Customer Dimension table(dimension table so, each customer will have only one record with their customer no, name etc )...
Here is my sample input data from staging
custName CustNumber Custemail CustLocation CustLocNumber
A 1 XX H 1
A 1 XX D 2
B 2 YY M 3
B 2 YY C 4
C 3 ZZ B 5
And Here is my customer Dimension Table(My destination&Refernce Table)(and this is empty)
CustName CustNumber Custemail
So, when I look up CustNumber in both these table since there is no data in the reference table everything goes to the NO MATCH OUTPUT. Since these are new records I want to insert these records into the Destination(Ref) table. But, when I do this all my records are getting inserted into the Dim table like
custName CustNumber Custemail
A 1 XX
A 1 XX
B 2 YY
B 2 YY
C 3 ZZ
IS there a way I can get a result in Dimension table like
custName CustNumber Custemail
A 1 XX
B 2 YY
C 3 ZZ
Let me know if I am confusing you again.
I have slightly different requirement here.May be you people can help me out.
Initially my destination(reference) table is empty... I am trying to load data from my staging table into a Customer Dimension table(dimension table so, each customer will have only one record with their customer no, name etc )...
Here is my sample input data from staging
custName Custid Type
A XX1 Hid
A XX Did
B YY1 Hid
B YY Did
C ZZ1 Hid
And Here is my customer Dimension Table(My destination&Refernce Table)(and this is empty)
CustName Custid
So, when I look up CustNumber in both these table since there is no data in the reference table everything goes to the NO MATCH OUTPUT. Since these are new records I want to insert these records into the Destination(Ref) table.
IS there a way I can get a result in Dimension table like mentioned below ?
custName Hid Did
A XX1 xx
B YY1 yy
C ZZ1
The logic would be as follow..
1.only one record should be in table
2.If type is Hid it should update the Hid column from Custid of staging and if type id Did it should update the same record in Did column with Custid.
3.Need to iplment this using Look up only.
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply