April 15, 2011 at 8:29 pm
Hi everyone,
Here is the situation and I need advice from you guys. Every month we receive a file from our client to update any information if they change. Here is the example.
Client_ID, Address,Phone
121,223 Main Street, 451-869
145,458 Jonson Ave, 284-897
121,224 MILLER AVE, 623-3125
Here is the information that I am getting from client File
Client_ID,ADDRESS,PHONE
121,223 MAIN STREET,123-123
145,458 JONSON AVE,284-897
121,224 MILLER AVE, 623-3125
Here Client_ID “121’ he changed his phone number and Client_id # ‘121’ has two houses different address and different phone number, I am using SCD Transformation in SSIS to accomplish this , Here is result that I am getting
CLIENT_ID, ADDRESS,PHONE
121,224 MILLER AVE, 623-3125
145,458 JONSON AVE,284-897
121,224 MILLER AVE,623-3125
Please guide me how I can solve this problem, I really appreciate your help.
Here is the sample tables for test if some one want to test,
Create table Source_File
(
ID VARCHAR(10),
ADDR VARCHAR(30),
PH VARCHAR(15)
)
GO
INSERT INTO Source_File
VALUES ('121','223 MAIN STREET','123-123')
INSERT INTO Source_File
VALUES ('145','458 JONSON AVE','284-897')
INSERT INTO Source_File
VALUES ('121','224 MILLER AVE','623-3125')
GO
Create table Target_File
(
ID VARCHAR(10),
ADDR VARCHAR(30),
PH VARCHAR(15)
)
GO
INSERT INTO Target_File
VALUES ('121','223 MAIN STREET','458-698')
INSERT INTO Target_File
VALUES ('145','458 JONSON AVE','284-897')
INSERT INTO Target_File
VALUE ('121','224 MILLER AVE','623-3125')
GO
April 17, 2011 at 11:55 pm
What are you using as a business key to uniquely identify a record?
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
April 18, 2011 at 5:00 pm
Thanks for reply, I am using ID for Business Key.
April 18, 2011 at 11:29 pm
tooba111 (4/18/2011)
Thanks for reply, I am using ID for Business Key.
Well, there is your problem. It is not a business key, as it does not uniquely identify a record. So of course you will get bad results from the SCD component.
You need to choose a combination of columns that garantuee to identify if a record is unique or not.
In this case that is probably a combination of ID and Address, or maybe ID and Phone.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
April 19, 2011 at 10:06 am
I am 100% agree with you, My source file doesn't have any Primary Key or unique identifier (It’s a flat file ) but my target table has Primary Key, I have two questions
a)I can use two Business key in SCD if yes, let say if there is some changes in source file with Business key columns that I selected it does change in my target field or not?
b)If I didn’t find any uniqueness to differentiate each columns, How I can add one columns in my source and one column in my target to differentiate each columns for example, Client_ID has “1” I can give him a value “212” how I can give same value “212” in my target Client_ID “1”, Original scenario I have more than 70k rows in my target and source.
Please let me know if my questions is not clear or you need more information. Thanks for your help.
April 20, 2011 at 12:40 am
The questions aren't clear 🙂
Can you elaborate? Preferably with sample input and desired output.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
April 21, 2011 at 8:18 am
Here is the information that i have in my target table
Client_ID, Address,Phone
121,223 Main Street, 451-869
145,458 Jonson Ave, 284-897
121,224 MILLER AVE, 623-3125
Here is the information that i am getting from client source file
Here is the information that I am getting from client File
Client_ID,ADDRESS,PHONE
121,223 MAIN STREET,123-123
145,458 JONSON AVE,284-897
121,224 MILLER AVE, 623-3125
Note:- In client file Cliennt_ID "121" with "223 Main Street" change his Phone Number,
I am using SCD Transformation (Type 2) to update.
Know my questions are
a) I am using Client_ID as a "Business Key" and i am getting this result
121,224 MILLER AVE, 623-3125
145,458 JONSON AVE,284-897
121,224 MILLER AVE,623-3125
That i don't want,
I tried to use two Business kay, "Client_ID" and "Address" and its taking forever. I am not sure i can use two Business Key in SCD Tranformation?
2) If any one has better approach to accomplish this issue it would be great help for me.
3) If i use "Checksum Transformation" its that helpful ? and how i can configure Checksum Transformation?
Please let me know if still my questions are not clear i can try to explain as many time you want 🙂
April 21, 2011 at 11:35 pm
My advise:
drop the SCD component. It is notoriously slow, and not easy to configure.
Just write out some TSQL statements that will do the logic:
do a LEFT JOIN between source and target, and use Client_ID and Address in the ON clause.
Matching rows are updates, non-matching rows (aka with NULL columns) are inserts.
Write the inserts to the destination with an INSERT INTO statement.
Update the destination table with an UPDATE statement, using a join to determine the correct rows to update (you should have retrieved the primary key of the destination table in the first LEFT JOIN, so this join should be easy and performant).
Forget about the checksum transformation. That is used to distinguish between 'different' updates: actual updates, where some fields have changed, and rows where nothing has changed (but they are classified as updates, as they are not inserts and not deletes). You can use this to make your solution more efficient, but it doesn't solve your initial problem.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply