UPSERT USING SSIS 2005

Question

UPSERT USING SSIS 2005

rka

SSCommitted

Points: 1599
More actions
August 4, 2009 at 11:59 pm

#140564

While building the datawarehouse solution for my company, I have frequently come across the situation where I have to update and Insert data from our ODS to the Datawarehouse. Even though we can achieve this via Slowly Changing Dimensions Task or Custom Scripts, attached is a simple tutorial that outlines the procedures that can also be followed to implement this
Have fun

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

Jack Corbett SSC Guru Points: 184394 More actions · Answer 1

Why not submit this to SSC as an article? The forums are usually used for questions.

Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question

GSquared SSC Guru Points: 260824 More actions · Answer 2

Forums are fine for whatever, but I second the idea of submitting it as an article.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

John Rowan SSC Guru Points: 56440 More actions · Answer 3

For what it's worth, I third the motion to see this in an article. Have you done any time/performance comparisons on this method versus the SCD transformation?

John Rowan

======================================================
======================================================
Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 4

Yep! Fourth here! Add a bit of a write up to go with all the great pictures and submit that bad boy as an article. I'm sure that it'll become one of those articles that all the big dogs direct folks to. Nicely done.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

John Rowan SSC Guru Points: 56440 More actions · Answer 5

How are you handling when more than one column's value is updated?

John Rowan

======================================================
======================================================
Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

J Livingston SQL SSC Guru Points: 51550 More actions · Answer 6

John Rowan (8/5/2009)
How are you handling when more than one column's value is updated?

I am interested in this as well...I import files where potentially any one (or many) of 40 + columns may have changed....I have no prior knowledge of which column(s).

many thanks Graham

________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 7

Hey there Ritz... I sure do hope you're still around because I need some help that your very good Word document just isn't helping me with. Would you take a look at the following post of mine and see if you can lend a hand? I sure would appreciate it. Thanks

http://www.sqlservercentral.com/Forums/Topic858273-148-1.aspx

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

wschampheleer SSCertifiable Points: 5504 More actions · Answer 8

For those on SSIS 2008: you might consider using the MERGE statement offering you less work and better performance.

Basically, your package will look like this:

Data Flow Task: Read Data From Source and write it to staging table.

Execute SQL Task: MERGE INTO final table USING staging table.

As a side note: beware of Columns that can be NULL when comparing them in the Conditional Split!

Regards,

Willem
http://wschampheleer.wordpress.com[/url]

kummu123 SSC Journeyman Points: 81 More actions · Answer 9

I am using Slowly changing dimension wizard to load my staging database on a daily basis...the prb is while im running my packg with large tables(like 100 thousand records) it takes lot of time ...is there any way to speed up pakg...plz help....

wschampheleer SSCertifiable Points: 5504 More actions · Answer 10

This is a known problem with the SCD transform.

Your alternatives are:

- use the Kimball Method SSIS Slowly Changing Dimension Component[/url]

- instead of doing the updates via OLE DB CMD, use an OLE DB Destination to write to an intermediate table and do the update using an Execute SQL Task in the Control Flow (set based instead of RBAR, well explained in this book).

- do not use a wizard but use lookups and conditional splits (also explained in the above book)

- when using SSIS 2008: use the merge statement (my preferred approach) - here is a nice (but unfortunately rather incomplete) article to get you started

Regards,

Willem
http://wschampheleer.wordpress.com[/url]

MyDoggieJessie SSC-Forever Points: 44486 More actions · Answer 11

I have to ask this question:

In poorly designed tables that have 400+ columns (but luckily do have a unique guid as a PK), wouldn't it just be easier to check to see if the uid exists in the table, if it does DELETE it, if not INSERT it?

Rather than having to perform and update on various columns (i.e. 1 col - 400+ cols)???

______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

rka SSCommitted Points: 1599 More actions · Answer 12

Instead of using the conditional split to explicitly determining the columns that have been changed, you can use a SQL Command with an UPDATE statement to the entire table

i.e.

UPDATE TableName

SET

Column1 = ?,

Column2 = ?,

Column3 = ?,

Column4 = ?,

Column5 = ?

...etc

WHERE ID = ?

And complete the mapping

This will take care of any updates to the columns

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 13

MyDoggieJessie (4/11/2011)
I have to ask this question:
In poorly designed tables that have 400+ columns (but luckily do have a unique guid as a PK), wouldn't it just be easier to check to see if the uid exists in the table, if it does DELETE it, if not INSERT it?
Rather than having to perform and update on various columns (i.e. 1 col - 400+ cols)???

I didn't know that was an option. My answer would be "Yes, most definitely".

Just be aware that ALL DELETE and INSERT triggers will fire and ALL indexes will be updated twice.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)