September 1, 2011 at 1:57 am
Hi, I have this problem: which of these two ETL approaches should I use?
I know that Truncate and Load erases all records from the table and loads all records from the source into it.
And Slowly Changing Dimension inserts records if they are new and updates records if they have changed.
But I don't know which is faster.
Thanks and regards
September 1, 2011 at 2:11 am
TRUNCATE TABLE removes the data by deallocating the data pages used to store the table's data, so it is pretty fast. I am not sure what you mean by "Load".
Regards,
Iulian
September 1, 2011 at 3:16 am
Actually I am going to create an Integration Services package (an ETL), and I don't know which approach is better to use and which will finish faster.
Specifically, my source is a table on an ERP server (which contains millions of records) and my destination is a replica of that table on the data warehouse server.
September 1, 2011 at 3:49 am
How many tables, how many rows, and how many rows have been changed since the last load?
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
September 1, 2011 at 4:34 am
In the slow method you don't mention deleting records that are no longer present in the source; presumably you have to do that - or is nothing ever deleted?
Whether truncate followed by load from source is quicker than insert new, update changed, and delete missing from source will depend on how many records are the same, how many are changed, how many are deleted, and how many are new; on what indexes exist on the target table; on whether it's a heap or clustered; and on what sort of page splits (if any) will be caused by deletes and updates.
Unless you give some sort of estimates for those numbers, the only answer anyone can give you is "it depends", and even with the numbers it may not be obvious without details of the schema. In fact your best bet is probably to run some tests: try doing truncate and load and see how long it takes; then try doing it with a MERGE statement and see how long that takes. But don't believe the results of a single comparison - do enough tests to give you an idea of speed over the range of mixes of create/update/delete/leave-alone that you are likely to encounter in real life.
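For reference, a MERGE test along those lines might look like the sketch below. All table and column names here (dbo.DimCustomer, erp.Customer, CustomerID, and so on) are placeholders - substitute your own target, source, and business key:

```sql
-- Sketch of the single-statement alternative to truncate-and-load:
-- update changed rows, insert new rows, delete rows missing from source.
MERGE dbo.DimCustomer AS tgt
USING erp.Customer AS src
    ON tgt.CustomerID = src.CustomerID
WHEN MATCHED AND (tgt.Name <> src.Name OR tgt.City <> src.City) THEN
    UPDATE SET tgt.Name = src.Name,
               tgt.City = src.City
WHEN NOT MATCHED BY TARGET THEN
    INSERT (CustomerID, Name, City)
    VALUES (src.CustomerID, src.Name, src.City)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;
```

Time this against the truncate-and-load package over several representative mixes of changed and unchanged rows before drawing any conclusion.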
Tom
September 1, 2011 at 3:06 pm
From a design point of view they really have different purposes.
But to answer your question, the built-in SSIS SCD transform is a dog and not generally recommended. There is a much better one on CodePlex from the Kimball Group. Pragmatic Works has also taken the Kimball SCD and tweaked it a bit to turn it into a commercial product.
So back to basics. Do you need or want to maintain any historical data in the table? I suspect the answer is no because you are considering truncate. If you need to maintain historical data that will exclude a truncate and force you into some sort of SCD transformation. Whether you use the existing one because you don't really have to worry much about scale in your environment, or get a 3rd party one, or even build your own it doesn't matter. If you need to maintain historical data you need to have something that will do that.
So that brings us to the third possibility: you don't need to maintain historical data and you are asking if a truncate and insert will be faster than an incremental load. I would generally say that an incremental load would be the way to go because it will absolutely scale better. An incremental load, however, would require some lookups or merge joins and some conditional splits, as well as two destinations (an update destination and an insert destination), so it is a bit more complicated than an Execute SQL Task with a truncate command followed by a data source and a data destination.
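In T-SQL terms, the lookup/conditional-split pattern with two destinations boils down to something like the following sketch. The names (stg.Customer as the staged source, dbo.DimCustomer as the target, CustomerID as the key) are invented for illustration:

```sql
-- "Update destination": rows that exist in both but have changed.
UPDATE tgt
SET    tgt.Name = src.Name,
       tgt.City = src.City
FROM   dbo.DimCustomer AS tgt
JOIN   stg.Customer    AS src ON src.CustomerID = tgt.CustomerID
WHERE  tgt.Name <> src.Name OR tgt.City <> src.City;

-- "Insert destination": rows present in the source but not the target.
INSERT INTO dbo.DimCustomer (CustomerID, Name, City)
SELECT src.CustomerID, src.Name, src.City
FROM   stg.Customer AS src
WHERE  NOT EXISTS (SELECT 1
                   FROM   dbo.DimCustomer AS tgt
                   WHERE  tgt.CustomerID = src.CustomerID);
```

In the SSIS data flow the WHERE clauses correspond to the conditional split, and the join/NOT EXISTS logic corresponds to the lookup against the target table.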
Without trying to be insulting, it sounds like you need to do a bit of reading on how to use the different SSIS transforms so you can figure out the best approach. Comparing a truncate to an SCD is comparing apples and oranges, because their fundamental purposes are different.
September 1, 2011 at 7:26 pm
Thanks A LOT!
I would go with the Slowly Changing Dimension. Some of our applications need to look up that table on the data warehouse, and we have had issues where we can't get data while the job is running.
September 2, 2011 at 4:57 am
DELETE is a logged operation on a per-row basis. This means that the deletion of each row gets logged and the row is physically deleted. You can DELETE any row that will not violate a constraint, while leaving the foreign key or any other constraint in place.
TRUNCATE is also a logged operation, but in a different way. TRUNCATE logs the deallocation of the data pages in which the data exists. The deallocation of data pages means that your data rows still actually exist in the data pages, but the extents have been marked as empty for reuse. This is what makes TRUNCATE a faster operation to perform than DELETE.
You cannot TRUNCATE a table that is referenced by a foreign key constraint. You will have to remove the constraint, TRUNCATE the table, and then reapply the constraint.
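That drop/truncate/re-add sequence looks roughly like the sketch below; the table and constraint names are invented for illustration:

```sql
-- Drop the foreign key that references the table to be truncated.
ALTER TABLE dbo.OrderLine DROP CONSTRAINT FK_OrderLine_Order;

-- Now the referenced table can be truncated.
TRUNCATE TABLE dbo.[Order];

-- Reapply the constraint afterwards.
ALTER TABLE dbo.OrderLine
    ADD CONSTRAINT FK_OrderLine_Order
    FOREIGN KEY (OrderID) REFERENCES dbo.[Order] (OrderID);
```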
TRUNCATE will reset any identity column to its default seed value. This means that if you have a table with an identity column seeded at 1 and you have inserted 264 rows, your last record will have the value 264 in its identity column. After TRUNCATEing the table, when you insert a new record into the empty table, the identity column will have a value of 1. DELETE will not do this: in the same scenario, if you DELETEd your rows, then when inserting a new row into the empty table the identity column will have a value of 265.
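You can see the difference with a small throwaway table (a minimal sketch; the table name is a placeholder):

```sql
CREATE TABLE dbo.IdentityDemo (ID INT IDENTITY(1,1), Val CHAR(1));

INSERT INTO dbo.IdentityDemo (Val) VALUES ('a'), ('b'), ('c');  -- IDs 1, 2, 3

DELETE FROM dbo.IdentityDemo;
INSERT INTO dbo.IdentityDemo (Val) VALUES ('d');
-- New row gets ID 4: DELETE does not reset the identity seed.

TRUNCATE TABLE dbo.IdentityDemo;
INSERT INTO dbo.IdentityDemo (Val) VALUES ('e');
-- New row gets ID 1: TRUNCATE resets the identity seed.
```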
Regards
Raju