June 23, 2014 at 2:44 am
Hi Experts,
Something went wrong in our DB that allowed some of the data in the products table to be duplicated or triplicated. This in turn doubled or tripled the rows in two other huge fact tables for those products. Each of these fact tables has 1 billion records.
I am trying to delete the records for Friday, Saturday and Sunday from the fact tables. We have had this issue before and it took a very long time to delete the records. Currently I am trying to delete only the data from Saturday with the query below, and so far it has taken 1 hour 10 minutes.
delete from fact_table where date_id = xxxx
Please can you advise on what can be done to speed this up?
June 23, 2014 at 3:14 am
- Make sure you have the proper index (on date_id), so the query can find the records quickly.
- Delete in batches of 100,000 to 500,000 records (i.e. DELETE TOP (100000) FROM ...); see the sketch after this list.
- If the amount of data to be deleted is much larger than the data that will remain, you could:
>> 1. copy the data that will remain into a temporary table
>> 2. truncate the original table
>> 3. move the data from the temporary table back into the original table
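A minimal sketch of the batch approach in T-SQL (the index name and batch size are illustrative, and xxxx stands for the date_id placeholder from the original post; skip the index step if a suitable index already exists):

-- supporting index so each DELETE can seek on date_id instead of scanning 1B rows
CREATE INDEX IX_fact_table_date_id ON fact_table (date_id);

-- delete in batches so the transaction log and lock escalation stay manageable
DECLARE @rows INT = 1;
WHILE @rows > 0
BEGIN
    DELETE TOP (100000) FROM fact_table WHERE date_id = xxxx;
    SET @rows = @@ROWCOUNT;  -- loop ends once a batch deletes zero rows
END

Each iteration commits a small transaction, so the log can be reused between batches instead of growing for the whole multi-hour delete.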
June 23, 2014 at 3:49 am
Many thanks. The table is quite big and I'm not sure how to go about moving the data into a temp table and then back to the original table. However, I will try the batch delete and see how it goes.
Any advice on how I should go about such things in future?
June 23, 2014 at 4:01 am
rajsin7786 (6/23/2014)
Many thanks. The table is quite big and I'm not sure how to go about moving the data into a temp table and then back to the original table.
-- 1. copy the rows that will remain into a temp table
SELECT *
INTO #temp_table
FROM {original_table}
WHERE {filter on remaining data here}

-- 2. truncate the original table (note: TRUNCATE fails if the table is referenced by foreign keys)
TRUNCATE TABLE {original_table}

-- 3. move the rows from the temp table back into the original table
INSERT INTO {original_table}
SELECT *
FROM #temp_table
rajsin7786 (6/23/2014)
However, I will try the batch delete and see how it goes. Any advice on how I should go about such things in future?
Sure: :w00t: prevent the duplicate records in the first place :w00t:
You can do it on the front end (in the application/ETL/...) by adding additional checks, but you can also add a constraint on the table(s) that requires unique records, which stops duplicates at the database level.
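For example, a minimal sketch in T-SQL (the column name product_code is hypothetical; use whichever column(s) define a logically unique row in your products table):

-- hypothetical example: require each product_code to appear only once in products
ALTER TABLE products
ADD CONSTRAINT UQ_products_product_code UNIQUE (product_code);

With this in place, any insert that would create a duplicate fails immediately, instead of silently multiplying rows in the fact tables downstream.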