August 1, 2008 at 9:12 am
Hi All
I am moving around 400K records from one live database to another on a daily basis and then deleting them from the source table. While the delete is running, SQL Server blocks the table. I tried deleting records in batches of 100 using a WHILE loop (I know it's not ideal), but I end up in a deadlock situation. Is there any way to delete these 400K records without preventing the live table from being read or updated by the application?
Thanks in advance
August 1, 2008 at 9:33 am
My wild guess is that you are moving some identifiable chunk of data, like a month or something like that. Please correct me if I'm wrong.
If this is the case, I would range-partition the table so that I only have to "move" a single partition a day, then...
1- Move the data
2- Truncate the partition (SWITCH and MERGE, in SS2K5 terminology)
Done!
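For illustration, the SWITCH-and-MERGE step could look something like the sketch below; the partition function and the table names are just placeholders, not anything from the original post.
--===== A minimal sketch, assuming the live table is range-partitioned on a date
     -- column by partition function pfDaily, and that the staging table
     -- dbo.ArchiveStage has an identical structure on the same filegroup.

--===== 1) Switch the day's partition out of the live table.
     -- This is a metadata-only operation, so the live table stays available.
ALTER TABLE dbo.LiveTable
      SWITCH PARTITION 1 TO dbo.ArchiveStage

--===== 2) Remove the now-empty boundary so the partition function doesn't keep growing.
ALTER PARTITION FUNCTION pfDaily()
      MERGE RANGE ('2008-07-31')

--===== 3) Copy dbo.ArchiveStage to the other database and truncate it; the live
     -- table is never locked for the bulk move.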
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions in a test, non-production environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
August 1, 2008 at 10:04 am
I haven't done table partitioning, but I guess it can't be done in this case, since records are moved according to a certain range flag, say 1 to 10. I can't keep rows for a given flag in a particular partition, since that would affect performance. Is there any way I can do this with partitioning without compromising performance?
August 1, 2008 at 10:30 am
I suppose the value of that Flag column changes over time, doesn't it?... heavy updating?
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions in a test, non-production environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
August 1, 2008 at 11:01 am
Exactly, it changes every second. It is heavily updated, and that's the problem I am facing.
August 1, 2008 at 2:19 pm
I may just be grasping at straws, but could you change the isolation level and allow dirty reads (and possibly updates/deletes)? Or could you add a secondary ID column to use for the deletes, so the deletes are based on that secondary column, which is not being updated?
August 1, 2008 at 3:33 pm
I don't know if this will work for your situation, but I have implemented something similar using TOP.
I created a job to move records slowly from one table to another. First, it does the insert into the new table, capturing the PK for the records using OUTPUT similar to this:
INSERT INTO X
OUTPUT INSERTED.PK INTO @temptable
SELECT TOP (number) Table.*
.....
Then I do a delete from the live table, joining back to the @temptable to make sure I get the right records. I didn't put all the code in, but enough to hopefully give you an idea of how it works.
I schedule it to run during light times with short breaks in between, so that it doesn't hold locks for too long and other processes can run alongside it. The TOP limits the number of records processed per run, but doesn't necessarily guarantee that the records are moved in any particular order. But I only have to set up the criteria once, let 'er go, and eventually it will get all the records that need to be moved.
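To sketch the whole pattern end to end (the table names, key column, batch size, and filter below are just placeholders, not the real code):
--===== A minimal sketch of the INSERT ... OUTPUT / DELETE pattern, assuming a live
     -- table dbo.LiveTable and an archive table dbo.ArchiveTable with identical
     -- columns (and no IDENTITY column), keyed on an INT column called PK.
DECLARE @temptable TABLE (PK INT NOT NULL PRIMARY KEY)

--===== Copy one batch into the archive and capture the keys of the copied rows.
 INSERT INTO dbo.ArchiveTable
 OUTPUT INSERTED.PK INTO @temptable (PK)
 SELECT TOP (10000) *
   FROM dbo.LiveTable
  WHERE Flag = 1 --whatever criteria marks rows as movable

--===== Delete only the rows that were just copied, joining back on the captured keys.
 DELETE lt
   FROM dbo.LiveTable lt
   JOIN @temptable t
     ON t.PK = lt.PK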
Hope this helps!
Chad
August 2, 2008 at 10:13 am
Sometimes desperate situations require desperate solutions.
Here is a crazy idea.
Imagine you have to archive and purge all rows where flag = 1, while keeping rows where flag has any value from 2 to 10.
1- Create two tables
Table TABLE_FLAG_1 and TABLE_FLAG_2_10
2- Put all your data in TABLE_FLAG_2_10
3- Create a view YOURVIEW that exposes TABLE_FLAG_1 union TABLE_FLAG_2_10
Set your application to point to YOURVIEW
4- When you update a row and the new flag value is still in the 2 to 10 range, do a normal update.
When you update a row and the new flag value is 1, do an insert into TABLE_FLAG_1 and delete the old row from TABLE_FLAG_2_10. This gives you a pre-archiving process while all rows remain visible to the application through YOURVIEW.
5- Once a day move the content of TABLE_FLAG_1 to whatever place you are moving it today, then truncate TABLE_FLAG_1
Voila!
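A minimal sketch of that layout, with made-up column names (and UNION ALL, since a given row only ever lives in one of the two tables), could be:
--===== Both tables share the same structure; the columns are placeholders.
CREATE TABLE dbo.TABLE_FLAG_1
       (ID INT NOT NULL PRIMARY KEY, Flag TINYINT NOT NULL, SomeData VARCHAR(100))
CREATE TABLE dbo.TABLE_FLAG_2_10
       (ID INT NOT NULL PRIMARY KEY, Flag TINYINT NOT NULL, SomeData VARCHAR(100))
GO
--===== The application reads and writes through the view, so the daily archive
     -- job only ever has to touch TABLE_FLAG_1.
CREATE VIEW dbo.YOURVIEW
AS
 SELECT ID, Flag, SomeData FROM dbo.TABLE_FLAG_1
  UNION ALL
 SELECT ID, Flag, SomeData FROM dbo.TABLE_FLAG_2_10
GO
--===== Step 5, once a day: copy TABLE_FLAG_1 to the archive database, then...
TRUNCATE TABLE dbo.TABLE_FLAG_1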
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions in a test, non-production environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
August 2, 2008 at 11:26 am
The loop idea isn't actually a bad one (I refer to it as a "DELETE CRAWLER"), but deletes, by themselves, run pretty quickly depending on the criteria. I'd limit the deletes to 25k rows (about a second of duration) at a time and leave 5 or 10 seconds between deletes for the system to do other things.
The key to avoiding deadlocks with some effectiveness IS, in fact, the isolation level being used... but it's not what most folks would expect... lock the whole table with TABLOCKX for the delete... the other processes will simply wait instead of deadlocking.
Here's an example of the delete crawler I use... it comes complete with a "test data generator" so that you can "play" to see what works best on your box...
--===== If the test table exists, drop it
IF OBJECT_ID('dbo.JBMTestDetail','U') IS NOT NULL
DROP TABLE dbo.JBMTestDetail
GO
--===== Create and populate a 1,000,000 row test table.
-- Column "ConnID" has a range of 1 to 100,000 non-unique numbers stored as VARCHAR(30)
-- Column "Key1" has a range of 1 to 100,000 non-unique numbers stored as VARCHAR(30)
-- Column "SomeMoney has a range of 0.0000 to 99.9999 non-unique numbers
-- Column "Time_Stamp" has a range of >=01/01/2005 and <01/01/2015 non-unique date/times
-- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10'
-- for all rows.
-- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)
SELECT TOP 1000000
ConnID = ISNULL(CAST(ABS(CHECKSUM(NEWID()))%100000+1 AS VARCHAR(30)),''), --(10 rows per connection)
Key1 = ISNULL(CAST(ABS(CHECKSUM(NEWID()))%100000+1 AS VARCHAR(20)),''), --just to test index with
SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65),
SomeCSV = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),
SomeMoney = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),
Time_Stamp = ISNULL(CAST(RAND(CHECKSUM(NEWID()))*3652.0+38351.0 AS DATETIME),0),
SomeHex12 = RIGHT(NEWID(),12)
INTO dbo.JBMTestDetail
FROM Master.dbo.SysColumns t1,
Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN
--===== Create indexes similar to Troy's
CREATE CLUSTERED INDEX IXC_JBMTestDetail_Time_Stamp_ConnID ON dbo.JBMTestDetail (Time_Stamp,ConnID)
CREATE NONCLUSTERED INDEX IX_JBMTestDetail_ConnID_Key1 ON dbo.JBMTestDetail (ConnID,Key1)
GO
--===== Setup to measure performance...
SET STATISTICS TIME ON
--===== Define the cutoff date with a time of "midnight" or, if you will,
-- define the cutoff date with no time so we only delete whole days.
DECLARE @CutoffDate DATETIME
SELECT @CutoffDate = DATEADD(dd,DATEDIFF(dd,0,GETDATE()),-94)
--===== Limit all further queries, including deletes, to 25,000 rows
-- (about 1 second worth of deletes, like I said before)
SET ROWCOUNT 25000
--===== See if any rows qualify for deletion. If even just one exists,
-- then there's work to do and @@ROWCOUNT will be > 0.
-- DO NOT PUT ANY CODE BETWEEN THIS SELECT AND THE WHILE LOOP OR
-- YOU'LL MESS UP THE ROWCOUNT FOR THE WHILE LOOP
SELECT TOP 1 1 FROM dbo.JBMTestDetail WHERE Time_Stamp < @CutoffDate
--===== If the rowcount from the above is greater than 0,
-- then delete 25,000 rows at a time until there's nothing
-- left to delete
WHILE @@ROWCOUNT > 0
BEGIN
--===== Just a "marker" to separate the loop in the output
PRINT REPLICATE('=',78)
--===== This delay gives other processes breathing room
WAITFOR DELAY '00:00:10'
--===== Do the delete. Will be limited by the SET ROWCOUNT above.
-- DO NOT PUT ANY CODE BETWEEN THIS DELETE AND THE "END" OR
-- YOU'LL MESS UP THE ROWCOUNT FOR THE WHILE LOOP.
DELETE dbo.JBMTestDetail WITH (TABLOCKX)
WHERE Time_Stamp < @CutoffDate
END
--===== Restore the ability to process more than 25,000 rows
SET ROWCOUNT 0
SELECT 1000000 - COUNT(*) FROM dbo.JBMTestDetail
--Jeff Moden
Change is inevitable... Change for the better is not.
August 4, 2008 at 4:23 am
Hi Jeff
This is a nice way, but I can't block the table for that long, since the table is accessed by the application via many threads. Once the table is blocked or not available, the threads wait for a certain time, causing a delay in processing (the application is designed so that it waits for some time and retries again). I am thinking of moving a small amount of data every half hour into the other table and thus reducing the load on the database. I hope this works. Is there any other way????
Thanks for all your posts,:)
August 4, 2008 at 4:46 am
Why don't you use the table NOLOCK option?
It's not applicable to the delete itself, but you can use it in the application procs when you are selecting records (the select side).
I think that avoids the table lock.
August 4, 2008 at 5:33 am
CrazyMan (8/4/2008)
Hi Jeff
This is a nice way, but I can't block the table for that long, since the table is accessed by the application via many threads. Once the table is blocked or not available, the threads wait for a certain time, causing a delay in processing (the application is designed so that it waits for some time and retries again). I am thinking of moving a small amount of data every half hour into the other table and thus reducing the load on the database. I hope this works. Is there any other way????
Thanks for all your posts,:)
You can change the length of the deletes to a much smaller number of rows per delete so the lock doesn't take as long.
Of course, if you did it during a quiet time, it should only take 16-30 seconds to delete all 400k rows in a single lock.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 4, 2008 at 6:12 am
Thanks Jeff
It's supposed to take only a few seconds, but since the table is continuously accessed by applications, it is taking me 3 minutes to delete during a quiet time. I tried during a busy time and it took me 5 minutes to delete 24K rows, and the table was locked for around 2 minutes. I used the isolation level (READ UNCOMMITTED) as well and it still blocks, so I can't use this even to delete a smaller number of rows. I am about to try the loop as you recommended, after 30 minutes, with the same number of rows; the application gets busy in the late afternoon.
Cheers
🙂
August 4, 2008 at 6:14 am
Hi Jaykay
I am using NOLOCK to access the data, storing it in a temp table, indexing the temp table the same as the master table, and then deleting rows from the master table by joining to it.
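For reference, a rough sketch of that approach (table, key, and flag names are just placeholders) would be:
--===== A rough sketch, assuming dbo.MasterTable is keyed on an INT column PK
     -- and has a Flag column; all names here are placeholders.
 SELECT PK
   INTO #ToMove
   FROM dbo.MasterTable WITH (NOLOCK) --dirty read: rows may still change before the delete
  WHERE Flag = 1

--===== Index the temp table the same way the key is indexed on the master table.
CREATE CLUSTERED INDEX IXC_ToMove_PK ON #ToMove (PK)

--===== Delete from the master table by joining back to the captured keys.
 DELETE mt
   FROM dbo.MasterTable mt
   JOIN #ToMove t
     ON t.PK = mt.PK

DROP TABLE #ToMove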
🙂