October 22, 2018 at 8:53 am
I'm working on a system which stores what you might call "temporary working data" in a number of database tables. To manage concurrent calculations, these all have a column to record the @@SPID of the connection - this is used to ensure that each connection only uses its own data.
At various points in the calculation (specifically, at the very start before anything's done, and at the end after everything's done), we want to clear a particular table of any and all data for a specific @@SPID. At the moment, we're doing this:
If Exists (Select 1 From dbo.TTemp_MaintainActuals With (NoLock) Where (SPID = @@SPID))
Delete XXX
From dbo.TTemp_MaintainActuals XXX With (ForceSeek)
Where (XXX.SPID = @@SPID)
That works just fine but with an eye on performance, and bearing in mind that these tables could contain a reasonably large (although not massive) number of rows, I'd like to do a Truncate instead of a Delete where possible. The "where possible" bit obviously means where the table doesn't contain any rows from other connections.
To make this as multi-thread safe / bulletproof as possible, I'm thinking of the following sort of approach:
Declare @Truncate Bit = 1
If (@@TranCount = 0)
Set @Truncate = 0
Else
Begin
Set Lock_Timeout 0
Begin Try
If Exists (Select 1 From dbo.TTemp_MaintainActuals With (TabLock, HoldLock) Where (SPID <> @@SPID))
Set @Truncate = 0
End Try
Begin Catch
Set @Truncate = 0
End Catch
End
If (@Truncate = 0)
Begin
If Exists (Select 1 From dbo.TTemp_MaintainActuals With (NoLock) Where (SPID = @@SPID))
Delete XXX
From dbo.TTemp_MaintainActuals XXX With (ForceSeek)
Where (XXX.SPID = @@SPID)
End
Else
Truncate Table dbo.TTemp_MaintainActuals
The aim of the "If Exists (Select 1 From dbo.TTemp_MaintainActuals With (TabLock, HoldLock) Where (SPID <> @@SPID))" is twofold:
1) See if there are any rows belonging to other connections.
2) Get a lock on the table for the duration of the proposed truncation.
The problem is that at the end of the code, the connection still has a lock on the whole table, which is only relinquished when its transaction is committed or rolled back. And that is a problem for other connections wanting to make use of it.
I can't find any other way of releasing the lock - does anybody know of one?
October 22, 2018 at 9:14 am
May I ask what may be a dumb question? Why aren't each of the processes using their own temporary table so that you aren't sharing a table among multiple processes for these particular calculations?
October 22, 2018 at 10:04 am
Excellent question! And one we have considered. We found that creating temporary tables for each process was significantly slower than using permanent ones.
October 22, 2018 at 10:09 am
julian.fletcher - Monday, October 22, 2018 10:04 AMExcellent question! And one we have considered. We found that creating temporary tables for each process was significantly slower than using permanent ones.
Care to expand on this? How were the tables being created? Were you creating # temporary tables or unique permanent tables for each process?
October 22, 2018 at 10:23 am
We were making # tables. We have considered making permanent tables for each process (e.g. TTemp_MaintainActuals1, TTemp_MaintainActuals2, TTemp_MaintainActuals3, TTemp_MaintainActuals4 ... TTemp_MaintainActuals32767) but that would require quite a few new problems to be resolved.
October 22, 2018 at 11:21 am
Just delete the rows. If it's not that many rows, it's more work than it's worth to try to TRUNCATE.
I suspect (hope) the temp table is clustered first on SPID. If it's not, do that. But other than that, you should be OK.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
October 22, 2018 at 11:52 am
ScottPletcher - Monday, October 22, 2018 11:21 AMJust delete the rows. If it's not that many rows, it's more work than it's worth to try to TRUNCATE.I suspect (hope) the temp table is clustered first on SPID. If it's not, do that. But other than that, you should be OK.
Ah, be careful now... clustering on the SPID will create silos that are inserted into "out of order" which will cause massive page splits and huge amounts of log file usage even in the SIMPLE recovery model unless you can convert it to SPID followed by some form of ever-increasing column. Even if you have that, if you also have "Expansive" updates, insert and update performance is going to tank compared to what could be and log file activity will rise to about 43 times as much as it needs to be. You'll also end up with incredibly low page density within the silos, which is where most of the work will be done which also wastes a ton of memory on mostly emptied pages.
I can't upload an image from where I'm at right now. I'll do it when I get home but you'll be shocked at what such an index looks like under such conditions.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 22, 2018 at 12:28 pm
julian.fletcher - Monday, October 22, 2018 10:23 AMWe were making # tables. We have considered making permanent tables for each process (e.g. TTemp_MaintainActuals1, TTemp_MaintainActuals2, TTemp_MaintainActuals3, TTemp_MaintainActuals4 ... TTemp_MaintainActuals32767) but that would require quite a few new problems to be resolved.
So why was the creating of the # tables slow?
October 22, 2018 at 12:48 pm
julian.fletcher - Monday, October 22, 2018 8:53 AMI'm working on a system which stores what you might call "temporary working data" in a number of database tables. To manage concurrent calculations, these all have a column to record the @@SPID of the connection - this is used to ensure that each connection only uses its own data.
At various points in the calculation (specifically, at the very start before anything's done, and at the end after everything's done), we want to clear a particular table of any and all data for a specific @@SPID. At the moment, we're doing this:
If Exists (Select 1 From dbo.TTemp_MaintainActuals With (NoLock) Where (SPID = @@SPID))
Delete XXX
From dbo.TTemp_MaintainActuals XXX With (ForceSeek)
Where (XXX.SPID = @@SPID)That works just fine but with an eye on performance, and bearing in mind that these tables could contain a reasonably large (although not massive) number of rows, I'd like to do a Truncate instead of a Delete where possible. The "where possible" bit obviously means where the table doesn't contain any rows from other connections.
To make this as multi-thread safe / bulletproof as possible, I'm thinking of the following sort of approach:
Declare @Truncate Bit = 1If (@@TranCount = 0)
Set @Truncate = 0
Else
Begin
Set Lock_Timeout 0Begin Try
If Exists (Select 1 From dbo.TTemp_MaintainActuals With (TabLock, HoldLock) Where (SPID <> @@SPID))
Set @Truncate = 0
End Try
Begin Catch
Set @Truncate = 0
End Catch
EndIf (@Truncate = 0)
Begin
If Exists (Select 1 From dbo.TTemp_MaintainActuals With (NoLock) Where (SPID = @@SPID))
Delete XXX
From dbo.TTemp_MaintainActuals XXX With (ForceSeek)
Where (XXX.SPID = @@SPID)
End
Else
Truncate Table dbo.TTemp_MaintainActualsThe aim of the "If Exists (Select 1 From dbo.TTemp_MaintainActuals With (TabLock, HoldLock) Where (SPID <> @@SPID))" is twofold:
1) See if there are any rows belonging to other connections.
2) Get a lock on the table for the duration of the proposed truncation.The problem is that at the end of the code, the connection still has a lock on the whole table, which is only relinquished when its transaction is committed or rolled back. And that is a problem for other connections wanting to make use of it.
I can't find any other way of releasing the lock - does anybody know of one?
you can use the TOP operator and delete by smaller chunks, something like this:DECLARE @batchSize INT
DECLARE @results INT
SET @results = 1 --stores the row count after each successful batch
SET @batchSize = 1000
WHILE (@results > 0)
BEGIN
Delete top(@batchSize)
From dbo.TTemp_MaintainActuals XXX With (ForceSeek)
Where (XXX.SPID = @@SPID)
-- very important to obtain the latest rowcountrowcount to avoid infinite loops
SET @results = @@ROWCOUNT
END
Of course you have to test it first, but that could be the general idea, with this method, you can reduce table locks, and reduce log usage. If indexes are good enough also the delete time will be reduced...
MCSE, MCSA SQL Server Database Developer/Administrator
October 22, 2018 at 1:43 pm
Jeff Moden - Monday, October 22, 2018 11:52 AMScottPletcher - Monday, October 22, 2018 11:21 AMJust delete the rows. If it's not that many rows, it's more work than it's worth to try to TRUNCATE.I suspect (hope) the temp table is clustered first on SPID. If it's not, do that. But other than that, you should be OK.
Ah, be careful now... clustering on the SPID will create silos that are inserted into "out of order" which will cause massive page splits and huge amounts of log file usage even in the SIMPLE recovery model unless you can convert it to SPID followed by some form of ever-increasing column. Even if you have that, if you also have "Expansive" updates, insert and update performance is going to tank compared to what could be and log file activity will rise to about 43 times as much as it needs to be. You'll also end up with incredibly low page density within the silos, which is where most of the work will be done which also wastes a ton of memory on mostly emptied pages.
I can't upload an image from where I'm at right now. I'll do it when I get home but you'll be shocked at what such an index looks like under such conditions.
Presumably not, if the SPIDs are deleted relatively quickly from the work table. If it somehow does become an issue, just inflate the row so that only 1 or 2 fit per page. That should address the problem too. It's still less eternal maintenance than trying to figure out when you can TRUNC and when you can't.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
October 22, 2018 at 1:45 pm
Lynn Pettis - Monday, October 22, 2018 12:28 PMjulian.fletcher - Monday, October 22, 2018 10:23 AMWe were making # tables. We have considered making permanent tables for each process (e.g. TTemp_MaintainActuals1, TTemp_MaintainActuals2, TTemp_MaintainActuals3, TTemp_MaintainActuals4 ... TTemp_MaintainActuals32767) but that would require quite a few new problems to be resolved.So why was the creating of the # tables slow?
Perhaps because they are doing SELECT ... INTO #... a lot. That seems to create internal locks on the sys tables. I thought MS had claimed to have lessened the issues with this, but we're on SQL 2016 and we still see it in legacy code where we haven't re-worked the SELECT INTOs.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
October 22, 2018 at 1:51 pm
Well, with a permanent table already created they aren't using SELECT ... INTO #<table>. Why not create the #<table> then insert the data into it?
October 22, 2018 at 2:05 pm
Lynn Pettis - Monday, October 22, 2018 1:51 PMWell, with a permanent table already created they aren't using SELECT ... INTO #<table>. Why not create the #<table> then insert the data into it?
When I said "perhaps because" I meant "maybe" they are using SELECT ... INTO, I don't know for sure. But I know that can cause delays, sometimes severe delays, in using metadata, including table creation and deletion, and even in tempdb. It's convenient to code, and a lot of people don't realize the potential issues they can see from it.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
October 22, 2018 at 2:06 pm
Whatever, dude.
October 22, 2018 at 3:48 pm
ScottPletcher - Monday, October 22, 2018 2:05 PMLynn Pettis - Monday, October 22, 2018 1:51 PMWell, with a permanent table already created they aren't using SELECT ... INTO #<table>. Why not create the #<table> then insert the data into it?When I said "perhaps because" I meant "maybe" they are using SELECT ... INTO, I don't know for sure. But I know that can cause delays, sometimes severe delays, in using metadata, including table creation and deletion, and even in tempdb. It's convenient to code, and a lot of people don't realize the potential issues they can see from it.
Do you have any demonstrable/repeatable examples? I've previously tested overlapping SELECT/INTOs and other ways of creating tables that started after the SELECT/INTO and finished before the SELECT/INTO. The only thing I've seen it be a PITA on is things like the Explorer window in SSMS. And, no... not saying it doesn't happen. I've just not seen it (interference/delays with table creation... not stuff in EM or SSMS) happen since SQL Server 6.5 RTM before the hot fix that kept it from crippling TempDB.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 21 total)
You must be logged in to reply to this topic. Login to reply