August 5, 2014 at 4:09 am
Hi,
Running this query on one of our servers returned some scary numbers.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECTCOUNT(*) ConvCount, S.Name, CEP.is_initiator, CEP.[State], CEP.[state_desc], CEP.far_service
FROMsys.Conversation_EndPoints CEP
INNER JOIN sys.services S
ON CEP.service_id = S.service_id
GROUP BY S.Name, CEP.is_initiator, CEP.[State], CEP.[state_desc], CEP.far_service
Results:
ConvCountNameis_initiatorStatestate_descfar_service
120,189InitiatorService1COCONVERSINGTargetService
120,189TargetService0COCONVERSINGInitiatorService
31,153,305InitiatorService1DIDISCONNECTED_INBOUNDTargetService
1InitiatorAndTargetService1COCONVERSINGInitiatorAndTargetService
1InitiatorAndTargetService0COCONVERSINGInitiatorAndTargetService
29,385InitiatorAndTargetService1DIDISCONNECTED_INBOUNDInitiatorAndTargetService
All services are in the same database. The activation procedure on the InitiatorService service has been altered to end conversations properly.
InitiatorAndTargetService uses a dialog pool base on this MSDN Blog.
To clean up I was going to end all the conversation like this:
DECLARE@sqlVarchar(8000) = '.',
@LoopCountSmallInt = 0
BEGIN TRANSACTION
WHILE LEN(@SQL) > 0 AND @LoopCount < 100
BEGIN
SET @sql = ''
SELECTTOP 115
@sql = @sql + 'END CONVERSATION ''' + CAST([conversation_handle] AS Varchar(40)) +''' WITH CLEANUP' + CHAR(13)
FROMsys.conversation_endpoints WITH (NOLOCK)
WHEREfar_service = 'RemoteTargetService'
AND[state] = 'DI'
EXEC(@SQL)
SET @LoopCount = @LoopCount + 1
END
COMMIT TRANSACTION
Code takes 15 seconds to run. This means to around 11 hours to clean it all up.
So my only practical option is:
ALTER DATABASE MyReallyImportantDatabase SET NEW_BROKER
The queues are filled via a C# application and a SQL Agent Job.
What can I do to mitigate data loss? Something like:
1) Pause C# app and Jobs,
2) Wait for existing conversations to end
3) End any stubborn conversations
4) Alter database
5) Unpause C# app and Jobs
I am not sure if the C# application can be paused and, if it can, what the effects will be. So alternative idea's are welcome.
There are no routes on this server.
Remus Rusanu[/url] answered a similar question here.
Can anyone second his claim "If everything you have is local, in the same db, then you have basically nothing to worry about."?
What kind of Event Notifications issues can I expect?
Cheers
September 9, 2014 at 8:03 am
I used ALTER DATABASE <DBName> NEW_BROKER WITH ROLLBACK IMMEDIATE
to clean up a couple of databases with 10's millions open conversations with state "DI" (Disconnected Inbound).
The biggest one only took 116 ms and the next biggest 1203 ms.
When I tried the same on another database with a few million open conversations, I had to cancel after a 60+ secs.
Although I can't be positive of this, system SPID 25 "BRKR TASK" started a long blocking chain due to the rollback.
Only by seeing changes in sys.sysProcesses.waitResource for SPID 25, could I be sure that the SPID was actually doing anything.
(Service restart narrowly avoided.)
Also these huge amounts of "Disconnected" conversations caused the TempDBs to bloat.
Looking at the properties of TempDB, all looked fine.
Size : 59356.31 MB
Space Available : 59089.55 MB
Although it looks like there is 267 MB in use, I could not shrink the file
The space available is incorrect as it is still reserved by Internal Objects.
As these results from Michael Valentine Jones's query shows.
VersionStore%UserObject%InternalObject%FreeSpace%TotalPagesSizeMBs
0.0030.00996.5223.4666051059259092.38
After the blocking chain issue, I have decided to follow Jacob Sebastians's advice and clean up RBAR style.
Remus Rusanu[/url] provides some great info on how to prevent conversation leaks.
If you need to use NEW_BROKER, try to stop the queue's from being filled (Blocking a firewall port worked for us on one of the queues) then wait for them to empty
or at least do this when the service is as quiet as possible and you have time to rollback if needed.
If multiple databases need doing, take your time. Do them one by one and check if the coast is clear before moving on to the next database.
September 11, 2014 at 8:19 am
I ended up using the code below to end the conversations.
Cicra 3.5 million conversations per 30 mins.
You might have to rinse and repeat to deal with all the conversations.
SET NOCOUNT ON
IF OBJECT_ID('TempDB..#Convs') IS NOT NULL
DROP TABLE #Convs
CREATE TABLE #Convs
(
ConvIDInt IDENTITY PRIMARY KEY,
ConvHndlUniqueIdentifier,
ProcessedBit
)
INSERT#Convs
SELECT[Conversation_Handle], 0
FROMsys.conversation_endpoints CE WITH (NOLOCK)
WHERE[State] IN ('DI', 'DO')
DECLARE@sqlVarchar(8000) ='.',
@LoopCountSmallInt = 0,
@MaxLoopsSmallInt = 10000,
@LoopTimerDateTime,
@StartRowInt
WHILE EXISTS
(
SELECTTOP 1 1
FROM#Convs
WHEREProcessed = 0
)
BEGIN
SET @StartRow = (SELECT ISNULL(MAX(ConvID), 0) FROM #Convs WHERE Processed = 1)
SET @LoopTimer = GETDATE()
SET @LoopCount = 0
BEGIN TRANSACTION
WHILE LEN(@SQL) > 0 AND @LoopCount < @MaxLoops
BEGIN
SET @LoopCount = @LoopCount + 1
SET @sql = ''
SELECT@sql = @sql + 'END CONVERSATION ''' + CAST(ConvHndl AS Varchar(40)) +''' WITH CLEANUP' + CHAR(13)
FROM#Convs
WHEREProcessed = 0
AND ConvID BETWEEN @StartRow AND @StartRow + 100 -- 100 rows at a time. 100 * @MaxLoops = commit size
BEGIN TRY
EXEC(@SQL)
UPDATE#Convs
SETProcessed = 1
WHEREConvID BETWEEN @StartRow AND @StartRow + 100
END TRY
BEGIN CATCH
PRINT ERROR_MESSAGE()
END CATCH
-- Doesn't matter if there are errors or not, continue anyway.
SET @StartRow = @StartRow + 100
END
COMMIT TRANSACTION
-- For our viewing pleasure and to get an idea of how long it will take to complete.
PRINT 'Cicra ' + CAST((@LoopCount * 100) AS Varchar(7)) + ' conversations ended in ' + CAST(DATEDIFF(S, @LoopTimer, GETDATE()) AS Varchar(10)) + ' seconds.'
END
September 11, 2014 at 8:52 am
Still dealing with the bloated TempDBs : http://www.sqlservercentral.com/Forums/Topic1607443-1550-1.aspx?Update=1
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply