Serivce Broker 31+ million Disconnected_Inbound conversations

Question

Serivce Broker 31+ million Disconnected_Inbound conversations

DennisPost

SSCrazy

Points: 2691
More actions
August 5, 2014 at 4:09 am

#294013

Hi,
Running this query on one of our servers returned some scary numbers.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECTCOUNT(*) ConvCount, S.Name, CEP.is_initiator, CEP.[State], CEP.[state_desc], CEP.far_service
FROMsys.Conversation_EndPoints CEP
INNER JOIN sys.services S
ON CEP.service_id = S.service_id
GROUP BY S.Name, CEP.is_initiator, CEP.[State], CEP.[state_desc], CEP.far_service
Results:
ConvCountNameis_initiatorStatestate_descfar_service
120,189InitiatorService1COCONVERSINGTargetService
120,189TargetService0COCONVERSINGInitiatorService
31,153,305InitiatorService1DIDISCONNECTED_INBOUNDTargetService
1InitiatorAndTargetService1COCONVERSINGInitiatorAndTargetService
1InitiatorAndTargetService0COCONVERSINGInitiatorAndTargetService
29,385InitiatorAndTargetService1DIDISCONNECTED_INBOUNDInitiatorAndTargetService
All services are in the same database. The activation procedure on the InitiatorService service has been altered to end conversations properly.
InitiatorAndTargetService uses a dialog pool base on this MSDN Blog.
To clean up I was going to end all the conversation like this:
DECLARE@sqlVarchar(8000) = '.',
@LoopCountSmallInt = 0
BEGIN TRANSACTION
WHILE LEN(@SQL) > 0 AND @LoopCount < 100
BEGIN
SET @sql = ''
SELECTTOP 115
@sql = @sql + 'END CONVERSATION ''' + CAST([conversation_handle] AS Varchar(40)) +''' WITH CLEANUP' + CHAR(13)
FROMsys.conversation_endpoints WITH (NOLOCK)
WHEREfar_service = 'RemoteTargetService'
AND[state] = 'DI'
EXEC(@SQL)
SET @LoopCount = @LoopCount + 1
END
COMMIT TRANSACTION
Code takes 15 seconds to run. This means to around 11 hours to clean it all up.
So my only practical option is:
ALTER DATABASE MyReallyImportantDatabase SET NEW_BROKER
The queues are filled via a C# application and a SQL Agent Job.
What can I do to mitigate data loss? Something like:
1) Pause C# app and Jobs,
2) Wait for existing conversations to end
3) End any stubborn conversations
4) Alter database
5) Unpause C# app and Jobs
I am not sure if the C# application can be paused and, if it can, what the effects will be. So alternative idea's are welcome.
There are no routes on this server.
Remus Rusanu[/url] answered a similar question here.
Can anyone second his claim "If everything you have is local, in the same db, then you have basically nothing to worry about."?
What kind of Event Notifications issues can I expect?
Cheers

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply

DennisPost SSCrazy Points: 2691 More actions · Answer 1

I used ALTER DATABASE <DBName> NEW_BROKER WITH ROLLBACK IMMEDIATE to clean up a couple of databases with 10's millions open conversations with state "DI" (Disconnected Inbound).

The biggest one only took 116 ms and the next biggest 1203 ms.

When I tried the same on another database with a few million open conversations, I had to cancel after a 60+ secs.

Although I can't be positive of this, system SPID 25 "BRKR TASK" started a long blocking chain due to the rollback.

Only by seeing changes in sys.sysProcesses.waitResource for SPID 25, could I be sure that the SPID was actually doing anything.

(Service restart narrowly avoided.)

Also these huge amounts of "Disconnected" conversations caused the TempDBs to bloat.

Looking at the properties of TempDB, all looked fine.

Size : 59356.31 MB

Space Available : 59089.55 MB

Although it looks like there is 267 MB in use, I could not shrink the file

The space available is incorrect as it is still reserved by Internal Objects.

As these results from Michael Valentine Jones's query shows.

VersionStore%UserObject%InternalObject%FreeSpace%TotalPagesSizeMBs

0.0030.00996.5223.4666051059259092.38

After the blocking chain issue, I have decided to follow Jacob Sebastians's advice and clean up RBAR style.

Remus Rusanu[/url] provides some great info on how to prevent conversation leaks.

If you need to use NEW_BROKER, try to stop the queue's from being filled (Blocking a firewall port worked for us on one of the queues) then wait for them to empty

or at least do this when the service is as quiet as possible and you have time to rollback if needed.

If multiple databases need doing, take your time. Do them one by one and check if the coast is clear before moving on to the next database.

DennisPost SSCrazy Points: 2691 More actions · Answer 2

I ended up using the code below to end the conversations.

Cicra 3.5 million conversations per 30 mins.

You might have to rinse and repeat to deal with all the conversations.

SET NOCOUNT ON

IF OBJECT_ID('TempDB..#Convs') IS NOT NULL

DROP TABLE #Convs

CREATE TABLE #Convs

(

ConvIDInt IDENTITY PRIMARY KEY,

ConvHndlUniqueIdentifier,

ProcessedBit

)

INSERT#Convs

SELECT[Conversation_Handle], 0

FROMsys.conversation_endpoints CE WITH (NOLOCK)

WHERE[State] IN ('DI', 'DO')

DECLARE@sqlVarchar(8000) ='.',

@LoopCountSmallInt = 0,

@MaxLoopsSmallInt = 10000,

@LoopTimerDateTime,

@StartRowInt

WHILE EXISTS

(

SELECTTOP 1 1

FROM#Convs

WHEREProcessed = 0

)

BEGIN

SET @StartRow = (SELECT ISNULL(MAX(ConvID), 0) FROM #Convs WHERE Processed = 1)

SET @LoopTimer = GETDATE()

SET @LoopCount = 0

BEGIN TRANSACTION

WHILE LEN(@SQL) > 0 AND @LoopCount < @MaxLoops

BEGIN

SET @LoopCount = @LoopCount + 1

SET @sql = ''

SELECT@sql = @sql + 'END CONVERSATION ''' + CAST(ConvHndl AS Varchar(40)) +''' WITH CLEANUP' + CHAR(13)

FROM#Convs

WHEREProcessed = 0

AND ConvID BETWEEN @StartRow AND @StartRow + 100 -- 100 rows at a time. 100 * @MaxLoops = commit size

BEGIN TRY

EXEC(@SQL)

UPDATE#Convs

SETProcessed = 1

WHEREConvID BETWEEN @StartRow AND @StartRow + 100

END TRY

BEGIN CATCH

PRINT ERROR_MESSAGE()

END CATCH

-- Doesn't matter if there are errors or not, continue anyway.

SET @StartRow = @StartRow + 100

END

COMMIT TRANSACTION

-- For our viewing pleasure and to get an idea of how long it will take to complete.

PRINT 'Cicra ' + CAST((@LoopCount * 100) AS Varchar(7)) + ' conversations ended in ' + CAST(DATEDIFF(S, @LoopTimer, GETDATE()) AS Varchar(10)) + ' seconds.'

END

DennisPost SSCrazy Points: 2691 More actions · Answer 3

Still dealing with the bloated TempDBs : http://www.sqlservercentral.com/Forums/Topic1607443-1550-1.aspx?Update=1