January 24, 2014 at 9:40 am
*** UPDATE - Issue identified and probably solved. See my next post below. ***
Hi all,
One of our front-end developers brought me a situation that is baffling me. I'll try to explain briefly. He has written an app where one thread calls a stored proc that assigns a value to a uniqueidentifier variable; inserts into a table a row consisting of that variable value, some parameter values, and a default NULL in a datetime column called ScheduledOn; and selects the uniqueidentifier variable to return that value to the app. This thread then triggers an event, and a second thread, upon seeing that event, executes a stored proc that updates the same table, setting the ScheduledOn column to the current date/time where that column is null, and via an output clause, returns the values from each updated row.
Here's the issue - when this app runs, the second thread that calls the update-output proc doesn't return any rows, even though the first thread just inserted one that meets the WHERE clause conditions. If that second thread calls the update-output proc a second later, it returns the row inserted by the first thread. Each thread is running on a separate connection, so each proc runs on a separate SPID, and nothing else is touching this table at the moment.
In summary:
Thread 1 executes the insert app and reads the returned value.
Thread 1 triggers an event.
Thread 2 sees the event and runs the update-output proc, but gets no rows even though Thread 1 just inserted one.
Thread 2 runs the update-output proc a second or so later and gets the row that Thread 1 inserted.
When I execute the procs as part of the same batch on the same connection/session in SSMS, they perform as expected - the first proc inserts the row and returns the uniqueidentifier variable value and the second proc updates that row and outputs the row values.
The table has no triggers, one clustered index, and one nonclustered index. The cached execution plans for the stored procs show that the insert proc performs a clustered index insert and that the update-output proc performs one clustered index scan and one clustered index seek. I can post DDL and execution plans if anyone would like to see them.
The server runs at the READ_COMMITTED_SNAPSHOT isolation level. Here's the server version info:
Microsoft SQL Server 2008 (SP3) - 10.0.5835.0 (X64)
Mar 1 2013 22:17:38
Copyright (c) 1988-2008 Microsoft Corporation
Developer Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)
I have run an extended events session to capture the rpc_starting, rpc_completed, sql_statement_starting, sql_statement_completed, sp_statement_starting, and sp_statement_completed events for this developer's nt_username (he's working from his local machine and connects with his domain login). I see what I would expect -
An rpc_starting event for session_id 1 at 00:00:00.001,
a series of sp_statement_starting and sp_statement_completed event pairs for session_id 1 for object_id 1 (the insert proc) at 00:00:00.001,
an rpc_completed event for session_id 1 at 00:00:00.001,
an rpc_starting event for session_id 2 at 00:00:00.100,
a series of sp_statement_starting and sp_statement_completed event pairs for session_id 2 for object_id 2 (the UxPDATE-OUTPUT proc) at 00:00:00.100,
an rpc_completed event for session_id 2 at 00:00:00.100,
in that order. Each proc generally runs in under a millisecond, and there's usually about 100-250 milliseconds between the rpc_completed event for the first proc and the rpc_starting event for the second proc. I can post the XML output of the XE session, too, if anyone wants to see that.
The fact that the update-output proc doesn't return the row just inserted by the insert proc when run immediately after but does return it if run a second later, if in fact that is happening, seems like a major ACID violation.
Questions:
1. Why wouldn't the update-output proc return the row inserted by the insert proc a fraction of a second before?
2. Am I interpreting the XE session output correctly in concluding that session_id 1 completed and committed the insert before session_id 2 initiated the update-output operation?
2. Are there any other metrics or data I could collect or review that would help me understand what's happening here?
Thanks for taking the time to read this and for any help you might provide!
Jason Wolfkill
January 24, 2014 at 12:21 pm
RPC calls are not the same as actual TSQL statements or procedures run directly against SQL server. They run inside of of the RPC server process, so just because thread1 returned from the RPC call does not mean that the actual SQL procedure or statement has completed on SQL server. You can see you got an rpc completed evet at 00:00:00.001, and the statement completed at 00:00:00.100.
I would like to see the programmer's code. Sounds like he is using ActiveX or OLEDb interface instead Native client or .NET SQL provider. As an afterthought - you said the first procedure does a select to return the uniqueidentifier to the client, so is that value actually being obtained by thread1 before he generates the event to wake up the second thread?
Finally, whatever he is trying to do sounds like it might be a cleaner implementation using service broker instead of trying to send events between different threads.
The probability of survival is inversely proportional to the angle of arrival.
January 24, 2014 at 1:46 pm
Well, problem identified and solved. I re-ran the XE session to catch the database_transaction_begin and database_transaction_end events. It turns out that even though the insert statements were completing before the update-output statements began, the transaction for the update-output proc was beginning before the transaction for the insert ended, which is consistent with the results we were observing. Since the DB is running at the READ_COMMITTED_SNAPSHOT isolation level, the update-output proc would not see the inserted row.
Explaining my proposed solution to this issue requires me to confess that I didn't fully explain the update-output proc. In that proc, there's a CTE that does some row numbering and other logic to identify which rows will be updated, and that CTE is joined back to the table in the UPDATE statement. If I understand correctly, adding the READCOMMITTEDLOCK table hint to the table reference in the CTE will require it to obtain shared locks on rows that it reads, forcing it to wait until the insert transaction releases its exclusive locks before it can complete. If I'm right about that, the update-output proc will not complete until it can read the rows inserted by the insert proc, meaning that the unexpected behavior will not occur. Any thoughts or suggestions about that?
BTW, @sturner, I verified that the developer is using the .NET SQL provider - his quote was "I may be masochistic, but I'm not crazy."
Jason Wolfkill
January 24, 2014 at 2:02 pm
Its obvious that the event is being sent to the second thread before the transaction (first insert) has committed, regardless of what is going on in your second procedure.
I will just reiterate my final comment: Service Broker is a better solution for this situation, ie close coupled transaction synchronization between different threads. The scenario would be that thread1 would do its insert then write the uniqueidentifer result to a broker queue that thread2 (or multiple threads) are waiting on - it wakes up when the message is written and so it can proceed to do its further processing. This eliminates the need for thread1 to send events to other threads and efficiently puts thread2 (and multiple threads) asleep when there is nothing to do.
The probability of survival is inversely proportional to the angle of arrival.
January 24, 2014 at 2:30 pm
sturner (1/24/2014)
Its obvious that the event is being sent to the second thread before the transaction (first insert) has committed, regardless of what is going on in your second procedure.
Right - I verified that although the first proc's statements were completed before the second proc's statements executed, the first proc's *transaction* hadn't ended before the second proc's transaction began.
sturner (1/24/2014)
I will just reiterate my final comment: Service Broker is a better solution for this situation, ie close coupled transaction synchronization between different threads. The scenario would be that thread1 would do its insert then write the uniqueidentifer result to a broker queue that thread2 (or multiple threads) are waiting on - it wakes up when the message is written and so it can proceed to do its further processing. This eliminates the need for thread1 to send events to other threads and efficiently puts thread2 (and multiple threads) asleep when there is nothing to do.
Thanks for the suggestion. I did raise the service broker possibility earlier. This is a majorly asynchronous process, and Thread2 initiates quite a bit of work outside the database. It will eventually pick up all rows inserted into the table even if it doesn't catch the most recently inserted one and even an hour or two delay will not be a problem. The developers have so much of that architecture built already, and frankly, I'd rather not accumulate any more responsibility for this project than I already have!
Jason Wolfkill
January 28, 2014 at 8:15 am
wolfkillj (1/24/2014)
The developers have so much of that architecture built already, and frankly, I'd rather not accumulate any more responsibility for this project than I already have!
LOL Understood - I deal with the same issues with the developers here. Its usually when things fail to work as they intended that they come back and ask for advice.
The probability of survival is inversely proportional to the angle of arrival.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply