Sending batches of SPs at one time from Application

  • The developers I am working with are taking to the idea of stored procedures, which is great. The problem is that now they seem to be sending batches of stored procedure calls to the server all at one time. I don't think this is beneficial. Here is an example of what I am seeing thrown at the server:

    EXEC [DBO].[GET_XXXCONTACTS] @P_TRANSACTIONSID;
    EXEC [DBO].[GET_XXXASSIGNEDCONTACTS] @P_TRANSACTIONSID;
    EXEC [DBO].[GET_XXXTRANSACTIONS] @P_TRANSACTIONSID;
    EXEC [DBO].[GET_XXXCOST] @P_TRANSACTIONSID;
    EXEC [DBO].[GET_XXXRATE] @P_TRANSACTIONSID;
    EXEC [DBO].[GET_TAXRATEG] @P_TRANSACTIONSID;
    EXEC [DBO].[GET_PO] @P_TRANSACTIONSID;

    I have removed most of the other transactions. There are about 25-30 of them. Here are the issues I have with this -

    1) It makes performance debugging a nightmare. All of the transactions are currently taking 860 microseconds; I cannot see what any one transaction is taking.

    2) From the application side, is there really a benefit to hitting the server once vs. 30 times? I know the server is still processing 30 commands, so I cannot imagine a speed gain here.

    3) I would think I/O would suffer from trying to jam a bunch of data through the network all at once.

    Any thoughts would be appreciated

  • I agree with you: it is an ugly and unscalable way to interact with a database server. They should pull all of the rows they need relating to a given transactionID in one batch and receive them as a result set.

    I sympathize. The sad part about all of this is that the typical developer doesn't understand the power of set-based processing and the efficiency of relational logic. Rarely in my career have I seen a development project leverage the database architecture as a major part of a system design. Consequently, they almost always experience performance and scalability issues later on.

    All you can do is send a memo stating your concerns, recommendations, and the reasons why (be sure to save a copy of it for later, when the finger pointing starts). Good luck.

    The probability of survival is inversely proportional to the angle of arrival.

  • I understand what you are saying because I feel that way often... The problem is that I am the chosen Gatekeeper or Architect. They will listen if I steer them in the correct direction. I just want to make sure that I fuel my argument (many to one). So I am hoping that you and others can give concrete examples of where this approach fails.

  • Well here are two concrete reasons why not to do it the way they are doing it:

    1) Performance: there is an overhead associated with each batch, from the network latency of the packets being sent to the result coming back. Each transaction increases contention on the server, and a fair amount of work is involved in setting up and optimizing each query (batch). It is much better to incur this overhead one time for X amount of data instead of X times for X amount of data.

    2) Data integrity: the only way to guarantee that all the separate pieces of information relating to a given transactionID are consistent and correct is to request them and return them in a single transaction. Remember, the data can change between the time the first batch is sent and the last batch is sent. This can be the source of very bad bugs in your application. Data integrity/consistency is job #1 for DBAs (whether developers understand the concept or not).

    The probability of survival is inversely proportional to the angle of arrival.

  • JKSQL (5/12/2011)


    The developers I am working with are taking to the idea of stored procedures, which is great. The problem is that now they seem to be sending batches of stored procedure calls to the server all at one time. I don't think this is beneficial. Here is an example of what I am seeing thrown at the server:

    EXEC [DBO].[GET_XXXCONTACTS] @P_TRANSACTIONSID;
    EXEC [DBO].[GET_XXXASSIGNEDCONTACTS] @P_TRANSACTIONSID;
    EXEC [DBO].[GET_XXXTRANSACTIONS] @P_TRANSACTIONSID;
    EXEC [DBO].[GET_XXXCOST] @P_TRANSACTIONSID;
    EXEC [DBO].[GET_XXXRATE] @P_TRANSACTIONSID;
    EXEC [DBO].[GET_TAXRATEG] @P_TRANSACTIONSID;
    EXEC [DBO].[GET_PO] @P_TRANSACTIONSID;

    1) It makes performance debugging a nightmare. All of the transactions are currently taking 860 microseconds; I cannot see what any one transaction is taking.

    You ought to be able to zero in on the procedure-call level within a batch. Look into the Profiler events "Stored Procedures/SP:Starting" and "Stored Procedures/SP:Completed" instead of looking only at the batch level.
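
    If Profiler is not convenient, a rough per-procedure view is also available from the sys.dm_exec_procedure_stats DMV (SQL Server 2008 and later). A minimal sketch; the LIKE filter on the GET_ prefix is only an assumption based on the names in your batch:

    -- Rough per-procedure timings from the plan cache (times are in microseconds).
    SELECT  OBJECT_NAME(ps.object_id, ps.database_id)      AS proc_name,
            ps.execution_count,
            ps.total_elapsed_time / ps.execution_count     AS avg_elapsed_microseconds,
            ps.last_elapsed_time,
            ps.total_logical_reads / ps.execution_count    AS avg_logical_reads
    FROM    sys.dm_exec_procedure_stats AS ps
    WHERE   OBJECT_NAME(ps.object_id, ps.database_id) LIKE 'GET_%'   -- assumed naming convention
    ORDER BY ps.total_elapsed_time DESC;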

    2) From the application side, is there really a benefit to hitting the server once vs. 30 times? I know the server is still processing 30 commands, so I cannot imagine a speed gain here.

    Yes, there is. It may make it seem like troubleshooting is more difficult than necessary, but the performance of one batch with 30 proc calls will almost certainly be better than that of 30 separate batches across 30 separate calls. If you test it in your environment, please let us know.

    3) I would think I/O would suffer from trying to jam a bunch of data through the network all at once.

    Not really...no more than 30 separate calls. Once SQL Server has built the results it will start delivering the data to the client as fast as the client can accept it.


    I am mostly speculating since I cannot see the proc code; however, the alternative to doing it the way your developers are doing it could be to have one procedure that accepts a table-valued parameter (TVP) containing all the TRANSACTIONSIDs they want information for, with that procedure returning a single resultset. IMHO the batched approach they are using is not altogether bad, and it is better than some of the alternatives that come to mind, including making 30 separate calls.
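
    Purely for illustration, a TVP-based version might look something like this. The type, procedure, table, and column names below are hypothetical stand-ins, not your actual schema:

    -- Hypothetical table type carrying the ids the client wants (SQL Server 2008+).
    CREATE TYPE dbo.TransactionIdList AS TABLE (TransactionId INT NOT NULL PRIMARY KEY);
    GO
    -- Hypothetical procedure returning one resultset for the whole list of ids.
    CREATE PROCEDURE dbo.GET_CONTACTS_FOR_TRANSACTIONS
        @TransactionIds dbo.TransactionIdList READONLY
    AS
    BEGIN
        SET NOCOUNT ON;
        SELECT  c.*
        FROM    dbo.Contacts AS c              -- hypothetical table
        JOIN    @TransactionIds AS t
          ON    t.TransactionId = c.TransactionId;
    END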

    You say the developers have taken to the idea of stored procedures...that's great!...but in this instance unless they move to a procedure that accepts a TVP I would argue that a prepared SQL statement using an IN clause might be a better option for performance and will ease troubleshooting...that said, try your best to keep them in stored procedures! What your developers have chosen is actually in the middle between stored procedure utopia and embedded SQL hell.

    What is the client language? If it's .NET then mapping ADO.NET DataTables to a TVP is a cinch and scales extremely well.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • It is .NET, but the problem I have is that we are forced to code to 2005 and greater. That means my TVP is a CSV. They are already using CTEs in many of these procs; to add another temp table will crush my inner being.

  • sturner (5/12/2011)


    Well here are two concrete reasons why not to do it the way they are doing it:

    1) Performance: there is an overhead associated with each batch, from the network latency of the packets being sent to the result coming back. Each transaction increases contention on the server, and a fair amount of work is involved in setting up and optimizing each query (batch). It is much better to incur this overhead one time for X amount of data instead of X times for X amount of data.

    Currently 30 calls need to be made...whether in 1 or 30 batches, so arguing batches is pointless unless they get to a place where one resultset can be delivered with all necessary data. 1 DB call with a batch of 30 proc executions in it is still better than 30 DB calls with 1 proc execution in each batch.

    2) Data integrity: the only way to guarantee that all the separate pieces of information relating to a given transactionID are consistent and correct is to request them and return them in a single transaction. Remember, the data can change between the time the first batch is sent and the last batch is sent. This can be the source of very bad bugs in your application. Data integrity/consistency is job #1 for DBAs (whether developers understand the concept or not).

    This can be solved by running the batch in either the SNAPSHOT or SERIALIZABLE isolation level.
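
    As a minimal sketch of what that could look like, using the procedure names from the original batch (SNAPSHOT assumes ALLOW_SNAPSHOT_ISOLATION has been enabled on the database):

    -- @P_TRANSACTIONSID is supplied by the application, exactly as in the original batch.
    SET TRANSACTION ISOLATION LEVEL SNAPSHOT;   -- or SERIALIZABLE
    BEGIN TRANSACTION;
        EXEC [DBO].[GET_XXXCONTACTS] @P_TRANSACTIONSID;
        EXEC [DBO].[GET_XXXASSIGNEDCONTACTS] @P_TRANSACTIONSID;
        -- ...remaining GET_* procedure calls from the batch...
    COMMIT TRANSACTION;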

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • JKSQL (5/12/2011)


    It is .NET, but the problem I have is that we are forced to code to 2005 and greater. That means my TVP is a CSV. They are already using CTEs in many of these procs; to add another temp table will crush my inner being.

    OK, you posted in a 2008 forum so I went for TVPs 😀

    Having to set up a temp table before making a proc call definitely lends itself to inner-being-crushing.

    You could use XML as an input parameter, or a delimited list. It's a well-covered topic and scales, albeit not as well as the TVP.
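
    A rough sketch of the XML option, which works on 2005 and later; the procedure, table, and column names here are hypothetical:

    -- Hypothetical procedure taking an XML list of ids (SQL Server 2005+).
    CREATE PROCEDURE dbo.GET_CONTACTS_BY_ID_LIST
        @TransactionIds XML
    AS
    BEGIN
        SET NOCOUNT ON;
        SELECT  c.*
        FROM    @TransactionIds.nodes('/ids/id') AS x(id_node)
        JOIN    dbo.Contacts AS c              -- hypothetical table
          ON    c.TransactionId = x.id_node.value('.', 'INT');
    END
    GO
    -- Example call:
    EXEC dbo.GET_CONTACTS_BY_ID_LIST @TransactionIds = N'<ids><id>101</id><id>102</id></ids>';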

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (5/12/2011)


    Currently 30 calls need to be made...whether in 1 or 30 batches, so arguing batches is pointless unless they get to a place where one resultset can be delivered with all necessary data. 1 DB call with a batch of 30 proc executions in it is still better than 30 DB calls with 1 proc execution in each batch.

    The point that I was trying to make is that all of those separate queries take extra overhead -- even though the query plans may be cached. Yes, one batch of 30 stored procedure executions is better than 30 batches of 1, but it is still not optimal.

    opc.three (5/12/2011)


    This can be solved by running the batch in either the SNAPSHOT or SERIALIZABLE isolation level.

    Yes, that is a hack way of masking the underlying issue, but it is less than optimal and reduces overall concurrency ... especially when this type of thing occurs a lot.

    The probability of survival is inversely proportional to the angle of arrival.

  • sturner (5/12/2011)


    opc.three (5/12/2011)


    This can be solved by running the batch in either the SNAPSHOT or SERIALIZABLE isolation level.

    Yes, that is a hack way of masking the underlying issue, but it is less than optimal and reduces overall concurrency ... especially when this type of thing occurs a lot.

    I disagree with your assessment wholeheartedly. It's a valid way to handle the problem. There is a tradeoff between data integrity and concurrency but setting the isolation level to get the data integrity your application requires is core to what a DBMS is all about.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (5/12/2011)


    sturner (5/12/2011)


    opc.three (5/12/2011)


    This can be solved by running the batch in either the SNAPSHOT or SERIALIZABLE isolation level.

    Yes, that is a hack way of masking the underlying issue, but it is less than optimal and reduces overall concurrency ... especially when this type of thing occurs a lot.

    I disagree with your assessment wholeheartedly. It's a valid way to handle the problem. There is a tradeoff between data integrity and concurrency but setting the isolation level to get the data integrity your application requires is core to what a DBMS is all about.

    And I, sir, remain in complete disagreement with you. To set a higher isolation level so you can throw a bunch of separate queries at a server, when one single query (in a single stored procedure) would suffice, is a poor design. It has higher overhead and reduces concurrency, plain and simple.

    But good luck with that ... if you're happy with a less efficient design then who am I to argue with you.

    The probability of survival is inversely proportional to the angle of arrival.

  • sturner (5/12/2011)


    opc.three (5/12/2011)


    sturner (5/12/2011)


    opc.three (5/12/2011)


    This can be solved by running the batch in either the SNAPSHOT or SERIALIZABLE isolation level.

    Yes, that is a hack way of masking the underlying issue, but it is less than optimal and reduces overall concurrency ... especially when this type of thing occurs a lot.

    I disagree with your assessment wholeheartedly. It's a valid way to handle the problem. There is a tradeoff between data integrity and concurrency but setting the isolation level to get the data integrity your application requires is core to what a DBMS is all about.

    And I, sir, remain in complete disagreement with you. To set a higher isolation level so you can throw a bunch of separate queries at a server, when one single query (in a single stored procedure) would suffice, is a poor design. It has higher overhead and reduces concurrency, plain and simple.

    But good luck with that ... if you're happy with a less efficient design then who am I to argue with you.

    I take issue with your assessment that utilizing different isolation levels to support application requirements is a hack. If that were the case, then why not just run everything as SERIALIZABLE, or READ UNCOMMITTED...or...see where I am going with this? I am just working within the requirements here and utilizing the tools SQL gives us...isolation levels are defined in the SQL standard and they have definable uses...if you want to change the rules of the game that's fine, but that word hack is very provocative, so expect a response when you use it incorrectly.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (5/12/2011)


    sturner (5/12/2011)


    opc.three (5/12/2011)


    sturner (5/12/2011)


    opc.three (5/12/2011)


    This can be solved by running the batch in either the SNAPSHOT or SERIALIZABLE isolation level.

    Yes, that is a hack way of masking the underlying issue, but it is less than optimal and reduces overall concurrency ... especially when this type of thing occurs a lot.

    I disagree with your assessment wholeheartedly. It's a valid way to handle the problem. There is a tradeoff between data integrity and concurrency but setting the isolation level to get the data integrity your application requires is core to what a DBMS is all about.

    And I, sir, remain in complete disagreement with you. To set a higher isolation level so you can throw a bunch of separate queries at a server, when one single query (in a single stored procedure) would suffice, is a poor design. It has higher overhead and reduces concurrency, plain and simple.

    But good luck with that ... if you're happy with a less efficient design then who am I to argue with you.

    I take issue with your assessment that utilizing different isolation levels to support application requirements is a hack. If that were the case, then why not just run everything as SERIALIZABLE, or READ UNCOMMITTED...or...see where I am going with this? I am just working within the requirements here and utilizing the tools SQL gives us...isolation levels are defined in the SQL standard and they have definable uses...if you want to change the rules of the game that's fine, but that word hack is very provocative, so expect a response when you use it incorrectly.

    You have completely diverged from the subject of this thread. One could argue that pessimistic locking is a valid solution to data consistency as well, but it is not a good solution, for obvious reasons.

    The original poster questioned the wisdom of throwing a whole bunch of separate queries at a server to obtain a consistent set of information for a given transactionID. Yes, raising the isolation level when this sort of inefficient methodology is used will ensure data consistency, but at the price of lower concurrency and higher overhead. As long as all parties understand this, there is no further issue to discuss.

    The probability of survival is inversely proportional to the angle of arrival.

  • sturner (5/12/2011)


    opc.three (5/12/2011)


    sturner (5/12/2011)


    opc.three (5/12/2011)


    sturner (5/12/2011)


    opc.three (5/12/2011)


    This can be solved by running the batch in either the SNAPSHOT or SERIALIZABLE isolation level.

    Yes, that is a hack way of masking the underlying issue, but it is less than optimal and reduces overall concurrency ... especially when this type of thing occurs a lot.

    I disagree with your assessment wholeheartedly. It's a valid way to handle the problem. There is a tradeoff between data integrity and concurrency but setting the isolation level to get the data integrity your application requires is core to what a DBMS is all about.

    And I, sir, remain in complete disagreement with you. To set a higher isolation level so you can throw a bunch of separate queries at a server, when one single query (in a single stored procedure) would suffice, is a poor design. It has higher overhead and reduces concurrency, plain and simple.

    But good luck with that ... if you're happy with a less efficient design then who am I to argue with you.

    I take issue with your assessment that utilizing different isolation levels to support application requirements is a hack. If that were the case, then why not just run everything as SERIALIZABLE, or READ UNCOMMITTED...or...see where I am going with this? I am just working within the requirements here and utilizing the tools SQL gives us...isolation levels are defined in the SQL standard and they have definable uses...if you want to change the rules of the game that's fine, but that word hack is very provocative, so expect a response when you use it incorrectly.

    You have completely diverged from the subject of this thread. One could argue that pessimistic locking is a valid solution to data consistency as well, but it is not a good solution, for obvious reasons.

    The original poster questioned the wisdom of throwing a whole bunch of separate queries at a server to obtain a consistent set of information for a given transactionID. Yes, raising the isolation level when this sort of inefficient methodology is used will ensure data consistency, but at the price of lower concurrency and higher overhead. As long as all parties understand this, there is no further issue to discuss.

    "Good" being subjective...I would have accepted using SERIALIZABLE as "not good in your opinion" or "not preferrable in your opinion" but I do not accept "hack"...and what "underlying issue" is using SERIALIZABLE masking? Sending a batch containing multiple proc calls to SQL Server is a mainstream use case.

    opc.three (5/12/2011)


    sturner (5/12/2011)


    2) Data integrity: the only way to guarantee that all the separate pieces of information relating to a given transactionID are consistent and correct is to request them and return them in a single transaction. Remember, the data can change between the time the first batch is sent and the last batch is sent. This can be the source of very bad bugs in your application. Data integrity/consistency is job #1 for DBAs (whether developers understand the concept or not).

    This can be solved by running the batch in either the SNAPSHOT or SERIALIZABLE isolation level.

    I was actually bolstering the point you originally alluded to and provided the implementation detail necessary to get there.

    At any rate, you are right, we have veered off course and the proverbial dead horse is beyond beaten.

    I think we're on the same page...sorry if I over-reacted...refer to my earlier point about the word hack for reasons behind that 😛

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Well, that was a good read between you guys. Thanks for the effort. So I looked at the performance of a few things that came to mind, and we ran a small test. The problem is that this test is not a production-level test.

    The test was on inserts, not selects.

    1) Hit the server 160 times, which took an overall time of 28 seconds; that is what the application reported back when it said it was done. These are heavy transactions (VARBINARY(MAX), files averaging 30K). Server time was 4.39 seconds.

    2) Bulk the 160 transactions into one statement (SELECT ... UNION ALL; a sketch follows this list) and submit the query. The app reported back 11 seconds; server time was 7.03 seconds.

    3) Gather the procs (EXEC Proc; EXEC Proc; ...) and send them in as one statement. It actually took longer: the app responded at 28 seconds, and the server time processing the procs went from 4.39 to 7.2 seconds.
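
    For anyone following along, the UNION ALL pattern in option 2 looks roughly like this; the table, columns, and values below are hypothetical placeholders, not the real schema:

    -- Hypothetical table and values illustrating the single-statement insert in option 2.
    DECLARE @Id1 INT, @Id2 INT;
    DECLARE @File1 VARBINARY(MAX), @File2 VARBINARY(MAX);
    SELECT @Id1 = 1, @Id2 = 2, @File1 = 0x0102, @File2 = 0x0304;   -- placeholder data

    INSERT INTO dbo.TransactionFiles (TransactionId, FileData)     -- hypothetical table
    SELECT @Id1, @File1
    UNION ALL
    SELECT @Id2, @File2;
    -- ...repeat the UNION ALL arm for each of the remaining rows (160 in the test)...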

    So it seems that the server workload went up but the application did not see the speed advantage; it just moved the workload. I think network bandwidth is a limit in this case. I used inserts just because I was debugging something like this a while back and had code already.

    I am pretty much torn between multiple calls vs. one. I believe that for a select with small amounts of I/O there is a benefit to making it one call. I think that will actually reduce the locks and provide cleaner data.

    When getting larger amounts of data, I do not believe there is an advantage to bulk submitting.

    I am of the opinion that it is a draw. It seems like I always ask that type of question, where the book says to do individual calls, but with network issues and speed it may be beneficial to couple the queries together.

    So unless there is some holy grail of "DON'T do it," I will have to give this as a win to the developers. I am looking for the best performance.
