Usage of functions in the stored procedures
I am a big fan of functions that they can be used in our SQL stored procedures or direct queries, because they are very handy and reusable. There are some functions that have been written in the application that I work, which will be like getting the session id or getting a date by applying the UTC Offset per the locale settings. Those functions I used to enjoy while I write new procedures. That saves some time by just calling them from my procedures and getting my work done, without knowing how actually they do so. I work in a company who develops Supply Chain Solutions for the global Supply Chain Leaders.
One fine day I got an issue from a customer saying the shipments page is not loading when he click the Shipment tab. It was taking incredibly more time and at last got timed out. On my investigation I found that the Shipment has more than 15,000 cases (These are the physical cases need to be loaded to a Truck/Trailer) associated with it. There was a function which will get the Store Id from some table with some conditions (joins). The culprit was this function. This function was called in the select statement. So for the 15,000 cases this function has been called repeatedly for the same data.
There are several ways to break down this scenario. One is to make the function inline rather than calling it. This can be done by having the function logic right there in our proc where it creates a bottleneck. This approach is good if each record set will be getting different values by the existing function call.
For my scenario the Store Id was always same for all the 15,000 records. So I used the second approach. The second approach is to get the value prior to performing the select. This will be applicable only if the value is remaining same throughout for the records. For my Shipments case it was true since all the cases belong to a particular Store.
There is a third approach. However it is unlikely to be used, since it is the best solution on a performance point of viewJ. This is applicable only if the data returned from the function is same for the entire rows.
Let’s see these concepts with an example to understand well. The Scenario is simple. We have two numbers whose sum needs to be found for each case associated with the shipment. We will see how we can do it using a function, making the call inline and using a variable outside select block.
We will create a table CaseDetail and populate the data.
CREATE TABLE CaseDetail (CaseId INT IDENTITY(1,1), QTY NUMERIC(14,4), weightedQty NUMERIC(14,4), ShipmentId INT)
--We assume that these cases belongs to Shipment with id 1
DECLARE @count INT
SET @count =15000
WHILE (@count >0)
BEGIN
INSERT INTO CaseDetail VALUES(200,300,1)
SET @count = @count -1
END
We have CaseDetail with 15,000 records inserted and they belong to the Shipment with id 1.
Now we will create the function for getting the sum calculation.
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_NULLS ON
GO
CREATE FUNCTION [dbo].[getSum]
(
@sid NVARCHAR(80),
@shipmentId INT
) RETURNS NUMERIC(14,4)
AS BEGIN
DECLARE @total NUMERIC(14,4)
SELECT @total = Qty+weightedQTY FROM dbo.CaseDetail WHERE shipmentId = @shipmentId
RETURN @total
END
Now we will have the procedure which uses this function.
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_NULLS ON
GO
CREATE PROCEDURE [dbo].[GetItemDetails](@sid NVARCHAR(80))
AS BEGIN
DECLARE@shipmentId INT
SET@shipmentId = 1
SELECT
QTY AS Qty,
WeightedQTY AS WeightedQty,
[dbo].[getSum](@sid,@shipmentId)as TOTAL
FROM CaseDetail WHERE ShipmentId = @shipmentId
END
Let’s try executing it (EXEC [dbo].[GetItemDetails] '8C77AD39-FDD1-43B5-81CE-02600B68A376'). We can see that it is taking some considerable amount of time. For me it took 128590ms. I have used the SQL Profiler (Start - >Run ‘Profiler’ for SQL Server 2000 or ‘Profiler90’ for SQL Server 2005) to get the execution duration stated above. Now let’s see avoiding the function call and making the function logic inline in the procedure.
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_NULLS ON
GO
CREATE PROCEDURE [dbo].[GetItemDetails_Inline](@sid NVARCHAR(80))
AS BEGIN
DECLARE@shipmentId INT
SET@shipmentId = 1
SELECT
Qty AS Qty,
weightedQTY AS WeightedQty,
Qty+weightedQTY AS TOTAL
FROM dbo.CaseDetail WHERE shipmentId = @shipmentId
END
While executing we can see that the performance have been increased considerably. For me it took just 506ms.
Now let’s see the best solution for the scenario, by getting the value in a variable. As I told earlier this will be applicable only if we are sure that the data is same for all the rows.
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_NULLS ON
GO
CREATE PROCEDURE [dbo].[GetItemDetails_Variable](@sid NVARCHAR(80))
AS BEGIN
DECLARE@shipmentId INT
DECLARE @total NUMERIC(14,4)
SET@shipmentId = 1
SELECT TOP 1 @total = qty + weightedQTY FROM CaseDetail WHERE shipmentId = @shipmentId
SELECT
Qty AS Qty,
weightedQty AS WeightedQty,
@total AS TOTAL
FROM CaseDetail WHERE shipmentId = @shipmentId
END@total as TOTAL
End
Execution of this proc took just 384ms.
So we have seen that calling a function from our select statement need to be chosen carefully, since it can make performance overheads when the function contains complex logic and/or there is a large number of records to return. We can use them if we are sure that the records will be minimal always.
Hope this information was helpful.
Thank you,
Sreeju