July 1, 2014 at 10:00 am
I have had to optimize several poorly performing stored procedures and queries at my company, and the solution is different in nearly every case. Here are a few things I would check or do to address this situation:
- I would run the query in the CTE and see (as I suspect) if this is where the issue is. If so, then tackle that first. Clearly, you've received lots of feedback on the REPLACE so that is no doubt the Achilles heel of the operation.
- I believe you mentioned there are no PKs or FKs. If they can't be added to the table, perhaps it's viable to put the data from table1 into a temp table, filtering on default_list_no = '00' during the insert, then add an index on account_number. Perhaps the same can be done with table4.
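A rough sketch of that staging idea (table1, account_number, and default_list_no are the placeholder names used in this thread, so adjust to the real schema):

```sql
-- Stage just the rows needed from table1, filtering during the insert
-- so the temp table stays small.
SELECT account_number, default_list_no
INTO #table1_stage
FROM table1
WHERE default_list_no = '00';

-- Index the join column so later joins can seek instead of scan.
CREATE INDEX ix_table1_stage_account ON #table1_stage (account_number);
```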
- This is a yellow, if not red flag to me:
AND (cStaging.cOrigCall IN (SELECT NewParameter FROM dbo.fnSplit(@ext, ',') AS fnSplit_1))
I would pull that out and do that first, dumping the results into a temp table, adding an index on NewParameter:
SELECT NewParameter
INTO #Split_1
FROM dbo.fnSplit(@ext, ',')

CREATE INDEX ix_Split_1 ON #Split_1 (NewParameter)
Now I can join to that temp table instead of using "cOrigCall IN (SELECT ...":
FROM CallTrace AS cStaging
inner join #Split_1 as s on cStaging.cOrigCall = s.NewParameter
I would defer to the other responses first, and if you are still having performance issues, then see if any of the above is helpful. I've had to get very creative over here, and these are just a few examples of the things I would look at.
Good luck.
July 1, 2014 at 10:09 am
create index ix_Split_1 on #Split_1 (NewParameter)
Good call on joining to a split-built derived table, and on how putting that into a temp table could make things faster. However, I doubt the index would help performance. In fact, I have VERY frequently found that single-use temp tables with user-added indexes lead to SLOWER overall performance: the cost of the scan, sort, and index build can be significant, and many servers already have a tempdb I/O problem. The optimizer can, and often will, create statistics on the joined-to column, so it can still produce very efficient plans without the index.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
July 1, 2014 at 10:41 pm
LSAdvantage (7/1/2014)
I have had to optimize several poorly performing stored procedures and queries at my company, and the solution is different in nearly every case. Here are a few things I would check or do to address this situation:
- I would run the query in the CTE and see (as I suspect) if this is where the issue is. If so, then tackle that first. Clearly, you've received lots of feedback on the REPLACE so that is no doubt the Achilles heel of the operation.
- I believe you mentioned there are no PKs or FKs. If they can't be added to the table, perhaps it's viable to put the data from table1 into a temp table, filtering on default_list_no = '00' during the insert, then add an index on account_number. Perhaps the same can be done with table4.
- This is a yellow, if not red flag to me:
AND (cStaging.cOrigCall IN (SELECT NewParameter FROM dbo.fnSplit(@ext, ',') AS fnSplit_1))
I would pull that out and do that first, dumping the results into a temp table, adding an index on NewParameter:
SELECT NewParameter
INTO #Split_1
FROM dbo.fnSplit(@ext, ',')

CREATE INDEX ix_Split_1 ON #Split_1 (NewParameter)
Now I can join to that temp table instead of using "cOrigCall IN (SELECT ...":
FROM CallTrace AS cStaging
inner join #Split_1 as s on cStaging.cOrigCall = s.NewParameter
I would defer to the other responses first and if you are still having performance issues, then see if any of the above is helpful. I've had to get very creative over here and these are just a few examples of the things I would look at.
Good luck.
fnSplit... can I see that code, please?
--Jeff Moden
Change is inevitable... Change for the better is not.
July 2, 2014 at 10:54 am
Yes, here it is...
USE [AA_Helper]
GO
/****** Object: UserDefinedFunction [dbo].[fnSplit] Script Date: 7/2/2014 12:53:58 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[fnSplit]
(
    @RepParam NVARCHAR(4000)
    , @Delim CHAR(1) = ','
)
RETURNS @Values TABLE (NewParameter NVARCHAR(4000)) AS
BEGIN
    DECLARE @chrind INT
    DECLARE @Piece NVARCHAR(255)

    SELECT @chrind = 1
    WHILE @chrind > 0
    BEGIN
        SELECT @chrind = CHARINDEX(@Delim, @RepParam)
        IF @chrind > 0
            SELECT @Piece = LEFT(@RepParam, @chrind - 1)
        ELSE
            SELECT @Piece = @RepParam

        INSERT INTO @Values(NewParameter)
        VALUES(@Piece)

        SELECT @RepParam = RIGHT(@RepParam, LEN(@RepParam) - @chrind)
        IF LEN(@RepParam) = 0 BREAK
    END
    RETURN
END
GO
July 2, 2014 at 10:57 am
Given that it's a single string being split, I don't think there's much to gain by changing that code.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
July 2, 2014 at 10:58 am
Can an index be put on the identity column that is created on the insert?
July 2, 2014 at 11:05 am
cbrammer1219 (7/2/2014)
Can an index be put on the identity column that is created on the insert?
Yes, it can be.
For overall performance, what is really critical is determining the best column(s) for the clustered index. After that, you can worry about non-clustered indexes.
Identity is only rarely the best clustered index key. Reviewing your index usage, and SQL Server's missing-index and index-usage stats, can help you determine the best clustered index column(s).
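For reference, the missing-index and index-usage stats mentioned here can be pulled from the standard DMVs; a minimal sketch (treat the output as suggestions to evaluate, not commands to obey):

```sql
-- Missing-index suggestions, roughly ordered by estimated benefit.
SELECT d.statement AS table_name,
       d.equality_columns, d.inequality_columns, d.included_columns,
       s.user_seeks, s.avg_total_user_cost, s.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
    ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS s
    ON s.group_handle = g.index_group_handle
ORDER BY s.avg_total_user_cost * s.avg_user_impact * s.user_seeks DESC;

-- How the existing indexes in the current database are actually used.
SELECT OBJECT_NAME(u.object_id) AS table_name, i.name AS index_name,
       u.user_seeks, u.user_scans, u.user_lookups, u.user_updates
FROM sys.dm_db_index_usage_stats AS u
JOIN sys.indexes AS i
    ON i.object_id = u.object_id AND i.index_id = u.index_id
WHERE u.database_id = DB_ID();
```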
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
July 2, 2014 at 11:33 am
1) Search SSC.com for DelimitedSplit8K for a MUCH better split function. However, I predict that the splitting improvements will have essentially ZERO effect on the performance of this process. But using it in a CROSS APPLY could get you a good index-seek plan, assuming small numbers of rows are hit on the joined-to tables and they are appropriately indexed for that.
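For illustration, assuming Jeff Moden's DelimitedSplit8K is installed (it returns ItemNumber and Item columns), the IN (SELECT ...) predicate from the original query could be rewritten as a join along these lines:

```sql
SELECT cStaging.*
FROM CallTrace AS cStaging
INNER JOIN dbo.DelimitedSplit8K(@ext, ',') AS s
    ON cStaging.cOrigCall = s.Item;  -- can seek if cOrigCall is indexed
```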
2)
Identity is only rarely the best clustered index key.
Oooh, sorry, but I have to take a pretty strong objection to that blanket statement.
3) I am not clear on what exactly the commenter meant by having a clustered index on an identity. If it was on the temp table/table variable, see my previous post about indexing temp objects.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
July 3, 2014 at 9:01 am
The main problem with this proc is that you reference the CTE twice. SQL Server will process it twice, because CTEs behave like views rather than temp tables, so that expensive remote query is seriously costing you. You can see this in the execution plan as two near-identical branches. All the implicit-conversion and data-type advice is good and you should sort that out, but you will transform the performance of this query by fetching that data only once.
Here's how I would approach something like this with a sample rewrite below:
1) Pull the data once into a temp table
- get all required columns and add the one WHERE clause (on Default_list_no); you might even omit the WHERE clause, as predicates can cause problems with linked servers.
- experiment with applying the WHERE clause (on Default_list_no) after getting the data,
i.e. if most records have default_list_no = '00', you might as well pull the whole table and then remove the few records you don't want.
2) Clean it
- You can see all that duplicated data-cleaning code; apply it once on data load, or do it once on your temp table.
3) then remove the records you don't want
--!!TODO Add error handling
--!!TODO Split @ext into a temp table and join to that instead
CREATE PROCEDURE [dbo].[usp_TraceCallReport2]
@startdate as date,
@endDate as date,
@ext as nvarchar(50)
AS
BEGIN
SET NOCOUNT ON
------------------------------------------------------------------------------------------------
-- Setup START
-- Create temp tables required for the proc
------------------------------------------------------------------------------------------------
IF OBJECT_ID('tempdb..#contacts') IS NOT NULL DROP TABLE #contacts
CREATE TABLE #contacts
(
AL_Contact_Name VARCHAR(50) NOT NULL,
CompanyName VARCHAR(50) NOT NULL,
phone VARCHAR(50) NOT NULL,
cleanPhoneNumber BIGINT NULL
)
-- Setup END
------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
-- Get data START
------------------------------------------------------------------------------------------------
INSERT INTO #contacts ( AL_Contact_Name, companyName, phone )
SELECT DISTINCT
a.Contact_Name as AL_Contact_Name
, c.AR_COMPANY_NAME AS CompanyName --00 is the default list number
, CONTACT_PHONE AS phone
FROM ARLSQ01.dsg.dbo.DICE_ALCONCT AS a
LEFT JOIN ARLSQ01.dsg.dbo.DICE_ARSUBHD AS b ON a.ACCOUNT_NUMBER = b.ACCOUNT_NUMBER
LEFT JOIN ARLSQ01.dsg.dbo.DICE_ARCUSTMN AS c ON b.AR_NUMBER = c.AR_NUMBER
WHERE a.Default_list_no = '00'
INSERT INTO #contacts ( AL_Contact_Name, companyName, phone )
SELECT DISTINCT
d.FIRST_NAME + ' ' + d.LAST_NAME as ar_CONTACT_NAME
, e.AR_COMPANY_NAME --001 is the default list number
, REPLACE(REPLACE(REPLACE(REPLACE(d.PHONE_NUMBER, '(', ''), ')', ''), ' ', ''), '-', '') AS arPhone
FROM ARLSQ01.dsg.dbo.DICE_ARCONTCT AS d
INNER JOIN ARLSQ01.dsg.dbo.DICE_ARCUSTMN AS e ON d.AR_NUMBER = e.AR_NUMBER
WHERE d.list_number = '0001'
-- Get data END
------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
-- Process data START
-- Clean and process the data
------------------------------------------------------------------------------------------------
-- Remove records you don't want now
DELETE #contacts WHERE phone LIKE '0%'
DELETE #contacts WHERE phone LIKE '1111%'
-- Keep one row per company. Note ROW_NUMBER is needed here: RANK()
-- ordered by the partition column would tie every row at rank 1.
;WITH cte AS
(
    SELECT *, rn = ROW_NUMBER() OVER(PARTITION BY companyName ORDER BY AL_Contact_Name)
    FROM #contacts
)
DELETE cte
WHERE rn > 1
-- Clean the number if you haven't done so already
UPDATE #contacts
SET cleanPhoneNumber = REPLACE(REPLACE(REPLACE(REPLACE(phone, '(', ''), ')', ''), ' ', ''), '-', '')
-- Process data END
------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
-- Resultset START
------------------------------------------------------------------------------------------------
-- Resultset
SELECT DISTINCT cStaging.ID
,cStaging.cDate
,cStaging.cStartTime
,cStaging.cDuration
,cStaging.cTimetoAnswer
,cStaging.callingparty
,cStaging.cDigitsDialed
,cStaging.cOrigCall
,cStaging.Cdestination
,cStaging.cTransfer1
,cStaging.cTransfer2
,cStaging.cCustPhone
,cStaging.cDirection
,cStaging.calledparty
,cStaging.cSystemID
,cStaging.cANI
,cStaging.cDNIS
,cStaging.cCallID
,cStaging.cCallIDSeq
,(
CASE
WHEN aj1.CompanyName IS NULL
THEN aj2.CompanyName
ELSE aj1.CompanyName
END
) AS CompanyName, aj1.*,aj2.*
FROM CallTrace AS cStaging
LEFT JOIN #contacts AS aj1 ON LTRIM(RTRIM(aj1.cleanPhoneNumber)) = cStaging.Cdestination
LEFT JOIN #contacts AS aj2 ON LTRIM(RTRIM(aj2.cleanPhoneNumber)) = cStaging.callingparty
WHERE (CONVERT(DATE, cStaging.cDate) BETWEEN @startdate AND @endDate) AND (CONVERT(DATE, cStaging.cDate) <> '')
AND (cStaging.cOrigCall IN (SELECT NewParameter FROM dbo.fnSplit(@ext, ',') AS fnSplit_1))
-- Resultset END
------------------------------------------------------------------------------------------------
RETURN
END
GO
On my local test rig, with 10,000 rows, this reduced the run time from 2 minutes to 1 second.
July 5, 2014 at 7:26 pm
TheSQLGuru (7/1/2014)
create index ix_Split_1 on #Split_1 (NewParameter)
Good call on joining to a split-built derived table, and on how putting that into a temp table could make things faster. However, I doubt the index would help performance. In fact, I have VERY frequently found that single-use temp tables with user-added indexes lead to SLOWER overall performance: the cost of the scan, sort, and index build can be significant, and many servers already have a tempdb I/O problem. The optimizer can, and often will, create statistics on the joined-to column, so it can still produce very efficient plans without the index.
I went through a rough few weeks trying to optimize stored procedures by tuning the indexes on temp tables, and just NOTHING helped (scans and scans and scans). Then I ran into a thread with you and Jeff saying to skip them, so I did, and cut the proc's run time in half. What also helped was skipping the CREATE TABLE #... syntax and just doing SELECT ... INTO instead. May I humbly suggest you add that to your Common T-SQL Mistakes talk? 😎
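The two patterns being compared look roughly like this (#contacts and dbo.SourceContacts are just illustrative names):

```sql
-- Explicit DDL followed by a fully logged INSERT ... SELECT:
CREATE TABLE #contacts (phone VARCHAR(50) NOT NULL);
INSERT INTO #contacts (phone)
SELECT phone FROM dbo.SourceContacts;

-- SELECT ... INTO derives the table from the result set in one step
-- and can be minimally logged in tempdb:
SELECT phone
INTO #contacts2
FROM dbo.SourceContacts;
```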
Thanks
July 5, 2014 at 8:18 pm
If I added everything I wanted to my Common TSQL Mistakes talk it would be about 327 hours long!! 🙂 Actually I did give the talk on a webex for over 300 developers for one of the largest banks in America and was still going strong almost 3 hours into it with still over 200 on the line and they were just firing question after question at me and I was extemporaneously expounding on topic after topic. 😎
Unfortunately the temp table thing isn't QUITE the no-brainer I would like it to be. Some edge cases really are that 0.02% or whatever. I would say the exceptions to "don't index temp tables" are maybe into the single-digit percentage range, but when indexing IS right, it is a magic bullet.
I do note that if you are on SQL 2012 build umptyscratch OR SQL 2014 the tempdb eager-write backoff makes the SELECT INTO a WAY WAY WAY more important thing to do!!!
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
July 5, 2014 at 11:45 pm
TheSQLGuru (7/5/2014)
If I added everything I wanted to my Common TSQL Mistakes talk it would be about 327 hours long!! 🙂 Actually I did give the talk on a webex for over 300 developers for one of the largest banks in America and was still going strong almost 3 hours into it with still over 200 on the line and they were just firing question after question at me and I was extemporaneously expounding on topic after topic. 😎
Unfortunately the temp table thing isn't QUITE the no-brainer I would like for it to be. Some edge case items are really that 0.02% or whatever. I would say the "don't index temp tables" thing is into the single digit percentage area maybe but when it IS right it is a magic bullet.
I do note that if you are on SQL 2012 build umptyscratch OR SQL 2014 the tempdb eager-write backoff makes the SELECT INTO a WAY WAY WAY more important thing to do!!!
Well, I could sit through it, but my wife might get pissed.
July 6, 2014 at 12:52 am
sqldriver (7/5/2014)
TheSQLGuru (7/1/2014)
create index ix_Split_1 on #Split_1 (NewParameter)
Good call on the joining to a split-built derived table and how putting that into a temp table could make things faster. However, I doubt the index would be helpful in making performance faster. In fact I have VERY frequently found that single-hit temp tables where users put indexes on them lead to SLOWER overall performance. The cost of the scan, sort, index build can be significant and many servers already have a tempdb IO problem. The optimizer can and often will create stats on the joined-to column so it can still get very efficient plans coming out of it.
I went through a rough few weeks trying to optimize stored procedures by tuning the indexes on temp tables, and just NOTHING helped (scans and scans and scans). Then I ran into a thread with you and Jeff saying to skip them, so I did and cut the proc time in half. What also helped was skipping the create #table syntax and just doing SELECT/INTO instead. May I humbly suggest you add that to your common T-SQL mistakes talk? 😎
Thanks
A non-clustered index on a temp heap table is indeed 100% a waste of time.
However, the right clustered index can be quite useful on a temp table if the index is created before the table is loaded. If a temp table is used multiple times, it's possible a clustered index could even be created after the table is loaded and still be very useful.
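A sketch of that ordering (names are illustrative): build the clustered index while the temp table is empty, then load it, so rows are inserted directly in index order rather than sorted and rebuilt afterwards:

```sql
CREATE TABLE #contacts
(
    cleanPhoneNumber BIGINT NULL,
    CompanyName      VARCHAR(50) NOT NULL
);

-- Clustered index first, while the table is empty...
CREATE CLUSTERED INDEX cx_contacts ON #contacts (cleanPhoneNumber);

-- ...then load; no separate sort/rebuild step is needed afterwards.
INSERT INTO #contacts (cleanPhoneNumber, CompanyName)
SELECT cleanPhoneNumber, CompanyName
FROM dbo.SourceContacts;  -- hypothetical source
```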
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
July 7, 2014 at 8:12 am
wBob (7/3/2014)
The main problem with this proc is that you reference the CTE twice. SQL Server will process it twice, because CTEs behave like views rather than temp tables, so that expensive remote query is seriously costing you. You can see this in the execution plan as two near-identical branches. All the implicit-conversion and data-type advice is good and you should sort that out, but you will transform the performance of this query by fetching that data only once.
Here's how I would approach something like this with a sample rewrite below:
1) Pull the data once into a temp table
- get all required columns and add the one WHERE clause (on Default_list_no); you might even omit the WHERE clause, as predicates can cause problems with linked servers.
- experiment with applying the WHERE clause (on Default_list_no) after getting the data,
i.e. if most records have default_list_no = '00', you might as well pull the whole table and then remove the few records you don't want.
2) Clean it
- You can see all that duplicated data-cleaning code; apply it once on data load, or do it once on your temp table.
3) then remove the records you don't want
[sample rewrite snipped; the full code is in wBob's post above]
On my local test rig, with 10,000 rows, this reduced the run time from 2 minutes to 1 second.
I have tried to eliminate the remote call by inserting the contact information into a table on the same server as the other data. I'll do a daily run of the insert to make sure I pick up any new accounts, or old accounts that are reactivated.
July 7, 2014 at 8:23 am
When using your proc here, I get this error:
Cannot insert the value NULL into column 'CompanyName', table 'tempdb.dbo.#contacts___________________________________________________________________________________________________________000000000173'; column does not allow nulls. INSERT fails.