Stored Procedure Performance

  • Well you can change the column definition in the #contacts temp table to allow NULLs, but this does again question your data quality.

  • cbrammer1219 (7/7/2014)


    wBob (7/3/2014)


    The main problem with this proc is that you reference the CTE twice. SQL Server will process that twice because CTEs behave like views rather than temp tables. So that expensive remote query is seriously costing you. You can see this in the execution plan with the two near-identical looking branches. All the implicit conversion and data-type advice is good, you should sort that out, but you will transform performance of this query by only getting that data once.

    Here's how I would approach something like this with a sample rewrite below:

    1) Pull the data once into a temp table

    - get all required columns, add the one WHERE clause (on Default_list_no); maybe even don't add the WHERE clause as these can cause problems with linked servers.

    - experiment with applying the WHERE clause (on Default_list_no) after getting the data

    ie if most records are default_list_no = 00 then you might as well pull the whole table, then remove the few records

    2) Clean it

    - You see all that duplicate code for data cleaning, apply it once on data load. Or do it on your temp table once.

    3) then remove the records you don't want

    --!!TODO Add error handling

    --!!TODO Split @ext into a temp table and join to that instead

    CREATE PROCEDURE [dbo].[usp_TraceCallReport2]

    @startdate as date,

    @endDate as date,

    @ext as nvarchar(50)

    AS

    BEGIN

    SET NOCOUNT ON

    ------------------------------------------------------------------------------------------------

    -- Setup START

    -- Create temp tables required for the proc

    ------------------------------------------------------------------------------------------------

    IF OBJECT_ID('#contacts') IS NOT NULL DROP TABLE #contacts

    CREATE TABLE #contacts

    (

    AL_Contact_NameVARCHAR(50) NOT NULL,

    CompanyNameVARCHAR(50) NOT NULL,

    phoneVARCHAR(50) NOT NULL,

    cleanPhoneNumberBIGINT NULL

    )

    -- Setup END

    ------------------------------------------------------------------------------------------------

    ------------------------------------------------------------------------------------------------

    -- Get data START

    ------------------------------------------------------------------------------------------------

    INSERT INTO #contacts ( AL_Contact_Name, companyName, phone )

    SELECT DISTINCT

    a.Contact_Name as AL_Contact_Name

    , c.AR_COMPANY_NAME AS CompanyName --00 is the default list number

    , CONTACT_PHONE AS phone

    FROM ARLSQ01.dsg.dbo.DICE_ALCONCT AS a

    LEFT JOIN ARLSQ01.dsg.dbo.DICE_ARSUBHD AS b ON a.ACCOUNT_NUMBER = b.ACCOUNT_NUMBER

    LEFT JOIN ARLSQ01.dsg.dbo.DICE_ARCUSTMN AS c ON b.AR_NUMBER = c.AR_NUMBER

    WHERE a.Default_list_no = '00'

    INSERT INTO #contacts ( AL_Contact_Name, companyName, phone )

    SELECT DISTINCT

    d.FIRST_NAME + ' ' + d.LAST_NAME as ar_CONTACT_NAME

    , e.AR_COMPANY_NAME --001 is the default list number

    , REPLACE(REPLACE(REPLACE(REPLACE(d.PHONE_NUMBER, '(', ''), ')', ''), ' ', ''), '-', '') AS arPhone

    FROM ARLSQ01.dsg.dbo.DICE_ARCONTCT AS d

    INNER JOIN ARLSQ01.dsg.dbo.DICE_ARCUSTMN AS e ON d.AR_NUMBER = e.AR_NUMBER

    WHERE d.list_number = '0001'

    -- Get data END

    ------------------------------------------------------------------------------------------------

    ------------------------------------------------------------------------------------------------

    -- Process data START

    -- Clean and process the data

    ------------------------------------------------------------------------------------------------

    -- Remove records you don't want now

    DELETE #contacts WHERE phone LIKE '0%'

    DELETE #contacts WHERE phone LIKE '1111%'

    -- Now do the rank?

    ;WITH cte AS

    (

    select *, rnk = RANK() OVER(PARTITION BY companyname ORDER BY companyName)

    from #contacts

    )

    DELETE cte

    WHERE rnk > 1

    -- Clean the number if you haven't done so already

    UPDATE #contacts

    SET cleanPhoneNumber = REPLACE(REPLACE(REPLACE(REPLACE(phone, '(', ''), ')', ''), ' ', ''), '-', '')

    -- Process data END

    ------------------------------------------------------------------------------------------------

    ------------------------------------------------------------------------------------------------

    -- Resultset START

    ------------------------------------------------------------------------------------------------

    -- Resultset

    SELECT DISTINCT cStaging.ID

    ,cStaging.cDate

    ,cStaging.cStartTime

    ,cStaging.cDuration

    ,cStaging.cTimetoAnswer

    ,cStaging.callingparty

    ,cStaging.cDigitsDialed

    ,cStaging.cOrigCall

    ,cStaging.Cdestination

    ,cStaging.cTransfer1

    ,cStaging.cTransfer2

    ,cStaging.cCustPhone

    ,cStaging.cDirection

    ,cStaging.calledparty

    ,cStaging.cSystemID

    ,cStaging.cANI

    ,cStaging.cDNIS

    ,cStaging.cCallID

    ,cStaging.cCallIDSeq

    ,(

    CASE

    WHEN aj1.CompanyName IS NULL

    THEN aj2.CompanyName

    ELSE aj1.CompanyName

    END

    ) AS CompanyName, aj1.*,aj2.*

    FROM CallTrace AS cStaging

    LEFT JOIN #contacts AS aj1 ON LTRIM(RTRIM(aj1.cleanPhoneNumber)) = cStaging.Cdestination

    LEFT JOIN #contacts AS aj2 ON LTRIM(RTRIM(aj2.cleanPhoneNumber)) = cStaging.callingparty

    WHERE (CONVERT(DATE, cStaging.cDate) BETWEEN @startdate AND @endDate) AND (CONVERT(DATE, cStaging.cDate) <> '')

    AND (cStaging.cOrigCall IN (SELECT NewParameter FROM dbo.fnSplit(@ext, ',') AS fnSplit_1))

    -- Resultset END

    ------------------------------------------------------------------------------------------------

    RETURN

    GO

    On my local test rig, for 10,000 rows, this reduced the performance from 2 minutes to 1 second.

    I have tried to eliminate the remote call by inserting the contact information into a table on the same server the other data is on, I'll do a daily run of the insert to make sure I get any new accounts added, or old accounts reactivated.

    Remove the NOT NULL constraints from the CREATE TABLE until the insert succeeds.

    IF OBJECT_ID('#contacts') IS NOT NULL DROP TABLE #contacts

    CREATE TABLE #contacts

    (

    AL_Contact_NameVARCHAR(50) NOT NULL,

    CompanyNameVARCHAR(50) NOT NULL,

    phoneVARCHAR(50) NOT NULL,

    cleanPhoneNumberBIGINT NULL

    )

  • I did that and it worked I got a lot of dups, so I did a MAX(cStaging.ID), and grouped that query of course, to avoid error, but then I got the error,

    Column '#contacts.AL_Contact_Name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. So I did a group by on the select of the #contact, but not sure where this error is occurring now.

  • As I can't see your data, it's hard to tell where the duplicates are coming in. Can you post an example of the duplicate data? Anonymise your actual customer names and numbers.

    Have you altered the code much from what was provided? The logic of the rewrite was intended to be the same, however I may have missed something. For example have you already implemented the local contacts table you described earlier? Why do you think the duplicates are appearing?

    Just out if interest, what was the performance like? It should be easier to remove the duplicates than it was to fix the performance.

    Just from the type of questions you are now asking I would do some research in the following areas:

    Understanding NULLs

    Understanding Aggregates

    Consider purchasing Itzik Ben Gan's T-SQL Fundamentals book.

  • It ran in 31 seconds, WAY FASTER! Here's the data. I have not implemented the Master Contacts table.

  • You can, and definitely should, avoid converting cStaging.cDate if you can.

    --if "cDate" is a datetime column:

    --instead of this:

    --WHERE (CONVERT(DATE, cStaging.cDate) BETWEEN @startdate AND @endDate) AND (CONVERT(DATE, cStaging.cDate) <> '')

    --use this:

    WHERE cStaging.cDate >= @startdate AND cStaging.cDate < DATEADD(DAY, 1, @endDate)

    --you don't need to check a date for ''

    --if "cDate" is a varchar format that can be accurately directly compared to a

    --datetime string, convert the variable to the matching format

    --rather than the column itself; examples:

    --if cDate is in format "120":

    WHERE cStaging.cDate >= CONVERT(varchar(19), @start_date, 120) AND

    cStaging.cDate < CONVERT(varchar(19), DATEADD(DAY, 1, @end_date), 120)

    --if cDate is in format 101:

    --<no valid character comparison format>

    --if cDate is in format 102/111:

    WHERE cStaging.cDate >= CONVERT(varchar(10), @start_date, 102) AND

    cStaging.cDate < CONVERT(varchar(10), DATEADD(DAY, 1, @end_date), 102)

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Hmm I still can't quite tell (I'm away from keyboard this week an only have phone!). The two LEFT JOINs might be an issue, remove them one by one to see where the duplicates come in. Otherwise add some debugging statements to the proc eg SELECT * FROM #contacts after it's populated, see if you can work it out.

    Are you sure you didn't have duplicates originally? You could try restoring the original join logic

  • I created the master contacts table to eliminate the remote call, It does have dups, only if the Company Name or Contact name is spelled differently. I am however still getting a lot more dups than is expected.

    Here is my code I have modified your existing stored procedure.

    USE [AA_Helper]

    GO

    /****** Object: StoredProcedure [dbo].[usp_TraceCallReport2] Script Date: 7/3/2014 10:48:01 AM ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    ALTER PROCEDURE [dbo].[usp_TraceCallReport2]

    @startdate as date,

    @endDate as date,

    @ext as nvarchar(60)

    AS

    BEGIN

    SET NOCOUNT ON

    ------------------------------------------------------------------------------------------------

    -- Setup START

    -- Create temp tables required for the proc

    ------------------------------------------------------------------------------------------------

    IF OBJECT_ID('#contacts') IS NOT NULL DROP TABLE [#contacts]

    CREATE TABLE [dbo].[#contacts](

    [AR_COMPANY_NAME] [nvarchar](100) NOT NULL,

    [LIST_NO] [nvarchar](max) NULL,

    [CONTACT_NAME] [nvarchar](max) NULL,

    [PHONE] [nvarchar](35) NOT NULL

    )ON [PRIMARY]

    IF OBJECT_ID('ContactMaster') IS NOT NULL DROP TABLE [dbo].[ContactMaster]

    CREATE TABLE [dbo].[ContactMaster](

    [AR_COMPANY_NAME] [nvarchar](100) NOT NULL,

    [LIST_NO] [nvarchar](4) NULL,

    [CONTACT_NAME] [nvarchar](100) NOT NULL,

    [PHONE] [nvarchar](35) NOT NULL,

    PRIMARY KEY CLUSTERED

    (

    [AR_COMPANY_NAME] ASC,

    [PHONE] ASC,

    [CONTACT_NAME] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    -- Setup END

    ------------------------------------------------------------------------------------------------

    ------------------------------------------------------------------------------------------------

    -- Get data START

    ------------------------------------------------------------------------------------------------

    INSERT INTO [#contacts] (AR_COMPANY_NAME, list_no ,CONTACT_NAME, phone )

    SELECT DISTINCT

    LTRIM(RTRIM(a.AR_COMPANY_NAME)) as COMPANY_NAME, c.default_list_no as list_no

    , c.Contact_Name AS CONTACT_NAME --00 is the default list number

    , case

    when ltrim(rtrim(c.CONTACT_PHONE))='( ) -' then ''

    when ltrim(rtrim(c.CONTACT_PHONE))='()' then ''

    when ltrim(rtrim(c.CONTACT_PHONE)) like '([0-9] ) -' then ''

    when ltrim(rtrim(c.CONTACT_PHONE)) like '([0-9][0-9] ) -' then ''

    else c.CONTACT_PHONE

    end as PHONE

    FROM ARLSQ01.dsg.dbo.DICE_ARCUSTMN AS a

    LEFT JOIN ARLSQ01.dsg.dbo.DICE_ARSUBHD AS b ON a.AR_NUMBER = b.AR_NUMBER

    LEFT JOIN ARLSQ01.dsg.dbo.DICE_ALCONCT AS c ON b.ACCOUNT_NUMBER = c.ACCOUNT_NUMBER

    WHERE c.CONTACT_PHONE is not NULL

    INSERT INTO [#contacts](AR_COMPANY_NAME, list_no ,CONTACT_NAME, phone)

    SELECT DISTINCT

    LTRIM(RTRIM(d.AR_COMPANY_NAME)) as COMPANY_NAME , e.list_Number as list_no--001 is the default list number

    , rtrim(ltrim(rtrim(e.FIRST_NAME) + ' ' + e.LAST_NAME)) as CONTACT_NAME

    , case

    when ltrim(rtrim(e.PHONE_NUMBER))='( ) -' then ''

    when ltrim(rtrim(e.PHONE_NUMBER))='()' then ''

    when ltrim(rtrim(e.PHONE_NUMBER)) like '([0-9] ) -' then ''

    when ltrim(rtrim(e.PHONE_NUMBER)) like '([0-9][0-9] ) -' then ''

    when ltrim(rtrim(e.PHONE_NUMBER)) like '1111%' then ''

    when ltrim(rtrim(e.PHONE_NUMBER)) like '0%' then ''

    else e.PHONE_NUMBER

    end as PHONE

    FROM ARLSQ01.dsg.dbo.DICE_ARCUSTMN AS d

    INNER JOIN ARLSQ01.dsg.dbo.DICE_ARCONTCT AS e ON d.AR_NUMBER = e.AR_NUMBER

    WHERE e.PHONE_NUMBER is not NULL

    -- Get data END

    ------------------------------------------------------------------------------------------------

    DELETE FROM #contacts where LIST_NO = 'XXXX'

    ;WITH cte AS

    (

    select AR_COMPANY_NAME,LIST_NO,contact_name,phone, rnk = RANK() OVER(PARTITION BY AR_COMPANY_NAME,phone ORDER BY list_no)

    from #contacts

    )

    INSERT INTO [dbo].[ContactMaster]([AR_COMPANY_NAME],[LIST_NO],[CONTACT_NAME] ,[PHONE])

    SELECT AR_COMPANY_NAME,LIST_NO,contact_name,phone FROM cte

    WHERE rnk = 1

    -- Clean the number if you haven't done so already

    UPDATE #contacts

    SET phone = REPLACE(REPLACE(REPLACE(REPLACE(phone, '(', ''), ')', ''), ' ', ''), '-', '') --remove clean portion of number above

    -- Process data END

    ------------------------------------------------------------------------------------------------

    ------------------------------------------------------------------------------------------------

    -- Resultset START

    ------------------------------------------------------------------------------------------------

    -- Resultset

    SELECT DISTINCT cStaging.ID

    ,cStaging.cDate

    ,cStaging.cStartTime

    ,cStaging.cDuration

    ,cStaging.cTimetoAnswer

    ,cStaging.callingparty

    ,cStaging.cDigitsDialed

    ,cStaging.cOrigCall

    ,cStaging.Cdestination

    ,cStaging.cTransfer1

    ,cStaging.cTransfer2

    ,cStaging.cCustPhone

    ,cStaging.cDirection

    ,cStaging.calledparty

    ,cStaging.cSystemID

    ,cStaging.cANI

    ,cStaging.cDNIS

    ,cStaging.cCallID

    ,cStaging.cCallIDSeq

    ,(

    CASE

    WHEN aj1.AR_COMPANY_NAME IS NULL

    THEN aj2.CONTACT_NAME

    ELSE aj1.AR_COMPANY_NAME

    END

    ) AS CompanyName, aj1.*,aj2.*

    FROM CallTrace AS cStaging

    LEFT OUTER JOIN ContactMaster aj1 on REPLACE(REPLACE(REPLACE(REPLACE(aj1.PHONE, '(', ''), ')', ''), ' ', ''), '-', '') = cStaging.Cdestination

    LEFT OUTER JOIN ContactMaster aj2 on REPLACE(REPLACE(REPLACE(REPLACE(aj2.PHONE, '(', ''), ')', ''), ' ', ''), '-', '') = cStaging.Cdestination

    WHERE cStaging.cDate BETWEEN '2014-06-24' AND '2014-06-24'

    AND (cStaging.cOrigCall IN (SELECT NewParameter FROM dbo.fnSplit('2005', ',') AS fnSplit_1))

    -- Resultset END

    ------------------------------------------------------------------------------------------------

    RETURN

    END

    GO

    --select * FROM CallTrace

    --execute [dbo].[usp_TraceCallReport2] @startdate = '6/24/2014', @enddate = '6/24/2014', @ext = '2005'

  • There are a number of problems with that piece of code (and this is just what I can see via my phone):

    - the date parameters are not used in the proc

    - ContactMaster should probably be a temp table (#)

    - you join twice to ContactMaster on the same colum, Cdestination <-- MAJOR PROBLEM

    - you appear to have missed my point about duplicating code; clean the data once and once only. Try to avoid joins with functions.

    - you appear to have got rid of the cleanPhone column. Why?

    Make these corrections, see if you still get the duplicates.

Viewing 9 posts - 46 through 53 (of 53 total)

You must be logged in to reply to this topic. Login to reply