Help with Split Proc

  • This was a nightmare procedure and you guys helped me with it last time. I added the split function and it worked great. However, now it's taking over 1 1/2 minute to run (use to be 15 seconds) when I  run profiler, it seems to be hanging on the 'split' portion of the script. Anyone have any idea's, is there something else I can use besides the split that would have better performance?

     

    SET QUOTED_IDENTIFIER ON

    GO

    SET ANSI_NULLS ON

    GO

     

    ALTER    PROCEDURE dbo.usp_GetRNLeadsCapOptimization --30,1,1, 0, 1, 94546

     @LookBackDays   INT,

     @LeadTimeFrame   INT,

     @PropertyValue   INT,

     @WorkWithRealtor TINYINT,

     @LeadTypeID  TINYINT,

     @strZIP   VARCHAR(5)

     

    AS

     DECLARE @PriceRange MONEY,  

      @DateIn     DATETIME,

      @string     varchar(200)

    BEGIN

        

       --SET  @DateIn = CAST(CONVERT(VARCHAR(10), DATEADD(d, -@LookBackDays, CURRENT_TIMESTAMP), 101) AS DATETIME) + CAST('12:00:00AM' AS DATETIME)

       --above line commented and substituted for devdb1 testing

       SET @DateIn = CAST(CONVERT(VARCHAR(10), DATEADD(d, -@LookBackDays, '03/01/2006'), 101) AS DATETIME) + CAST('12:00:00AM' AS DATETIME)

       SET @PriceRange =

     (CASE WHEN @PropertyValue BETWEEN 1 AND 10

      THEN @PropertyValue * 100000

      WHEN @PropertyValue BETWEEN 11 AND 18

      THEN (@PropertyValue -10) * 1000000 + 1000000

                            ELSE 10000000 end)

     

    if @LeadTimeFrame = 1       set @string='24'

    else if @LeadTimeFrame = 2  set @string='23,24'

    else if @LeadTimeFrame = 3  set @string='23,24,25'

    else if @LeadTimeFrame = 4  set @string='23,24,25,107'

    else if @LeadTimeFrame = 5  set @string='96'

       IF (@WorkWithRealtor = 0)

       BEGIN

      IF (@LeadTypeID = 1)

         SELECT tlq.lead_id, tlq.parent_lead_id, tlq.lead_type_id,

         tlq.buy_zip_1, ISNULL(tlq.buy_zip_2, '') AS buy_zip_2, ISNULL(tlq.buy_zip_3, '') AS buy_zip_3, tlq.zip

         FROM tbl_leads_queue tlq WITH (NOLOCK)

         WHERE tlq.lead_id = tlq.parent_lead_id

         AND tlq.NumSold = 0  

         AND tlq.is_available = 1

         AND tlq.Expired = 0

         AND tlq.has_agent = 0

         AND tlq.lead_type_id = 1 -- BUY

         AND ISNULL(tlq.buy_price_range_start, 0) >= @PriceRange

         AND tlq.time_frame_id IN (select element from master.dbo.split(@string,','))

         AND NOT EXISTS (SELECT lead_id FROM tbl_leads_queue WITH (NOLOCK)

           WHERE parent_lead_id = tlq.parent_lead_id

           AND (lead_status_id = 3 OR lead_status_id = 4))

         AND tlq.date_in >= @DateIn       

         AND (tlq.buy_zip_1 = @strZIP OR ISNULL(tlq.buy_zip_2, '') = @strZIP OR ISNULL(tlq.buy_zip_3, '') = @strZIP)  

         UNION

      

         SELECT tlq.lead_id, tlq.parent_lead_id, tlq.lead_type_id,

         tlq.buy_zip_1, ISNULL(tlq.buy_zip_2, '') AS buy_zip_2, ISNULL(tlq.buy_zip_3, '') AS buy_zip_3, tlq.zip

         FROM tbl_leads_queue tlq WITH (NOLOCK)

         WHERE tlq.lead_id = tlq.parent_lead_id

         AND tlq.NumSold = 0  

         AND tlq.is_available = 1

         AND tlq.Expired = 0

         AND tlq.has_agent = 0

         AND tlq.lead_type_id = 3 --BUY/SELL

         AND (

             ISNULL(tlq.buy_price_range_start, 0) >= @PriceRange

             OR

             ISNULL(tlq.sell_price_qualifying, 0) >= @PriceRange

             )

         AND tlq.time_frame_id IN (select element from master.dbo.split(@string,','))

         AND NOT EXISTS (SELECT lead_id FROM tbl_leads_queue WITH (NOLOCK)

           WHERE parent_lead_id = tlq.parent_lead_id

           AND (lead_status_id = 3 OR lead_status_id = 4))

         AND tlq.date_in >= @DateIn 

         AND (tlq.buy_zip_1 = @strZIP OR ISNULL(tlq.buy_zip_2, '') = @strZIP OR ISNULL(tlq.buy_zip_3, '') = @strZIP)

         ORDER BY tlq.lead_id, tlq.lead_type_id

      END

      

      ELSE IF(@LeadTypeID = 2)

      BEGIN

         SELECT tlq.lead_id, tlq.parent_lead_id, tlq.lead_type_id,

         tlq.buy_zip_1, ISNULL(tlq.buy_zip_2, '') AS buy_zip_2, ISNULL(tlq.buy_zip_3, '') AS buy_zip_3, tlq.zip

         FROM tbl_leads_queue tlq WITH (NOLOCK)

         WHERE tlq.lead_id = tlq.parent_lead_id

         AND tlq.NumSold = 0  

         AND tlq.is_available = 1

         AND tlq.Expired = 0

         AND tlq.has_agent = 0

         AND tlq.lead_type_id = 2 --SELL

         AND ISNULL(tlq.sell_price_qualifying, 0) >= @PriceRange

         AND tlq.time_frame_id IN (select element from master.dbo.split(@string,','))

         AND NOT EXISTS (SELECT lead_id FROM tbl_leads_queue WITH (NOLOCK)

           WHERE parent_lead_id = tlq.parent_lead_id

           AND (lead_status_id = 3 OR lead_status_id = 4))

         AND tlq.date_in >= @DateIn

         AND tlq.zip = @strZIP

                UNION

         SELECT tlq.lead_id, tlq.parent_lead_id, tlq.lead_type_id,

         tlq.buy_zip_1, ISNULL(tlq.buy_zip_2, '') AS buy_zip_2, ISNULL(tlq.buy_zip_3, '') AS buy_zip_3, tlq.zip

         FROM tbl_leads_queue tlq WITH (NOLOCK)

         WHERE tlq.lead_id = tlq.parent_lead_id

         AND tlq.NumSold = 0  

         AND tlq.is_available = 1

         AND tlq.Expired = 0

         AND tlq.has_agent = 0

         AND tlq.lead_type_id = 3 --BUY/SELL

         AND (

             ISNULL(tlq.buy_price_range_start, 0) >= @PriceRange

             OR

             ISNULL(tlq.sell_price_qualifying, 0) >= @PriceRange

             )

         AND tlq.time_frame_id IN (select element from master.dbo.split(@string,','))

         AND NOT EXISTS (SELECT lead_id FROM tbl_leads_queue WITH (NOLOCK)

           WHERE parent_lead_id = tlq.parent_lead_id

           AND (lead_status_id = 3 OR lead_status_id = 4))

         AND tlq.date_in >= @DateIn

         AND tlq.zip = @strZIP

         ORDER BY tlq.lead_id, tlq.lead_type_id 

     end

    end

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    SET ANSI_NULLS ON

    GO

    --------------------------------------------------------

     

    CREATE FUNCTION dbo.Split ( @vcDelimitedString   varchar(8000),

        @vcDelimiter   varchar(100) )

    /**************************************************************************

    DESCRIPTION: Accepts a delimited string and splits it at the specified

      delimiter points.  Returns the individual items as a table data

      type with the ElementID field as the array index and the Element

      field as the data

    PARAMETERS:

      @vcDelimitedString  - The string to be split

      @vcDelimiter   - String containing the delimiter where

           delimited string should be split

    RETURNS:

      Table data type containing array of strings that were split with

      the delimiters removed from the source string

    USAGE:

      SELECT ElementID, Element FROM Split('11111,22222,3333', ',') ORDER BY ElementID

    AUTHOR: Karen Gayda

    DATE:  05/31/2001

    MODIFICATION HISTORY:

     WHO  DATE  DESCRIPTION

     ---  ---------- ---------------------------------------------------

    ***************************************************************************/

    RETURNS @tblArray TABLE

       (

     ElementID smallint IDENTITY(1,1),  --Array index

        Element  varchar(1000)   --Array element contents

       )

    AS

    BEGIN

     DECLARE

     @siIndex     smallint,

     @siStart     smallint,

     @siDelSize     smallint

     SET @siDelSize = LEN(@vcDelimiter)

     --loop through source string and add elements to destination table array

     WHILE LEN(@vcDelimitedString) > 0

     BEGIN

      SET @siIndex = CHARINDEX(@vcDelimiter, @vcDelimitedString)

      IF @siIndex = 0

      BEGIN

       INSERT INTO @tblArray VALUES(@vcDelimitedString)

       BREAK

      END

      ELSE

      BEGIN

       INSERT INTO @tblArray VALUES(SUBSTRING(@vcDelimitedString, 1,@siIndex - 1))

       SET @siStart = @siIndex + @siDelSize

       SET @vcDelimitedString = SUBSTRING(@vcDelimitedString, @siStart , LEN(@vcDelimitedString) - @siStart + 1)

      END

     END

     

     RETURN

    END

    ---------------------------------------------------------------

     

     

  • You only need to use split if you are forced to accept comma-divided strings by (a)circumstances beyond your control (pre-existing data, legacy system), or (b) the fact that it is impossible to pass a table variable as input to a stored proc or function. (I would use XML for the latter though).

    In this case, there is no need to wrap your data up and then unwrap it in the query.

    Instead of

    if @LeadTimeFrame = 1       set @string='24'

    else if @LeadTimeFrame = 2  set @string='23,24'

    else if @LeadTimeFrame = 3  set @string='23,24,25'

    else if @LeadTimeFrame = 4  set @string='23,24,25,107'

    else if @LeadTimeFrame = 5  set @string='96'

    use a table variable:

    declare @args table(val int)
    if @LeadTimeFrame = 5 
    insert @args values(96)
    else
    begin
    if @LeadTimeFrame > 0 insert @args values(24)
    if @LeadTimeFrame > 1 insert @args values(23)
    if @LeadTimeFrame > 2 insert @args values(25)
    if @LeadTimeFrame > 3 insert @args values(107)
    end

     

    Tim Wilkinson

    "If it doesn't work in practice, you're using the wrong theory"
    - Immanuel Kant

  • >>AND (tlq.buy_zip_1 = @strZIP OR ISNULL(tlq.buy_zip_2, '') = @strZIP

    You may have a less than optimal execution plan due to "parameter sniffing" and use of the @strZIP variable. Searchthis site for info on "parameter sniffing".

    Use of ISNULL() (or other functions) in the WHERE may prevent use of indexes on the tlq.buy_zip_2 column.

    Use of UNION instead of UNION ALL may force a costly DISTINCT operation. If the 2 resultsets being UNION'ed together are mutually exclusive, use UNION ALL.

     

  • I've added the union all and just wanted to make sure I had the syntax right for the select

        AND tlq.time_frame_id IN (select val from @args)

    Thanks for your help!!

    alter   PROCEDURE dbo.usp_GetRNLeadsCapOptimization 2,1,1, 0, 1, 94546

     @LookBackDays   INT,

     @LeadTimeFrame   INT,

     @PropertyValue   INT,

     @WorkWithRealtor TINYINT,

     @LeadTypeID  TINYINT,

     @strZIP   VARCHAR(5)

     

    AS

     DECLARE @PriceRange MONEY,  

      @DateIn     DATETIME,

      @string     varchar(200)

    BEGIN

        

       --SET  @DateIn = CAST(CONVERT(VARCHAR(10), DATEADD(d, -@LookBackDays, CURRENT_TIMESTAMP), 101) AS DATETIME) + CAST('12:00:00AM' AS DATETIME)

       --above line commented and substituted for devdb1 testing

       SET @DateIn = CAST(CONVERT(VARCHAR(10), DATEADD(d, -@LookBackDays, '03/01/2006'), 101) AS DATETIME) + CAST('12:00:00AM' AS DATETIME)

       SET @PriceRange =

     (CASE WHEN @PropertyValue BETWEEN 1 AND 10

      THEN @PropertyValue * 100000

      WHEN @PropertyValue BETWEEN 11 AND 18

      THEN (@PropertyValue -10) * 1000000 + 1000000

                            ELSE 10000000 end)

     

    declare @args table(val int)

    if @LeadTimeFrame = 5 

    insert @args values(96)

    else

    begin

    if @LeadTimeFrame > 0 insert @args values(24)

    if @LeadTimeFrame > 1 insert @args values(23)

    if @LeadTimeFrame > 2 insert @args values(25)

    if @LeadTimeFrame > 3 insert @args values(107)

    end

       IF (@WorkWithRealtor = 0)

       BEGIN

      IF (@LeadTypeID = 1)

         SELECT tlq.lead_id, tlq.parent_lead_id, tlq.lead_type_id,

         tlq.buy_zip_1, ISNULL(tlq.buy_zip_2, '') AS buy_zip_2, ISNULL(tlq.buy_zip_3, '') AS buy_zip_3, tlq.zip

         FROM tbl_leads_queue tlq WITH (NOLOCK)

         WHERE tlq.lead_id = tlq.parent_lead_id

         AND tlq.NumSold = 0  

         AND tlq.is_available = 1

         AND tlq.Expired = 0

         AND tlq.has_agent = 0

         AND tlq.lead_type_id = 1 -- BUY

         AND ISNULL(tlq.buy_price_range_start, 0) >= @PriceRange

         AND tlq.time_frame_id IN (select val from @args)

         AND NOT EXISTS (SELECT lead_id FROM tbl_leads_queue WITH (NOLOCK)

           WHERE parent_lead_id = tlq.parent_lead_id

           AND (lead_status_id = 3 OR lead_status_id = 4))

         AND tlq.date_in >= @DateIn       

         AND (tlq.buy_zip_1 = @strZIP OR ISNULL(tlq.buy_zip_2, '') = @strZIP OR ISNULL(tlq.buy_zip_3, '') = @strZIP)  

         UNION

      

         SELECT tlq.lead_id, tlq.parent_lead_id, tlq.lead_type_id,

         tlq.buy_zip_1, ISNULL(tlq.buy_zip_2, '') AS buy_zip_2, ISNULL(tlq.buy_zip_3, '') AS buy_zip_3, tlq.zip

         FROM tbl_leads_queue tlq WITH (NOLOCK)

         WHERE tlq.lead_id = tlq.parent_lead_id

         AND tlq.NumSold = 0  

         AND tlq.is_available = 1

         AND tlq.Expired = 0

         AND tlq.has_agent = 0

         AND tlq.lead_type_id = 3 --BUY/SELL

         AND (

             ISNULL(tlq.buy_price_range_start, 0) >= @PriceRange

             OR

             ISNULL(tlq.sell_price_qualifying, 0) >= @PriceRange

             )

         AND tlq.time_frame_id IN (select val from @args)

         AND NOT EXISTS (SELECT lead_id FROM tbl_leads_queue WITH (NOLOCK)

           WHERE parent_lead_id = tlq.parent_lead_id

           AND (lead_status_id = 3 OR lead_status_id = 4))

         AND tlq.date_in >= @DateIn 

         AND (tlq.buy_zip_1 = @strZIP OR ISNULL(tlq.buy_zip_2, '') = @strZIP OR ISNULL(tlq.buy_zip_3, '') = @strZIP)

         ORDER BY tlq.lead_id, tlq.lead_type_id

      END

      

      ELSE IF(@LeadTypeID = 2)

      BEGIN

         SELECT tlq.lead_id, tlq.parent_lead_id, tlq.lead_type_id,

         tlq.buy_zip_1, ISNULL(tlq.buy_zip_2, '') AS buy_zip_2, ISNULL(tlq.buy_zip_3, '') AS buy_zip_3, tlq.zip

         FROM tbl_leads_queue tlq WITH (NOLOCK)

         WHERE tlq.lead_id = tlq.parent_lead_id

         AND tlq.NumSold = 0  

         AND tlq.is_available = 1

         AND tlq.Expired = 0

         AND tlq.has_agent = 0

         AND tlq.lead_type_id = 2 --SELL

         AND ISNULL(tlq.sell_price_qualifying, 0) >= @PriceRange

         AND tlq.time_frame_id IN (select val from @args)

         AND NOT EXISTS (SELECT lead_id FROM tbl_leads_queue WITH (NOLOCK)

           WHERE parent_lead_id = tlq.parent_lead_id

           AND (lead_status_id = 3 OR lead_status_id = 4))

         AND tlq.date_in >= @DateIn

         AND tlq.zip = @strZIP

                UNION all

         SELECT tlq.lead_id, tlq.parent_lead_id, tlq.lead_type_id,

         tlq.buy_zip_1, ISNULL(tlq.buy_zip_2, '') AS buy_zip_2, ISNULL(tlq.buy_zip_3, '') AS buy_zip_3, tlq.zip

         FROM tbl_leads_queue tlq WITH (NOLOCK)

         WHERE tlq.lead_id = tlq.parent_lead_id

         AND tlq.NumSold = 0  

         AND tlq.is_available = 1

         AND tlq.Expired = 0

         AND tlq.has_agent = 0

         AND tlq.lead_type_id = 3 --BUY/SELL

         AND (

             ISNULL(tlq.buy_price_range_start, 0) >= @PriceRange

             OR

             ISNULL(tlq.sell_price_qualifying, 0) >= @PriceRange

             )

         AND tlq.time_frame_id IN (select val from @args)

         AND NOT EXISTS (SELECT lead_id FROM tbl_leads_queue WITH (NOLOCK)

           WHERE parent_lead_id = tlq.parent_lead_id

           AND (lead_status_id = 3 OR lead_status_id = 4))

         AND tlq.date_in >= @DateIn

         AND tlq.zip = @strZIP

         ORDER BY tlq.lead_id, tlq.lead_type_id 

     end

    end

    GO

     

     

  • Here's how I would change the first SELECT. As mentioned above, it would help if the zip columns could be compared without wrapping them in isnull(). What exactly are the rules if NULL or empty string is passed in as value of @strZIP, or if they exist in the table columns?

    SELECT

    tlq.lead_id, tlq.parent_lead_id, tlq.lead_type_id,

    tlq.buy_zip_1, ISNULL(tlq.buy_zip_2, '') AS buy_zip_2,

    ISNULL

    (tlq.buy_zip_3, '') AS buy_zip_3, tlq.zip

    FROM

    tbl_leads_queue tlq WITH (NOLOCK) --nolock is not normally a good idea...

    JOIN

    @args a --use a JOIN instead of IN - not much diff. with a tiny recordset, but good practice.

    ON

    a.val = tlq.time_frame_id

    LEFT

    JOIN tbl_leads_queue tlq2 WITH (NOLOCK) --instead of NOT EXISTS, use LEFT JOIN with IS NULL in WHERE clause

    ON

    tlq2.parent_lead_id = tlq.parent_lead_id

    AND

    tlq2.lead_status_id IN (3,4) --use IN instead of OR - primarily for readability

    WHERE

    tlq.lead_id = tlq.parent_lead_id

    AND

    tlq.NumSold = 0

    AND

    tlq.is_available = 1

    AND

    tlq.Expired = 0

    AND

    tlq.has_agent = 0

    AND

    tlq.lead_type_id = 1

    AND

    ISNULL(tlq.buy_price_range_start, 0) >= @PriceRange

    AND

    tlq.time_frame_id

    AND

    tlq.date_in >= @DateIn

    AND

    (tlq.buy_zip_1 = @strZIP OR ISNULL(tlq.buy_zip_2, '') = @strZIP OR ISNULL(tlq.buy_zip_3, '') = @strZIP)

    AND tlq2.parent_lead_id IS NULL --instead of NOT EXISTS, use LEFT JOIN with IS NULL in WHERE clause

    Tim Wilkinson

    "If it doesn't work in practice, you're using the wrong theory"
    - Immanuel Kant

  • I'm not sure why you need the performance hit of the split function...

    For example, here's an example that splits a whole table at once (this one splits on periods which you can easily change)...

    --===== IF test table exists, drop it
         IF OBJECT_ID('dbo.JbmParseTest') IS NOT NULL
            DROP TABLE dbo.JbmParseTest
    --===== Create the test table and populate it
     CREATE TABLE dbo.JbmParseTest (RowNum INT IDENTITY(1,1), MyString VARCHAR(200))
     INSERT INTO dbo.JbmParseTest (MyString)
     SELECT 'CEO.New Address Validated.Location: \\Main\e:\NewAddressValue.Last Update Jeff Morgan' UNION ALL
     SELECT 'CSR_HelpDesk.Starttime: 9:30AM Midshift.Changed Supervisor HR request.New Sup Anna Holland.New Shift: 7:30 AM'
     SELECT j.RowNum,
            LTRIM(SUBSTRING(j.MyString, t.N+1, CHARINDEX('.', j.MyString, t.N + 1) - t.N - 1)) AS ParsedString
       FROM Tally t
      INNER JOIN 
            (SELECT RowNum,'.'+MyString+'.' AS MyString FROM dbo.JbmParseTest) j 
         ON SUBSTRING(j.MyString, t.N, 1) = '.' 
        AND t.N < LEN(j.MyString) 

    Notice that it returns the Primary Key (not defined in this example but is supposed to be the RowNum in this case).  You can use a similar query as a derived table and probably improve performance a bit (as well as following some of the other folks' hints like not using a formula on column in the WHERE clause).

    If you don't have a Tally table, here's how to make one... note that the cross join is intentional...

    --===== Create and populate the Tally table on the fly
    SELECT TOP 9999
           IDENTITY(INT,1,1) AS N
      INTO dbo.Tally
      FROM Master.dbo.SysColumns sc1,
           Master.dbo.SysColumns sc2
    --===== Add a Primary Key to maximize performance
      ALTER TABLE dbo.Tally
            ADD CONSTRAINT PK_Tally_N PRIMARY KEY CLUSTERED (N)
    --===== Allow the general public to use it
      GRANT SELECT ON dbo.Tally TO PUBLIC

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • You definitely don't need any sort of split function.

    The code you posted is fine, with the (optional) amendments I suggested above.

    Getting rid of the ISNULLs would definitely help. Locigally speaking, this would do it:

    AND (
    (@strZIP = '' AND (tlq.buy_zip_2 IS NULL OR tlq.buy_zip_2 IS NULL))
    OR tlq.buy_zip_1 = @strZIP
    OR tlq.buy_zip_2 = @strZIP
    OR tlq.buy_zip_3 = @strZIP
    )

    the complexity of the expression might cuase the optimiser some problems, but at least it could probably use and index on the ZIP codes.

    A couple of issues:

    Unless the three ZIP codes have distinct and clearly defined purposes, i.e. swapping values between them would give clearly incorrect data, having three ZIP code columns is a violation of 1st normal form. From a table design point of view, it would be better to hive these off into single column of a separate table, with a foreign key to the parent record, and possibly a 'ZIP code type' e.g. (main, other) to represent its relationship to the parent record. Obviously a wide range of considerations (dependencies, query efficiency,...) might militate against doing so even if it would be prefereable from a DB design point of view. It's always worth being aware of such denormalisation though, so that you are clear where design compromises exist in the schema and why.

    The expanded version of your query predicate shows that you want to return records where the zip entered matches one of the three ZIP codes, or where the zip entered is '', and one of zip 2 and zip3 IS NULL. Is this right? Revisting the requirements might enable you to simplify this clause.

     

    Tim Wilkinson

    "If it doesn't work in practice, you're using the wrong theory"
    - Immanuel Kant

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply