Recursive CTE? Not Sure How To Do This...

  • As promised, Auto, I'm back. Bit later than I'd hoped, but as usual I'm on the run again. Glad to see you were able to research my recommendation. I still stand by it, and would like to know why you feel you would prefer to avoid its implementation.

    Some comments in the code, but here's the full Quirky Update/Serial Update code with all the security traps you'll need. You'll find if you expand the data set significantly, you should get significant performance differences.

    /* Data Setup

    USE [tempdb]

    IF object_id('#Sales') IS NOT NULL

    DROP TABLE #Sales

    CREATE TABLE #Sales

    (

    SaleID INT IDENTITY(1,1) NOT NULL,

    CustID INT NOT NULL,

    SalesRepID INT NOT NULL,

    SaleDate datetime NOT NULL

    )

    -- Create a variable to hold the number of days

    -- after an initial Eligible sale

    -- for which subsequent sales to a given customer

    -- are denied commission

    DECLARE @LockDays TINYINT

    SET @LockDays = 1

    -- Insert a bunch of hypothetical sales for CustID 100

    INSERT INTO #Sales (CustID, SalesRepID, SaleDate)

    SELECT 100, 1, '2014-05-01 12:00:00'-- Sale #1: Eligible

    UNION ALL

    SELECT 100, 1, '2014-05-01 15:00:00'-- Sale #2: Ineligible, date is within lock period of Sale #1

    UNION ALL

    SELECT 100, 2, '2014-05-02 14:00:00'-- Sale #3: Eligible, date is after lock period of Sale #1

    UNION ALL

    SELECT 100, 3, '2014-05-02 11:00:00'-- Sale #4: Ineligible, date is within lock period of Sale #3

    UNION ALL

    SELECT 100, 1, '2014-05-02 11:01:00'-- Sale #5: Ineligible, date is within lock period of Sale #3

    UNION ALL

    SELECT 100, 1, '2014-05-02 11:02:00'-- Sale #6: Ineligible, date is within lock period of Sale #3

    UNION ALL

    SELECT 100, 1, '2014-05-03 11:03:00'-- Sale #7: Ineligible, date is within lock period of Sale #3

    UNION ALL

    SELECT 100, 1, '2014-05-03 14:00:00'-- Sale #8: Eligible, date is after lock period of Sale #3

    UNION ALL

    SELECT 100, 1, '2014-05-03 19:00:00'-- Sale #9: Ineligible, date is within lock period of Sale #8

    UNION ALL

    SELECT 100, 1, '2014-05-04 13:00:00'-- Sale #10: Ineligible, date is within lock period of Sale #8

    UNION ALL

    SELECT 100, 1, '2014-05-05 13:00:00'-- Sale #11: Eligible, date is after lock period of Sale #8

    UNION ALL

    SELECT 100, 1, '2014-05-06 13:00:00'-- Sale #12: Eligible, date is after lock period of Sale #11

    UNION ALL

    SELECT 100, 1, '2014-05-07 13:00:00'-- Sale #13: Eligible, date is after lock period of Sale #12

    UNION ALL

    SELECT 100, 1, '2014-05-08 13:00:00'-- Sale #14: Eligible, date is after lock period of Sale #13

    SELECT

    CustID,

    SalesRepID,

    SaleDate,

    'Eligibility Here' AS eligibility

    FROM #Sales

    */

    -- I always prefer to work with a #table as my source data can come from one

    --or many sources, and may in the future change. Forcing the #table usage

    --(within reason) cleans up many future schema change concerns.

    IF OBJECT_ID('tempdb..#QU') IS NOT NULL

    DROP TABLE #QU

    CREATE TABLE #QU

    (CustIDINTNOT NULL,

    SalesRepIDINTNOT NULL,

    SaleDateDATETIMENOT NULL,

    EligibleIndicator CHAR(1) NULL)

    CREATE CLUSTERED INDEX idx_qu ON #QU (CustID, SalesRepID, SaleDate)

    INSERT INTO #QU (CustID, SalesRepID, SaleDate)

    SELECT CustID, SalesRepID, SaleDate

    FROM#Sales

    DECLARE @CustIDINT,

    @SalesRepIDINT,

    @SaleDateDATETIME,

    @EligibleIndicatorCHAR(1),

    @LastEligibleSaleDateDATETIME

    SET @LastEligibleSaleDate = '19000101'

    SELECT * FROM #QU

    ;WITH SafeTable

    AS (SELECT

    Sequence = ROW_NUMBER() OVER ( ORDER BY CustID, SalesRepID, SaleDate),

    CustID,

    SalesRepID,

    SaleDate,

    EligibleIndicator

    FROM #QU

    )

    UPDATE SafeTable

    SET

    @EligibleIndicator = EligibleIndicator = CASE WHEN @CustID = CustID

    AND @SalesRepID = SalesRepID

    AND SaleDate < DATEADD( dd, 1, @LastEligibleSaleDate)

    -- Same customer, same sales rep, and date isn't a day forward

    THEN 'N'

    ELSE 'Y'

    END,

    @CustID = CustID,

    @SalesRepID = SalesRepID,

    @LastEligibleSaleDate = CASE WHEN @EligibleIndicator = 'Y' THEN SaleDate ELSE @LastEligibleSaleDate END

    SELECT * FROM #QU

    -- Now reconnect back to the source and update the column in question.

    Luck!

    EDIT: Dangit, it needs the AutomationSequence included and I forgot (until 5 minutes after I posted, of course...), but I need to run for a code roll. Check the post in the article for Jeff's article and you'll see a last check to be included in the QU code, or hopefully someone can do a quick edit for me. Sorry, just on the run tonight.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • I was just about to post an update when I saw your reply, Craig!

    My update for the day, coincidentally, was that I actually spent a good deal of time today setting up a QU version of my code, and it ran about a gajillion times faster than the rCTE version.

    I'm currently just a contract hire, and my marching orders are to write heavily-commented vanilla T-SQL that "any T-SQL developer" can easily modify or understand once I'm gone in a month or two. Hence my hesitation to use QU. But after seeing the speed for myself, I have to see if I can talk my boss into giving QU a try. And I'll leave him the rCTE code too in case he wants it later.

    Oh, and btw, turns out I don't have r/w permissions for tempdb on this system. And since I don't think the clustered index/QU works with a table variable, I'm going to just create a normal table to do the processing. But I will be sure to drop and create it each time to prevent anyone from screwing up its QU-specific design.

    So anyway, I will look over your code in detail on Friday and post another update then. 🙂 This has been one of the most fascinating SQL projects I've ever worked on thanks to you guys!


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

  • my take on the QU process.....virtually same as Craig's but I think I have added in all the safety checks required ....

    uses a million row test table.

    all credit should be directed to Jeff Moden for the following code....quirky update and the random data generator

    USE [tempdb]

    /*create a million row test table*/

    IF object_id('Sales') IS NOT NULL DROP TABLE Sales

    DECLARE @NumberOfRows INT

    , @StartDate DATETIME

    , @EndDate DATETIME

    , @Days INT

    SELECT @NumberOfRows = 1000000

    , @StartDate = '2013' /*Inc*/

    , @EndDate = '2014' /*Exc*/

    , @Days = DATEDIFF(dd, @StartDate, @EndDate)

    SELECT TOP (@NumberOfRows)

    SaleID = IDENTITY(INT, 1, 1)

    , CustID = 1 + CAST(Abs(Checksum(Newid()) % 900) AS INT)

    , SalesRepID = 1 + CAST(Abs(Checksum(Newid()) % 9) AS INT)

    /*, Saledate = RAND(CHECKSUM(NEWID())) * @Days + @StartDate /* dd/mm/yyyy hh:mi:ss.mmm full date*/*/

    /*, SaleDate = ABS(CHECKSUM(NEWID())) % @Days + @StartDate /* dd/mm/yyyy whole date only*/*/

    , SaleDate = DATEADD(mi, DATEDIFF(mi, 0, (RAND(CHECKSUM(NEWID())) * @Days + @StartDate)), 0) /* dd/mm/yyyy hh:min*/

    , eligibility = 0

    INTO Sales

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2

    CROSS JOIN sys.all_columns ac3

    /*if original table does not have correct Clustered index for QU to run then insert required columns to a temp table

    this can then be updated with QU and either queried or used as source to update the original table*/

    IF OBJECT_ID('#tempsales') IS NOT NULL DROP TABLE #tempsales

    SELECT CustID

    , SalesRepID

    , SaleDate

    , eligibility

    INTO #tempsales

    FROM Sales

    CREATE CLUSTERED INDEX cix_jls ON #tempsales (CustID, SalesRepID, SaleDate) /*required for QU to run correctly*/

    /*the quirky update code */

    DECLARE @CustID INT

    , @SalesRepID INT

    , @SaleDate DATETIME

    , @eligibility INT

    , @eligibilityDate DATETIME

    , @SafetyCounter BIGINT

    , @LockDays TINYINT

    SET @eligibilityDate = (select min(saledate) from #tempsales)

    SET @LockDays = 1

    SET @SafetyCounter = 1

    ;WITH cte_tempsales /*Adds a "safety counter" to the rows in the expected processing order which is used as a check in QU*/

    AS (

    SELECT SafetyCounter = ROW_NUMBER() OVER (ORDER BY CustID, SalesRepID, SaleDate)

    , CustID

    , SalesRepID

    , SaleDate

    , eligibility

    FROM #tempsales

    )

    UPDATE cte_tempsales

    SET @eligibility = eligibility =

    CASE /*This CASE does the safety check*/

    WHEN SafetyCounter = @SafetyCounter /*Checks sequence of processing*/

    THEN CASE

    WHEN @CustID = CustID

    AND @SalesRepID = SalesRepID

    AND SaleDate < DATEADD(dd, @LockDays, @eligibilityDate)

    THEN 0 /* ineligible*/

    ELSE 1 /* eligible*/

    END

    ELSE 1/0 /*- force error if QU out of sync*/

    END

    , @SafetyCounter = @SafetyCounter + 1

    , @CustID = CustID

    , @SalesRepID = SalesRepID

    , @eligibilityDate =

    CASE

    WHEN @eligibility = 1

    THEN SaleDate

    ELSE @eligibilityDate

    END

    FROM cte_tempsales WITH (TABLOCKX) /* required for QU*/

    OPTION (MAXDOP 1)/* required for QU*/

    /*results*/

    SELECT top 10000 *

    FROM #tempsales

    order by custid,SalesRepID

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Thanks for the extra work, J Livingston! I didn't think of updating a CTE rather than the temp table itself. As long as it follows the same rules vis a vis the clustered index, that seems like a handy improvement.

    And thanks for filling in the safety checks. I was just in the process of writing my own, so I will compare against what you've done here to make sure I'm covered. You rock.

    Update: Everything is working great, running super-duper fast, and all the safety checks and error-handling are working as expected! Woo hoo! 🙂


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

  • Glad to hear you're good to go auto.

    Thanks for the cover J Livingston. Can't believe I forgot to put in the extra security wrapper... d'oh! :hehe:


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Update:

    J Livingston's version of Craig's version of Jeff's QU is basically the solution I am going with. Thanks again to J and also to everyone else involved in getting us to that point.

    For anyone else following along, I just thought I'd mention however that I omitted SalesRepID from my index and from my case statements on eligibility. My index and row_number, etc. were done on CustID, SaleDate.

    Also, for whatever reason, I wasn't able to get it all working with the 3-part assignment syntax. Maybe my fault. But it's all working great with the more verbose 2-part assignments.

    😀


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

  • autoexcrement (5/24/2014)


    Also, for whatever reason, I wasn't able to get it all working with the 3-part assignment syntax. Maybe my fault. But it's all working great with the more verbose 2-part assignments.

    😀

    wondering why you couldn't get it to work with 3-part?

    ...care to share your code?

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Normally I would be sharing all my code. But this particular time it would not be appropriate to share. 🙁 I had to abstract the problem and the code a lot to even post the original question.

    I will give the 3-part assignment another go next week. I bet it is just something screwy that I did. But the main message is: QU works, and it works gloriously. 🙂


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

  • autoexcrement (5/24/2014)


    For anyone else following along, I just thought I'd mention however that I omitted SalesRepID from my index and from my case statements on eligibility. My index and row_number, etc. were done on CustID, SaleDate.

    😀

    I'm not sure which index is "my index" above and, apologies, I don't have the time to go back through this post but I want to make sure that you understand that if SalesRepID is one of the columns that you're relying on for the proper order of the QU, it needs to be included in the clustered index that the QU uses.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Yessir, all is well! 🙂 Thank you for checking back in!


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

  • Hi auto....code without the SalesRepId.......based on previous code provided....still using the temp table.

    IF OBJECT_ID('#tempsales') IS NOT NULL DROP TABLE #tempsales

    SELECT CustID

    , SalesRepID

    , SaleDate

    , eligibility

    INTO #tempsales

    FROM Sales

    CREATE CLUSTERED INDEX cix_jls ON #tempsales

    (CustID,

    /*SalesRepID, */

    SaleDate) /*required for QU to run correctly*/

    /*the quirky update code */

    DECLARE @CustID INT

    , @SalesRepID INT

    , @SaleDate DATETIME

    , @eligibility INT

    , @eligibilityDate DATETIME

    , @SafetyCounter BIGINT

    , @LockDays TINYINT

    SET @eligibilityDate = (select min(saledate) from #tempsales)

    SET @LockDays = 1

    SET @SafetyCounter = 1

    ;WITH cte_tempsales /*Adds a "safety counter" to the rows in the expected processing order which is used as a check in QU*/

    AS (

    SELECT SafetyCounter = ROW_NUMBER() OVER

    (ORDER BY

    CustID,

    /*SalesRepID, */

    SaleDate)

    , CustID

    /*, SalesRepID*/

    , SaleDate

    , eligibility

    FROM #tempsales

    )

    UPDATE cte_tempsales

    SET @eligibility = eligibility =

    CASE /*This CASE does the safety check*/

    WHEN SafetyCounter = @SafetyCounter /*Checks sequence of processing*/

    THEN CASE

    WHEN @CustID = CustID

    /*AND @SalesRepID = SalesRepID */

    AND SaleDate < DATEADD(dd, @LockDays, @eligibilityDate)

    THEN 0 /* ineligible*/

    ELSE 1 /* eligible*/

    END

    ELSE 1/0 /*- force error if QU out of sync*/

    END

    , @SafetyCounter = @SafetyCounter + 1

    , @CustID = CustID

    /*, @SalesRepID = SalesRepID*/

    , @eligibilityDate =

    CASE

    WHEN @eligibility = 1

    THEN SaleDate

    ELSE @eligibilityDate

    END

    FROM cte_tempsales WITH (TABLOCKX) /* required for QU*/

    OPTION (MAXDOP 1)/* required for QU*/

    /*results*/

    SELECT top 10000 *

    FROM #tempsales

    order by custid,SaleDate

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

Viewing 11 posts - 16 through 25 (of 25 total)

You must be logged in to reply to this topic. Login to reply