How to query for number of contiguous historic days where widget count = 0

  • I tried asking this question in another forum but I'm not getting much response.

    I need to develop a query that returns the number of days a client has used zero widgets. If Client X didn't use a widget today, didn't use a widget yesterday, but used one the day before, today's return value for that client's row should be 2: one for today and one for yesterday. If we don't know if the user used a widget yesterday, than the return value should be 1. I want to avoid creating a stored procedure if at all possible, but I can't figure out how to write the query. Any help would be greatly appreciated.

    Yes, I do have a Calendar table and a Customer table that I could join to, in case it would simplify the task.

    Exit question: Would this type of query be easier to create using MDX and Analysis Services? It's certainly an option.

    Thanks,

    Eric

    Here's my test data:

    DECLARE

    @t TABLE (

    EntryDate DATE NOT NULL

    , ClientID INT NOT NULL

    , WidgetCount INT NOT

    NULL)

    INSERT

    INTO @t (EntryDate, ClientID, WidgetCount) VALUES

    ('6/1/2010', 1, 3)

    , ('6/2/2010', 1, 0)

    , ('6/3/2010', 1, 0)

    , ('6/4/2010', 1, 2)

    , ('6/5/2010', 1, 0)

    , ('6/6/2010', 1, 0)

    , ('6/7/2010', 1, 0)

    , ('6/9/2010', 1, 0) --Notice a day is missing, therefore should be treated as a non-zero value.

    Here's my expected query results:

    SELECT

    1 AS ClientID, '2010-06-01' AS EntryDate, 0 AS DaysOfZeroWidgets UNION ALL

    SELECT 1, '2010-06-02', 1 UNION ALL

    SELECT 1, '2010-06-03', 2 UNION ALL

    SELECT 1, '2010-06-04', 0 UNION ALL

    SELECT 1, '2010-06-05', 1 UNION ALL

    SELECT 1, '2010-06-06', 2 UNION ALL

    SELECT 1, '2010-06-07', 3 UNION ALL

    SELECT 1, '2010-06-08', 0 UNION ALL

    SELECT 1, '2010-06-09', 1

  • This isn't the *prettiest* code, but it functions. I modified the source data to include two clientID's to prepare the code for that. Interesting challenge. I'm hoping someone can come by and clean up the massive subselect I'm doing to locate the islanding. I'm triangle joining all over the place here, though the Cross Apply does help some.

    Comments are in the code, what few there are. Let me know if there's any confusion on any point in here. You can probably skip the rownumber component and go by date instead of rownumber, but I left the first piece of this in there in case someone wants to try to get fancy with a row_number() - row_number() style type of grouping attempt. I tried a few different ideas I had (as you can tell by the commented entry in cte2), but couldn't really nail it down.

    DECLARE @t TABLE (

    EntryDate DATETIME NOT NULL --Sorry, I'm in 2k5

    , ClientID INT NOT NULL

    , WidgetCount INT NOT

    NULL)

    INSERT INTO @t (EntryDate, ClientID, WidgetCount)

    SELECT '6/1/2010', 1, 3 UNION ALL

    SELECT '6/2/2010', 1, 0 UNION ALL

    SELECT '6/3/2010', 1, 0 UNION ALL

    SELECT '6/4/2010', 1, 2 UNION ALL

    SELECT '6/5/2010', 1, 0 UNION ALL

    SELECT '6/6/2010', 1, 0 UNION ALL

    SELECT '6/7/2010', 1, 0 UNION ALL

    SELECT '6/9/2010', 1, 0 UNION ALL

    SELECT '6/1/2010', 2, 3 UNION ALL

    SELECT '6/2/2010', 2, 0 UNION ALL

    SELECT '6/3/2010', 2, 0 UNION ALL

    SELECT '6/4/2010', 2, 2 UNION ALL

    SELECT '6/5/2010', 2, 0 UNION ALL

    SELECT '6/6/2010', 2, 0 UNION ALL

    SELECT '6/7/2010', 2, 0 UNION ALL

    SELECT '6/9/2010', 2, 0 --Notice a day is missing, therefore should be treated as a non-zero value.

    DECLARE @Cal TABLE (CalendarDate DATETIME NOT NULL)

    INSERT INTO @Cal

    SELECT '6/1/2010' UNION ALL

    SELECT '6/2/2010' UNION ALL

    SELECT '6/3/2010' UNION ALL

    SELECT '6/4/2010' UNION ALL

    SELECT '6/5/2010' UNION ALL

    SELECT '6/6/2010' UNION ALL

    SELECT '6/7/2010' UNION ALL

    SELECT '6/8/2010' UNION ALL --Note inclusion in the Calendar Table

    SELECT '6/9/2010'

    ;WITH cte AS

    (SELECT

    c.CalendarDate AS cd,

    -- This will fill in missing entries

    COALESCE( t.clientID, cli.clientID ) AS ClientID,

    COALESCE( t.WidgetCount, -1) AS WidgetCount -- -1 to be both non-zero and obvious that it's a placeholder value.

    from

    (SELECT DISTINCT ClientID FROM @t) AS cli

    CROSS JOIN

    @Cal AS c

    LEFT JOIN

    @t AS t

    ONc.CalendarDate = t.EntryDate

    AND cli.ClientID = t.ClientID

    )

    -- Now that the data's cleaned up, we need a way to find islands.

    ,cte2 AS

    (SELECT

    cd,

    ClientID,

    WidgetCount,

    ROW_NUMBER() OVER (PARTITION BY ClientID ORDER BY cd) AS rn

    --,ROW_NUMBER() OVER (PARTITION BY ClientID ORDER BY cd) - CASE WHEN WidgetCount = 0 THEN ROW_NUMBER() OVER (ORDER BY ClientID, cd) ELSE 0 END AS rn2

    FROM

    cte

    )

    SELECT

    cd,

    ClientID,

    WidgetCount,

    CASE WHEN WidgetCount <> 0 THEN 0 ELSE drv2.NumDays END AS NumDaysNotUsingWidget

    FROM

    cte2 AS c2

    CROSS APPLY

    (SELECT

    COUNT(*) AS NumDays

    FROM

    Cte2 AS drv

    WHERE

    drv.rn <= c2.rn

    AND drv.ClientID = c2.ClientID

    AND drv.rn > (SELECT MAX(rn) FROM cte2 AS i1 WHERE i1.rn < c2.rn AND WidgetCount <> 0 AND i1.ClientID = c2.ClientID)

    ) AS drv2


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Awesome, Craig, thank you very much!!! You are the only person who seems to have understood my question correctly. I'll now pick it apart so I can understand just exactly how it works. I sincerely appreciate your help.

  • freedom_nut (3/30/2011)


    Awesome, Craig, thank you very much!!! You are the only person who seems to have understood my question correctly. I'll now pick it apart so I can understand just exactly how it works. I sincerely appreciate your help.

    No problem, your description was rather concise. I don't believe it's the understanding part that hooked a lot of folks, but it's a difficult algorithm for some until you've dealt with using correlated subqueries to help control aggregation logic. It's also unusual, so some folks may have been concerned you had a misunderstanding of the original requirements.

    That said, please be aware this is NOT optimal code. It's rather chewy and I wouldn't want to run this repetitively throughout a day. I would recommend turning this into an overnight batch job to save the counts in your table, in which case an easier solution could be created to only update the null entries, possibly with a while loop in case you get a few days in a row for a customer that are nulled, simply looking back at the previous row and working off that count.

    There is probably a very good set based solution to be had, but I'm afraid I can't come up with it easily.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply