March 29, 2011 at 4:14 pm
I tried asking this question in another forum but I'm not getting much response.
I need to develop a query that returns the number of days a client has used zero widgets. If Client X didn't use a widget today, didn't use a widget yesterday, but used one the day before, today's return value for that client's row should be 2: one for today and one for yesterday. If we don't know if the user used a widget yesterday, than the return value should be 1. I want to avoid creating a stored procedure if at all possible, but I can't figure out how to write the query. Any help would be greatly appreciated.
Yes, I do have a Calendar table and a Customer table that I could join to, in case it would simplify the task.
Exit question: Would this type of query be easier to create using MDX and Analysis Services? It's certainly an option.
Thanks,
Eric
Here's my test data:
DECLARE
@t TABLE (
EntryDate DATE NOT NULL
, ClientID INT NOT NULL
, WidgetCount INT NOT
NULL)
INSERT
INTO @t (EntryDate, ClientID, WidgetCount) VALUES
('6/1/2010', 1, 3)
, ('6/2/2010', 1, 0)
, ('6/3/2010', 1, 0)
, ('6/4/2010', 1, 2)
, ('6/5/2010', 1, 0)
, ('6/6/2010', 1, 0)
, ('6/7/2010', 1, 0)
, ('6/9/2010', 1, 0) --Notice a day is missing, therefore should be treated as a non-zero value.
Here's my expected query results:
SELECT
1 AS ClientID, '2010-06-01' AS EntryDate, 0 AS DaysOfZeroWidgets UNION ALL
SELECT 1, '2010-06-02', 1 UNION ALL
SELECT 1, '2010-06-03', 2 UNION ALL
SELECT 1, '2010-06-04', 0 UNION ALL
SELECT 1, '2010-06-05', 1 UNION ALL
SELECT 1, '2010-06-06', 2 UNION ALL
SELECT 1, '2010-06-07', 3 UNION ALL
SELECT 1, '2010-06-08', 0 UNION ALL
SELECT 1, '2010-06-09', 1
March 29, 2011 at 4:50 pm
This isn't the *prettiest* code, but it functions. I modified the source data to include two clientID's to prepare the code for that. Interesting challenge. I'm hoping someone can come by and clean up the massive subselect I'm doing to locate the islanding. I'm triangle joining all over the place here, though the Cross Apply does help some.
Comments are in the code, what few there are. Let me know if there's any confusion on any point in here. You can probably skip the rownumber component and go by date instead of rownumber, but I left the first piece of this in there in case someone wants to try to get fancy with a row_number() - row_number() style type of grouping attempt. I tried a few different ideas I had (as you can tell by the commented entry in cte2), but couldn't really nail it down.
DECLARE @t TABLE (
EntryDate DATETIME NOT NULL --Sorry, I'm in 2k5
, ClientID INT NOT NULL
, WidgetCount INT NOT
NULL)
INSERT INTO @t (EntryDate, ClientID, WidgetCount)
SELECT '6/1/2010', 1, 3 UNION ALL
SELECT '6/2/2010', 1, 0 UNION ALL
SELECT '6/3/2010', 1, 0 UNION ALL
SELECT '6/4/2010', 1, 2 UNION ALL
SELECT '6/5/2010', 1, 0 UNION ALL
SELECT '6/6/2010', 1, 0 UNION ALL
SELECT '6/7/2010', 1, 0 UNION ALL
SELECT '6/9/2010', 1, 0 UNION ALL
SELECT '6/1/2010', 2, 3 UNION ALL
SELECT '6/2/2010', 2, 0 UNION ALL
SELECT '6/3/2010', 2, 0 UNION ALL
SELECT '6/4/2010', 2, 2 UNION ALL
SELECT '6/5/2010', 2, 0 UNION ALL
SELECT '6/6/2010', 2, 0 UNION ALL
SELECT '6/7/2010', 2, 0 UNION ALL
SELECT '6/9/2010', 2, 0 --Notice a day is missing, therefore should be treated as a non-zero value.
DECLARE @Cal TABLE (CalendarDate DATETIME NOT NULL)
INSERT INTO @Cal
SELECT '6/1/2010' UNION ALL
SELECT '6/2/2010' UNION ALL
SELECT '6/3/2010' UNION ALL
SELECT '6/4/2010' UNION ALL
SELECT '6/5/2010' UNION ALL
SELECT '6/6/2010' UNION ALL
SELECT '6/7/2010' UNION ALL
SELECT '6/8/2010' UNION ALL --Note inclusion in the Calendar Table
SELECT '6/9/2010'
;WITH cte AS
(SELECT
c.CalendarDate AS cd,
-- This will fill in missing entries
COALESCE( t.clientID, cli.clientID ) AS ClientID,
COALESCE( t.WidgetCount, -1) AS WidgetCount -- -1 to be both non-zero and obvious that it's a placeholder value.
from
(SELECT DISTINCT ClientID FROM @t) AS cli
CROSS JOIN
@Cal AS c
LEFT JOIN
@t AS t
ONc.CalendarDate = t.EntryDate
AND cli.ClientID = t.ClientID
)
-- Now that the data's cleaned up, we need a way to find islands.
,cte2 AS
(SELECT
cd,
ClientID,
WidgetCount,
ROW_NUMBER() OVER (PARTITION BY ClientID ORDER BY cd) AS rn
--,ROW_NUMBER() OVER (PARTITION BY ClientID ORDER BY cd) - CASE WHEN WidgetCount = 0 THEN ROW_NUMBER() OVER (ORDER BY ClientID, cd) ELSE 0 END AS rn2
FROM
cte
)
SELECT
cd,
ClientID,
WidgetCount,
CASE WHEN WidgetCount <> 0 THEN 0 ELSE drv2.NumDays END AS NumDaysNotUsingWidget
FROM
cte2 AS c2
CROSS APPLY
(SELECT
COUNT(*) AS NumDays
FROM
Cte2 AS drv
WHERE
drv.rn <= c2.rn
AND drv.ClientID = c2.ClientID
AND drv.rn > (SELECT MAX(rn) FROM cte2 AS i1 WHERE i1.rn < c2.rn AND WidgetCount <> 0 AND i1.ClientID = c2.ClientID)
) AS drv2
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
March 30, 2011 at 10:27 am
Awesome, Craig, thank you very much!!! You are the only person who seems to have understood my question correctly. I'll now pick it apart so I can understand just exactly how it works. I sincerely appreciate your help.
March 30, 2011 at 11:56 am
freedom_nut (3/30/2011)
Awesome, Craig, thank you very much!!! You are the only person who seems to have understood my question correctly. I'll now pick it apart so I can understand just exactly how it works. I sincerely appreciate your help.
No problem, your description was rather concise. I don't believe it's the understanding part that hooked a lot of folks, but it's a difficult algorithm for some until you've dealt with using correlated subqueries to help control aggregation logic. It's also unusual, so some folks may have been concerned you had a misunderstanding of the original requirements.
That said, please be aware this is NOT optimal code. It's rather chewy and I wouldn't want to run this repetitively throughout a day. I would recommend turning this into an overnight batch job to save the counts in your table, in which case an easier solution could be created to only update the null entries, possibly with a while loop in case you get a few days in a row for a customer that are nulled, simply looking back at the previous row and working off that count.
There is probably a very good set based solution to be had, but I'm afraid I can't come up with it easily.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply