January 13, 2015 at 12:50 pm
Hello all. I was asked for a very interesting report today that has been driving me nuts all morning. We have customer accounts that we measure usage. We want to run a report for all customers whose current usage is 0 and a count of how many months it has been zero consecutively. Here is an example.
declare @YourTable table (
CustomerID int,
ReadDate datetime,
usage int
)
insert into @YourTable select 1,' 1 mar 2014',0
insert into @YourTable select 1,' 1 feb 2014',0
insert into @YourTable select 1,' 1 jan 2014',0
insert into @YourTable select 2,' 1 mar 2014',0
insert into @YourTable select 2,' 1 feb 2014',100
insert into @YourTable select 2,' 1 jan 2014',0
insert into @YourTable select 3,' 1 mar 2014',150
insert into @YourTable select 3,' 1 feb 2014',200
insert into @YourTable select 3,' 1 jan 2014',0
This should return
1,3
2,1
This is what I am currently using but it isn't working right
WITH cte
AS
(
SELECT *,COUNT(1) OVER(PARTITION BY cnt,CustomerID) pt FROM
(
SELECT tt.*
,(SELECT COUNT(Customerid) FROM @YourTable WHERE Usage = 0 AND ReadDate < tt.ReadDate) AS cnt
FROM @YourTable tt
WHERE usage = 0
) t1
)
SELECT CustomerID,count(*) FROM cte GROUP BY Customerid
Any ideas?
January 13, 2015 at 1:35 pm
This is a gaps and islands problem
http://www.sqlservercentral.com/articles/T-SQL/71550/
WITH CTE AS (
SELECT CustomerID,ReadDate,usage,
ROW_NUMBER() OVER(PARTITION BY CustomerID ORDER BY ReadDate DESC) AS rn1,
ROW_NUMBER() OVER(PARTITION BY CustomerID,CASE WHEN usage=0 THEN 0 ELSE 1 END ORDER BY ReadDate DESC) AS rn2
FROM @YourTable)
SELECT CustomerID,COUNT(*) AS cnt
FROM CTE
GROUP BY CustomerID,CASE WHEN usage=0 THEN 0 ELSE 1 END,rn1-rn2
HAVING MIN(rn1)=1 AND MIN(usage)=0
ORDER BY CustomerID,MIN(rn1);
Simpler version
WITH CTE AS (
SELECT CustomerID,ReadDate,usage,
ROW_NUMBER() OVER(PARTITION BY CustomerID ORDER BY ReadDate DESC) AS rn,
SUM(ABS(usage)) OVER(PARTITION BY CustomerID ORDER BY ReadDate DESC) AS total
FROM @YourTable)
SELECT CustomerID,MAX(rn) AS cnt
FROM CTE
WHERE total=0
GROUP BY CustomerID
ORDER BY CustomerID;
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537January 13, 2015 at 1:35 pm
Hi, here's one possibility to get that report. There are other possibilities but this just seemed easy for me.
SELECT CustomerID,
DATEDIFF( MM,
ISNULL( MAX(CASE WHEN usage > 0 THEN ReadDate END), DATEADD( MM, -1, MIN(ReadDate))),
MAX( ReadDate)) Option1,
ISNULL( DATEDIFF( MM, MAX(CASE WHEN usage > 0 THEN ReadDate END), MAX( ReadDate))
,DATEDIFF( MM, DATEADD( MM, -1, MIN(ReadDate)), MAX( ReadDate))) Option2
FROM @YourTable
WHERE ReadDate <= @Date
GROUP BY CustomerID
HAVING MAX( ReadDate) = MAX(CASE WHEN usage = 0 THEN ReadDate END)
If you have any questions, feel free to ask them.
January 13, 2015 at 1:44 pm
I'm betting this could be more elegant and possibly use fewer CTEs, but I think it works.
DECLARE @YourTable TABLE (
CustomerID int,
ReadDate datetime,
usage int)
insert into @YourTable select 1,' 1 mar 2014',0
insert into @YourTable select 1,' 1 feb 2014',0
insert into @YourTable select 1,' 1 jan 2014',0
insert into @YourTable select 2,' 1 mar 2014',0
insert into @YourTable select 2,' 1 feb 2014',100
insert into @YourTable select 2,' 1 jan 2014',0
insert into @YourTable select 3,' 1 mar 2014',150
insert into @YourTable select 3,' 1 feb 2014',200
insert into @YourTable select 3,' 1 jan 2014',0;
WITH
lastReadDateForCustomer AS
(SELECT CustomerId, MAX(ReadDate) LastReadDate
FROM @YourTable
GROUP BY CustomerID),
customerWithLastUsageZero AS
(SELECT t.CustomerId
FROM @YourTable t
JOIN lastReadDateForCustomer r
ON r.CustomerID = t.CustomerID
AND r.LastReadDate = t.ReadDate
WHERE t.usage = 0),
lastNonZeroUsage AS
(SELECT z.CustomerId, MAX(ReadDate) NonZeroDate
FROM customerWithLastUsageZero z
LEFT JOIN @YourTable t
ON z.CustomerID = t.CustomerID
AND t.usage > 0
GROUP BY z.CustomerID),
noUsage AS
(SELECT z.CustomerId, MIN(ReadDate) EarliestZeroDate
FROM customerWithLastUsageZero z
JOIN @YourTable t
ON z.CustomerID = t.CustomerID
WHERE t.usage = 0
GROUP BY z.CustomerID)
SELECT n.CustomerID, DATEDIFF(MM, COALESCE(n.NonZeroDate, DATEADD(mm, -1, nu.EarliestZeroDate)), r.LastReadDate) as ZeroMonths
FROM lastNonZeroUsage n
JOIN lastReadDateForCustomer r
ON r.CustomerID = n.CustomerID
JOIN noUsage nu
ON nu.CustomerID = n.CustomerID
January 13, 2015 at 1:54 pm
Thanks Stephanie this looks perfect. I'm going to test it with our full dataset and see how it does. I'm sure you are right and it could be condensed down but it is very readable and easy to follow in its current form.
January 13, 2015 at 2:19 pm
Even if it's readable, I wouldn't suggest using Stephanie's code. It reads the table 7 times which can become a nightmare on large tables.
January 13, 2015 at 3:35 pm
Well I ran it on our production table with 1.2 million rows and it runs in 260 ms so I think that will work just fine 🙂
January 14, 2015 at 8:11 am
Even if it's readable, I wouldn't suggest using Stephanie's code. It reads the table 7 times which can become a nightmare on large tables.
Agreed. The other two candidate queries are more efficient. They were posted while I was writing mine. Since I am primarily an application developer, I tend to overuse CTEs, which lend themselves to sequential processing.
January 14, 2015 at 8:56 am
Stephanie Giovannini (1/14/2015)
Even if it's readable, I wouldn't suggest using Stephanie's code. It reads the table 7 times which can become a nightmare on large tables.
Agreed. The other two candidate queries are more efficient. They were posted while I was writing mine. Since I am primarily an application developer, I tend to overuse CTEs, which lend themselves to sequential processing.
The only truly efficient query is Luis Cazares's which need a single scan and mimimal number of logical reads.
Another one which does ROW_NUMBER twice, will have no less number of scans than yours (even more, actually much more), the only difference is : they will be performed on Work Table instead of on source one.
January 15, 2015 at 5:40 pm
Mark Cowne (1/13/2015)
This is a gaps and islands problem
Correct!
My solution suggestion:
SELECT CustomerID, COUNT(*)
FROM
(
SELECT CustomerID, ReadDate, Usage
,rn=DATEDIFF(month, ReadDate, 0) -
ROW_NUMBER() OVER
(
PARTITION BY CustomerID
ORDER BY ReadDate DESC
)
FROM @YourTable
WHERE usage = 0
) a
GROUP BY CustomerID, rn
HAVING MAX(ReadDate) = (SELECT MAX(ReadDate) FROM @YourTable);
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
January 15, 2015 at 5:49 pm
Luis Cazares (1/13/2015)
Hi, here's one possibility to get that report. There are other possibilities but this just seemed easy for me.
SELECT CustomerID,
DATEDIFF( MM,
ISNULL( MAX(CASE WHEN usage > 0 THEN ReadDate END), DATEADD( MM, -1, MIN(ReadDate))),
MAX( ReadDate)) Option1,
ISNULL( DATEDIFF( MM, MAX(CASE WHEN usage > 0 THEN ReadDate END), MAX( ReadDate))
,DATEDIFF( MM, DATEADD( MM, -1, MIN(ReadDate)), MAX( ReadDate))) Option2
FROM @YourTable
WHERE ReadDate <= @Date
GROUP BY CustomerID
HAVING MAX( ReadDate) = MAX(CASE WHEN usage = 0 THEN ReadDate END)
If you have any questions, feel free to ask them.
I assume you are setting @Date like this?
DECLARE @Date DATE = ' 1 mar 2014';
You could do that in mine also instead of doing a second clustered index scan to find the max date in the table. Assuming of course that the table definition is like this:
declare @YourTable table (
CustomerID int,
ReadDate datetime,
usage int,
PRIMARY KEY(CustomerID, ReadDate)
);
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
January 15, 2015 at 6:13 pm
Yes and no.
I declare the variable like that but it could be any date to get historic reports. I believe that I tested it with a simple GETDATE().
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply