How to get a consecutive count based on the first value

Question

Post reply

How to get a consecutive count based on the first value

FarNorthDBA

SSCrazy

Points: 2075
More actions
January 13, 2015 at 12:50 pm

#397867

Hello all. I was asked for a very interesting report today that has been driving me nuts all morning. We have customer accounts that we measure usage. We want to run a report for all customers whose current usage is 0 and a count of how many months it has been zero consecutively. Here is an example.
declare @YourTable table (
CustomerID int,
ReadDate datetime,
usage int
)
insert into @YourTable select 1,' 1 mar 2014',0
insert into @YourTable select 1,' 1 feb 2014',0
insert into @YourTable select 1,' 1 jan 2014',0
insert into @YourTable select 2,' 1 mar 2014',0
insert into @YourTable select 2,' 1 feb 2014',100
insert into @YourTable select 2,' 1 jan 2014',0
insert into @YourTable select 3,' 1 mar 2014',150
insert into @YourTable select 3,' 1 feb 2014',200
insert into @YourTable select 3,' 1 jan 2014',0
This should return
1,3
2,1
This is what I am currently using but it isn't working right
WITH cte
AS
(
SELECT *,COUNT(1) OVER(PARTITION BY cnt,CustomerID) pt FROM
(
SELECT tt.*
,(SELECT COUNT(Customerid) FROM @YourTable WHERE Usage = 0 AND ReadDate < tt.ReadDate) AS cnt
FROM @YourTable tt
WHERE usage = 0
) t1
)
SELECT CustomerID,count(*) FROM cte GROUP BY Customerid
Any ideas?

Viewing 12 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

Mark Cowne One Orange Chip Points: 26989 More actions · Answer 1

This is a gaps and islands problem

http://www.sqlservercentral.com/articles/T-SQL/71550/

WITH CTE AS (

SELECT CustomerID,ReadDate,usage,

ROW_NUMBER() OVER(PARTITION BY CustomerID ORDER BY ReadDate DESC) AS rn1,

ROW_NUMBER() OVER(PARTITION BY CustomerID,CASE WHEN usage=0 THEN 0 ELSE 1 END ORDER BY ReadDate DESC) AS rn2

FROM @YourTable)

SELECT CustomerID,COUNT(*) AS cnt

FROM CTE

GROUP BY CustomerID,CASE WHEN usage=0 THEN 0 ELSE 1 END,rn1-rn2

HAVING MIN(rn1)=1 AND MIN(usage)=0

ORDER BY CustomerID,MIN(rn1);

Simpler version

WITH CTE AS (

SELECT CustomerID,ReadDate,usage,

ROW_NUMBER() OVER(PARTITION BY CustomerID ORDER BY ReadDate DESC) AS rn,

SUM(ABS(usage)) OVER(PARTITION BY CustomerID ORDER BY ReadDate DESC) AS total

FROM @YourTable)

SELECT CustomerID,MAX(rn) AS cnt

FROM CTE

WHERE total=0

GROUP BY CustomerID

ORDER BY CustomerID;

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

Luis Cazares SSC Guru Points: 183706 More actions · Answer 2

Hi, here's one possibility to get that report. There are other possibilities but this just seemed easy for me.

SELECT CustomerID,

DATEDIFF( MM,

ISNULL( MAX(CASE WHEN usage > 0 THEN ReadDate END), DATEADD( MM, -1, MIN(ReadDate))),

MAX( ReadDate)) Option1,

ISNULL( DATEDIFF( MM, MAX(CASE WHEN usage > 0 THEN ReadDate END), MAX( ReadDate))

,DATEDIFF( MM, DATEADD( MM, -1, MIN(ReadDate)), MAX( ReadDate))) Option2

FROM @YourTable

WHERE ReadDate <= @Date

GROUP BY CustomerID

HAVING MAX( ReadDate) = MAX(CASE WHEN usage = 0 THEN ReadDate END)

If you have any questions, feel free to ask them.

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Stephanie Giovannini SSCertifiable Points: 7422 More actions · Answer 3

I'm betting this could be more elegant and possibly use fewer CTEs, but I think it works.

DECLARE @YourTable TABLE (

CustomerID int,

ReadDate datetime,

usage int)

insert into @YourTable select 1,' 1 mar 2014',0

insert into @YourTable select 1,' 1 feb 2014',0

insert into @YourTable select 1,' 1 jan 2014',0

insert into @YourTable select 2,' 1 mar 2014',0

insert into @YourTable select 2,' 1 feb 2014',100

insert into @YourTable select 2,' 1 jan 2014',0

insert into @YourTable select 3,' 1 mar 2014',150

insert into @YourTable select 3,' 1 feb 2014',200

insert into @YourTable select 3,' 1 jan 2014',0;

WITH

lastReadDateForCustomer AS

(SELECT CustomerId, MAX(ReadDate) LastReadDate

FROM @YourTable

GROUP BY CustomerID),

customerWithLastUsageZero AS

(SELECT t.CustomerId

FROM @YourTable t

JOIN lastReadDateForCustomer r

ON r.CustomerID = t.CustomerID

AND r.LastReadDate = t.ReadDate

WHERE t.usage = 0),

lastNonZeroUsage AS

(SELECT z.CustomerId, MAX(ReadDate) NonZeroDate

FROM customerWithLastUsageZero z

LEFT JOIN @YourTable t

ON z.CustomerID = t.CustomerID

AND t.usage > 0

GROUP BY z.CustomerID),

noUsage AS

(SELECT z.CustomerId, MIN(ReadDate) EarliestZeroDate

FROM customerWithLastUsageZero z

JOIN @YourTable t

ON z.CustomerID = t.CustomerID

WHERE t.usage = 0

GROUP BY z.CustomerID)

SELECT n.CustomerID, DATEDIFF(MM, COALESCE(n.NonZeroDate, DATEADD(mm, -1, nu.EarliestZeroDate)), r.LastReadDate) as ZeroMonths

FROM lastNonZeroUsage n

JOIN lastReadDateForCustomer r

ON r.CustomerID = n.CustomerID

JOIN noUsage nu

ON nu.CustomerID = n.CustomerID

FarNorthDBA SSCrazy Points: 2075 More actions · Answer 4

Thanks Stephanie this looks perfect. I'm going to test it with our full dataset and see how it does. I'm sure you are right and it could be condensed down but it is very readable and easy to follow in its current form.

Luis Cazares SSC Guru Points: 183706 More actions · Answer 5

Even if it's readable, I wouldn't suggest using Stephanie's code. It reads the table 7 times which can become a nightmare on large tables.

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

FarNorthDBA SSCrazy Points: 2075 More actions · Answer 6

Well I ran it on our production table with 1.2 million rows and it runs in 260 ms so I think that will work just fine 🙂

Stephanie Giovannini SSCertifiable Points: 7422 More actions · Answer 7

Even if it's readable, I wouldn't suggest using Stephanie's code. It reads the table 7 times which can become a nightmare on large tables.

Agreed. The other two candidate queries are more efficient. They were posted while I was writing mine. Since I am primarily an application developer, I tend to overuse CTEs, which lend themselves to sequential processing.

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 8

Stephanie Giovannini (1/14/2015)
Even if it's readable, I wouldn't suggest using Stephanie's code. It reads the table 7 times which can become a nightmare on large tables.
Agreed. The other two candidate queries are more efficient. They were posted while I was writing mine. Since I am primarily an application developer, I tend to overuse CTEs, which lend themselves to sequential processing.

The only truly efficient query is Luis Cazares's which need a single scan and mimimal number of logical reads.

Another one which does ROW_NUMBER twice, will have no less number of scans than yours (even more, actually much more), the only difference is : they will be performed on Work Table instead of on source one.

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

Dwain Camps SSC Guru Points: 86978 More actions · Answer 9

Mark Cowne (1/13/2015)
This is a gaps and islands problem

Correct!

My solution suggestion:

SELECT CustomerID, COUNT(*)

FROM

(

SELECT CustomerID, ReadDate, Usage

,rn=DATEDIFF(month, ReadDate, 0) -

ROW_NUMBER() OVER

(

PARTITION BY CustomerID

ORDER BY ReadDate DESC

)

FROM @YourTable

WHERE usage = 0

) a

GROUP BY CustomerID, rn

HAVING MAX(ReadDate) = (SELECT MAX(ReadDate) FROM @YourTable);

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Dwain Camps SSC Guru Points: 86978 More actions · Answer 10

Luis Cazares (1/13/2015)
Hi, here's one possibility to get that report. There are other possibilities but this just seemed easy for me.
SELECT CustomerID,
DATEDIFF( MM,
ISNULL( MAX(CASE WHEN usage > 0 THEN ReadDate END), DATEADD( MM, -1, MIN(ReadDate))),
MAX( ReadDate)) Option1,
ISNULL( DATEDIFF( MM, MAX(CASE WHEN usage > 0 THEN ReadDate END), MAX( ReadDate))
,DATEDIFF( MM, DATEADD( MM, -1, MIN(ReadDate)), MAX( ReadDate))) Option2
FROM @YourTable
WHERE ReadDate <= @Date
GROUP BY CustomerID
HAVING MAX( ReadDate) = MAX(CASE WHEN usage = 0 THEN ReadDate END)
If you have any questions, feel free to ask them.

I assume you are setting @Date like this?

DECLARE @Date DATE = ' 1 mar 2014';

You could do that in mine also instead of doing a second clustered index scan to find the max date in the table. Assuming of course that the table definition is like this:

declare @YourTable table (

CustomerID int,

ReadDate datetime,

usage int,

PRIMARY KEY(CustomerID, ReadDate)

);

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Luis Cazares SSC Guru Points: 183706 More actions · Answer 11

Yes and no.

I declare the variable like that but it could be any date to get historic reports. I believe that I tested it with a simple GETDATE().

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2