November 26, 2013 at 2:31 pm
How to create a row number for a consecutive action. Example: I have a listing of people who have either completed a goal or not. I need to count by person the number of consecutively missed goals.
My sql table is this:
PersonId, GoalDate, GoalStatus (holds completed or missed)
My first thought was to use the rownumber function however that doesn’t work because someone could complete a goal, miss a goal, then complete one and when they complete a goal after a missed goal the count has to start over.
Thanks in advance for any advice.
November 26, 2013 at 2:33 pm
Hi and welcome to the forums. In order to help we will need a few things:
1. Sample DDL in the form of CREATE TABLE statements
2. Sample data in the form of INSERT INTO statements
3. Expected results based on the sample data
Please take a few minutes and read the first article in my signature for best practices when posting questions.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
November 26, 2013 at 2:58 pm
Sorry about that. Here ya go;
GoalStatus; 1= Success, 0=UnSuccessfully
Create Table #Goals (
PersonId INT
, GoalDate Datetime
, GoalStatus INT
)
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/10/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/12/2013','0')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/14/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/15/2013','1')
Expected Results: I am trying to get the consective count to be correct.
PersonId GoalDateGoalStatusConsecutiveCount
1256002013-01-10 00:00:00.00011
1256002013-01-12 00:00:00.00001
1256002013-01-14 00:00:00.00011
1256002013-01-15 00:00:00.00012
November 26, 2013 at 5:19 pm
csallen_01 (11/26/2013)
Sorry about that. Here ya go;GoalStatus; 1= Success, 0=UnSuccessfully
Create Table #Goals (
PersonId INT
, GoalDate Datetime
, GoalStatus INT
)
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/10/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/12/2013','0')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/14/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/15/2013','1')
Expected Results: I am trying to get the consective count to be correct.
PersonId GoalDateGoalStatusConsecutiveCount
1256002013-01-10 00:00:00.00011
1256002013-01-12 00:00:00.00001
1256002013-01-14 00:00:00.00011
1256002013-01-15 00:00:00.00012
Does the next record (if GoalStatus=0) retain the ConsecutiveCount of 2?
In other words, moving forward does ConsecutiveCount always reflect the maximum number of consecutive goals obtained?
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
November 26, 2013 at 6:46 pm
It will be helpful if you can include more sample data.
Since you are using SS 2012, one idea could be to calculate the running total of GoalStatus = 0 by PersonId in order of GoalDate. This could help us to identify groups of consecutive rows with specific GoalStatus and the rest will be to enumerate rows by (PersonId, grp).
WITH C1 AS (
SELECT
PersonId,
GoalDate,
GoalStatus,
SUM(CASE WHEN GoalStatus = 0 THEN 1 ELSE 0 END) OVER(
PARTITION BY PersonId
ORDER BY GoalDate
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) - GoalStatus AS grp
FROM
#Goals
)
SELECT
PersonId,
GoalDate,
GoalStatus,
ROW_NUMBER() OVER(
PARTITION BY PersonId, grp
ORDER BY GoalDate
) AS rn
FROM
C1
ORDER BY
PersonId,
GoalDate;
GO
Since there is no primary key in this table, I am assuming that we do not have duplicate entries by (PersonId, GoalDate).
November 26, 2013 at 6:53 pm
No, if the next record is GoalStatus=0 the consecutiveCount goes to 0. Then the next record where GoalStatus=1 the consecutivecount goes to 1.
November 26, 2013 at 7:08 pm
csallen_01 (11/26/2013)
No, if the next record is GoalStatus=0 the consecutiveCount goes to 0. Then the next record where GoalStatus=1 the consecutivecount goes to 1.
This strikes me as inconsistent with your initial expected results, as the second row (where GoalStatus=0) has the ConsecutiveCount=1. Shouldn't that also be 0?
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
November 26, 2013 at 7:25 pm
Here is some additional sample data
Create Table #Goals (
PersonId INT
, GoalDate Datetime
, GoalStatus INT
)
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/10/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/12/2013','0')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/14/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/15/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/20/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('100800','1/20/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('100800','1/22/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('100800','1/25/2013','0')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('100800','1/28/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('100800','1/30/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('100800','2/01/2013','0')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('100800','2/10/2013','1')
PersonIdGoalDateGoalStatusConsecutiveCount
12560001/10/131 1
12560001/12/130 0
12560001/14/131 1
12560001/15/131 2
12560001/20/131 3
10080001/20/131 1
10080001/22/131 2
10080001/25/130 0
10080001/28/131 1
10080001/30/131 2
10080002/01/130 0
10080002/10/131 1
November 26, 2013 at 7:36 pm
Sorry for the confusion. Here is additional sample data;
Create Table #Goals (
PersonId INT
, GoalDate Datetime
, GoalStatus INT
)
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/10/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/12/2013','0')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/14/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/15/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('125600','1/20/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('100800','1/20/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('100800','1/22/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('100800','1/25/2013','0')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('100800','1/28/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('100800','1/30/2013','1')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('100800','2/01/2013','0')
INSERT INTO #Goals (PersonId, GoalDate, GoalStatus) VALUES ('100800','2/10/2013','1')
PersonIdGoalDateGoalStatusConsecutiveCount
12560001/10/131 1
12560001/12/130 0
12560001/14/131 1
12560001/15/131 2
12560001/20/131 3
10080001/20/131 1
10080001/22/131 2
10080001/25/130 0
10080001/28/131 1
10080001/30/131 2
10080002/01/130 0
10080002/10/131 1
November 26, 2013 at 7:43 pm
The idea is similar but I hanged it a little bit to get the custom enumeration.
Calculate the running total of GoalStatus = 0 by PersonId ordered by GoalDate. Calculate the minimum GoalStatus in the same window to adjust the enumeration.
WITH C1 AS (
SELECT
PersonId,
GoalDate,
GoalStatus,
SUM(CASE WHEN GoalStatus = 0 THEN 1 ELSE 0 END) OVER(
PARTITION BY PersonId
ORDER BY GoalDate
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS grp,
MIN(GoalStatus) OVER(
PARTITION BY PersonId
ORDER BY GoalDate
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS min_GoalStatus
FROM
#Goals
)
SELECT
PersonId,
GoalDate,
GoalStatus,
ROW_NUMBER() OVER(
PARTITION BY PersonId, grp
ORDER BY GoalDate
) - (1 - min_GoalStatus) AS rn
FROM
C1
ORDER BY
PersonId,
GoalDate;
November 26, 2013 at 8:08 pm
Thank you so much that looks like it works!
November 26, 2013 at 8:26 pm
You can also do it with a Quirky Update (QU) but you'd need to add a PRIMARY KEY and an additional column to your #Goals table:
CREATE TABLE #Goals (
PersonId INT
,GoalDate Datetime
,GoalStatus INT
,ConsecutiveGoals INT
,PRIMARY KEY (PersonID, GoalDate)
);
The QU is then applied thusly:
DECLARE @PersonID INT = 0
,@ConsecutiveGoals INT = 0;
UPDATE #Goals WITH(TABLOCKX)
SET @ConsecutiveGoals =
CASE
WHEN @PersonID <> PersonID THEN GoalStatus
WHEN GoalStatus = 0 THEN 0
ELSE @ConsecutiveGoals + GoalStatus END
,ConsecutiveGoals = @ConsecutiveGoals
,@PersonID = PersonID
OPTION(MAXDOP 1);
SELECT *
FROM #Goals;
The rules for the QU can be read about here: Solving the Running Total and Ordinal Rank Problems[/url].
And since this is a running totals problem (similar in nature to calculating values in a rolling window), you can take a look at this article to get an expectation of performance using the SQL window frame suggested by hunchback.
Calculating Values within a Rolling Window in Transact SQL [/url]
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
November 26, 2013 at 8:55 pm
hunchback (11/26/2013)
The idea is similar but I hanged it a little bit to get the custom enumeration.Calculate the running total of GoalStatus = 0 by PersonId ordered by GoalDate. Calculate the minimum GoalStatus in the same window to adjust the enumeration.
WITH C1 AS (
SELECT
PersonId,
GoalDate,
GoalStatus,
SUM(CASE WHEN GoalStatus = 0 THEN 1 ELSE 0 END) OVER(
PARTITION BY PersonId
ORDER BY GoalDate
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS grp,
MIN(GoalStatus) OVER(
PARTITION BY PersonId
ORDER BY GoalDate
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS min_GoalStatus
FROM
#Goals
)
SELECT
PersonId,
GoalDate,
GoalStatus,
ROW_NUMBER() OVER(
PARTITION BY PersonId, grp
ORDER BY GoalDate
) - (1 - min_GoalStatus) AS rn
FROM
C1
ORDER BY
PersonId,
GoalDate;
Hunchie - Nice solution but I think it can be simplified a little.
WITH C1 AS (
SELECT
PersonId,
GoalDate,
GoalStatus,
SUM(1-GoalStatus) OVER(
PARTITION BY PersonId
ORDER BY GoalDate
-- Below is default so not needed either
-- ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS grp
FROM
#Goals
)
SELECT *
,ConsecutiveGoals=SUM(GoalStatus) OVER (PARTITION BY PersonID, grp ORDER BY GoalDate)
FROM C1
ORDER BY PersonId, GoalDate;
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
November 27, 2013 at 7:13 am
Dwain, good thinking there.
SELECT
PersonId,
GoalDate,
GoalStatus,
SUM(1-GoalStatus) OVER(
PARTITION BY PersonId
ORDER BY GoalDate
-- Below is default so not needed either
-- ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS grp
...
Your comment about the default for frame is not quiet right. If you specify the ORDER BY subclause but not frame in a function that can accept optional ROWS/RANGE specification then the default will be "RANGE UNBOUNDED PRECEDING AND CURRENT ROW" which have a different treatment if we have ties on the columns used in the ORDER BY subclause. You can corroborate this from BOL.
Check this script and notice that c3 and c4 differ in the result. This is because RANGE is the default and the meaning is different than ROWS mainly when there are ties by the columns used in the ORDER BY subclause.
DECLARE @T TABLE (
sk int NOT NULL IDENTITY(1, 1) PRIMARY KEY,
c1 int NOT NULL,
c2 int NOT NULL
);
INSERT INTO @T (c1, c2)
VALUES (1, 1), (1, 2), (1, 3);
SELECT
sk,
c1,
c2,
SUM(c2) OVER(
ORDER BY c1
) AS c3,
SUM(c2) OVER(
ORDER BY c1
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS c4
FROM
@T
ORDER BY
sk;
GO
November 27, 2013 at 7:34 am
Thank you all so much!
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply