October 19, 2017 at 2:07 am
Hi,
I need to count how many times a record changed from group 2 to group 1 or 3, per ID, ordered by RowID.
This is my table:
RowID  ID  GROUP
1      1   1
2      1   2
3      1   1
4      1   2
5      1   3
6      1   2
7      1   4
8      1   5
9      1   2
1      2   2
2      2   1
3      2   3
The end table should look like this:
RowID  ID  GROUP
3      1   1
5      1   3
2      2   1
Let me try to explain this a bit more: for each ID, go through the records in RowID order. Whenever a record is in group 2, check whether the next record (based on RowID) is in group 1 or 3. Keep every record where the group changed from 2 to 1 or 3 and ignore the rest. Then do the same for ID 2, 3, 4 and so on.
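Here's a quick setup script in case someone wants to try this (note: GROUP is a reserved word in T-SQL, so the column is named Grp here, and the code below uses that name):

CREATE TABLE #tempbase
(
    RowID INT NOT NULL,
    ID    INT NOT NULL,
    Grp   INT NOT NULL  -- GROUP is a reserved word, hence Grp
);

INSERT INTO #tempbase (RowID, ID, Grp)
VALUES (1,1,1),(2,1,2),(3,1,1),(4,1,2),(5,1,3),(6,1,2),(7,1,4),(8,1,5),(9,1,2),
       (1,2,2),(2,2,1),(3,2,3);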
There must be a simple way to do this; I just can't figure it out.
Please help
Thank you
October 19, 2017 at 2:55 am
I figured it out 😀 I'll just leave the code here for future generations.
SELECT B1.*
FROM
(
    -- Number each ID's rows in descending RowID order
    SELECT *, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY RowID DESC) AS P_count
    FROM #tempbase
) B1
LEFT JOIN
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY RowID DESC) AS P_count2
    FROM #tempbase
) B2
    ON  B1.ID = B2.ID
    AND B2.Grp = 2
    AND B1.Grp IN (1, 3)
    AND B1.P_count = B2.P_count2 - 1  -- B1 is the row immediately after B2 within the ID
WHERE B2.ID IS NOT NULL               -- filters the LEFT JOIN down to matches only
October 19, 2017 at 8:48 am
The problem with this approach is that it requires two scans of the base table. Here is an approach that only requires one scan of the table.
;
WITH Chg AS
(
SELECT *, LAG(Grp) OVER(PARTITION BY ID ORDER BY RowID) AS PrevGroup
FROM #tempbase
)
SELECT RowID, ID, Grp
FROM Chg
WHERE Grp IN (1,3)
AND PrevGroup = 2
;
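Against the sample data above, that returns exactly the three rows in the desired output: (3, 1, 1), (5, 1, 3) and (2, 2, 1).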
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
October 19, 2017 at 1:54 pm
drew.allen - Thursday, October 19, 2017 8:48 AM
The problem with this approach is that it requires two scans of the base table. Here is an approach that only requires one scan of the table.
;
WITH Chg AS
(
SELECT *, LAG(Grp) OVER(PARTITION BY ID ORDER BY RowID) AS PrevGroup
FROM #tempbase
)
SELECT RowID, ID, Grp
FROM Chg
WHERE Grp IN (1,3)
AND PrevGroup = 2
;
Drew
I recommend always being fully expressive in code and not relying on defaults. So flesh out that LAG so everyone who reads or debugs the code knows explicitly what it is doing.
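For example, a fully expressed version of that LAG (just a sketch using the names from above; offset 1 and default NULL are exactly what the bare LAG(Grp) falls back to):

SELECT *,
       LAG(Grp, 1, NULL) OVER (PARTITION BY ID ORDER BY RowID) AS PrevGroup
FROM #tempbase;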
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
October 19, 2017 at 2:54 pm
TheSQLGuru - Thursday, October 19, 2017 1:54 PM
drew.allen - Thursday, October 19, 2017 8:48 AM
The problem with this approach is that it requires two scans of the base table. Here is an approach that only requires one scan of the table.
;
WITH Chg AS
(
SELECT *, LAG(Grp) OVER(PARTITION BY ID ORDER BY RowID) AS PrevGroup
FROM #tempbase
)
SELECT RowID, ID, Grp
FROM Chg
WHERE Grp IN (1,3)
AND PrevGroup = 2
;
Drew
I recommend always being fully expressive in code and not relying on defaults. So flesh out that LAG so everyone who reads or debugs the code knows explicitly what it is doing.
I don't think there is a right or wrong here. There are merits to both approaches. I prefer leaving out the defaults if they match what I would use anyhow, because it's less typing. The only time that I have seen an issue with using the defaults is in the frame for windowed functions, because many people are unaware of when a default frame is applied.
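That default frame is worth a quick illustration (a sketch with made-up data): when an OVER clause has an ORDER BY but no frame, SQL Server uses RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which lumps ties in the ORDER BY together.

SELECT OrderDate, Amount,
       -- default frame (RANGE ... CURRENT ROW): the tied 2017-10-02 rows both get 60.0
       SUM(Amount) OVER (ORDER BY OrderDate) AS RunningTotal_Range,
       -- explicit ROWS frame: a true row-by-row running total (the tied rows differ)
       SUM(Amount) OVER (ORDER BY OrderDate
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal_Rows
FROM (VALUES ('2017-10-01', 10.0),
             ('2017-10-02', 20.0),
             ('2017-10-02', 30.0)) AS v(OrderDate, Amount);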
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
October 19, 2017 at 3:25 pm
I've got to go with Kevin on this one... My soul dies a little every time I review a script with a CREATE TABLE that looks like this...
CREATE TABLE ImAnAss (
ID INT IDENTITY PRIMARY KEY,
SomeColumn VARCHAR,
AnotherColumn DECIMAL DEFAULT (0),
TheDate DATETIME DEFAULT (GETDATE())
);
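For anyone wondering what's wrong with it: once the defaults resolve, that table is effectively this (a sketch; the DF_/PK_ names stand in for the auto-generated gibberish you'd actually get):

CREATE TABLE ImAnAss (
    ID            INT           IDENTITY(1,1) NOT NULL,  -- IDENTITY defaults to (1,1)
    SomeColumn    VARCHAR(1)    NULL,                     -- a bare VARCHAR is VARCHAR(1)
    AnotherColumn DECIMAL(18,0) NULL                      -- a bare DECIMAL is DECIMAL(18,0)
        CONSTRAINT DF_ImAnAss_AnotherColumn DEFAULT (0),
    TheDate       DATETIME      NULL
        CONSTRAINT DF_ImAnAss_TheDate DEFAULT (GETDATE()),
    CONSTRAINT PK_ImAnAss PRIMARY KEY CLUSTERED (ID)      -- PRIMARY KEY defaults to clustered
);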
October 20, 2017 at 9:53 am
Jason A. Long - Thursday, October 19, 2017 3:25 PM
I've got to go with Kevin on this one... My soul dies a little every time I review a script with a CREATE TABLE that looks like this...
CREATE TABLE ImAnAss (
ID INT IDENTITY PRIMARY KEY,
SomeColumn VARCHAR,
AnotherColumn DECIMAL DEFAULT (0),
TheDate DATETIME DEFAULT (GETDATE())
);
I think that tables and variables are different, because they should be designed to match the expected data. Expected data varies too widely to rely on any defaults. I also like naming my constraints so that they're meaningful, which means that they would be different from the default names provided.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
October 20, 2017 at 11:57 am
The most important thing you said there, Drew, was "...should be...". But sadly they aren't. Probably 99% of all columns I have ever seen in SQL Server applications (25 years now) are NULLable (because that is the default), and the problems I have seen because of that are innumerable.
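A two-line demonstration (assuming the usual ANSI_NULL_DFLT_ON setting):

CREATE TABLE #NullByDefault (SomeValue INT);           -- no NULL/NOT NULL specified
INSERT INTO #NullByDefault (SomeValue) VALUES (NULL);  -- succeeds, because NULLable is the default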
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
October 20, 2017 at 1:17 pm
TheSQLGuru - Friday, October 20, 2017 11:57 AM
The most important thing you said there, Drew, was "...should be...". But sadly they aren't. Probably 99% of all columns I have ever seen in SQL Server applications (25 years now) are NULLable (because that is the default), and the problems I have seen because of that are innumerable.