November 26, 2024 at 6:36 pm
I have a table dbo.tblPresentationStatus (sql script attached - table script.txt)
I have to select rows where the status change is wrong. (not as per the correct flow as shown in attached image status_Flow.jpg)
From 'Review' the status for a specific presentation ID can change to either 'Approve' or 'Presentation' or 'Close'
From 'Approve' the status for a specific presentation ID can change to either 'Presentation' or 'Close'
From 'Presentation' the status for a specific presentation ID can change to only 'Close'
I want to write query to return rows with wrong status flow. So expected output as per records given in script attached should be :
StatusID | PrID | PrPrevStatus (to derive this column) | PrStatus | StatusDate
12 103 PRESENTATION APPROVE 2024-01-03 00:00:00.00'
November 26, 2024 at 7:26 pm
;WITH ctePrevPrStatus AS (
SELECT *, LAG(PrStatus) OVER(PARTITION BY PrID ORDER BY StatusID) AS PrPrevStatus
FROM dbo.tblPresentationStatus
)
SELECT *
FROM ctePrevPrStatus
WHERE
(PrPrevStatus = 'APPROVE' AND PrStatus NOT IN ('PRESENTATION', 'CLOSE')) OR
(PrPrevStatus = 'PRESENTATION' AND PrStatus NOT IN ('CLOSE'))
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
November 26, 2024 at 10:23 pm
ORs can be quite expensive, so I wrote up this version that uses a different method. This would probably perform better if you had a separate table that defined the flows. My initial testing showed no major difference between the two approaches, but that would probably change as the sample size grows larger.
WITH ctePrevPrStatus AS (
SELECT *, LAG(v.StatusFlowSequence, 1, 0) OVER(PARTITION BY tps.PrID ORDER BY tps.StatusID) AS PrevStatusFlowSequence
FROM #tblPresentationStatus AS tps
CROSS APPLY ( VALUES(CASE tps.PrStatus
WHEN 'Review' THEN 1
WHEN 'Approve' THEN 2
WHEN 'Presentation' THEN 3
WHEN 'Close' THEN 4
ELSE 0
END
)
) v(StatusFlowSequence)
)
SELECT *
FROM ctePrevPrStatus AS ps
WHERE ps.PrevStatusFlowSequence >= ps.StatusFlowSequence;
NOTES:
When creating samples, it's courteous to use temp tables, so that people don't have to go back and clean up afterwards.
All of the columns in your table are NULLABLE. That's probably not the best idea.
You should always qualify your column names in case you add another table that shares one or more column names.
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
November 26, 2024 at 11:38 pm
LAG, which compares rows within the window partition, is not strictly necessary imo. JOIN relates to a lookup table containing the values sequence. Also, I'm not a fan of ordering the window partition by an ID column. This query uses ORDER BY StatusDate
with rn_cte as (
select *, row_number() over (partition by PrID order by StatusDate) rn
from #tblPresentationStatus)
select *
from rn_cte rn
join (values ('Review', 1),
('Approve', 2),
('Presentation', 3),
('Close', 4)) v(PrStatus, n) on rn.PrStatus=v.PrStatus
and rn.rn>v.n;
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
November 27, 2024 at 3:04 pm
I wasn't sure StatusDate included a time. If it's just a date, it's not safe to ORDER BY it.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
November 27, 2024 at 3:41 pm
LAG, which compares rows within the window partition, is not strictly necessary imo. JOIN relates to a lookup table containing the values sequence. Also, I'm not a fan of ordering the window partition by an ID column. This query uses ORDER BY StatusDate
with rn_cte as (
select *, row_number() over (partition by PrID order by StatusDate) rn
from #tblPresentationStatus)
select *
from rn_cte rn
join (values ('Review', 1),
('Approve', 2),
('Presentation', 3),
('Close', 4)) v(PrStatus, n) on rn.PrStatus=v.PrStatus
and rn.rn>v.n;
I think this approach relies on data quality that isn't necessarily guaranteed. For instance, if the two statuses were "Presentation" moving to "Approve", the LAG()
method would catch it, but this approach would not.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
November 27, 2024 at 8:02 pm
I think this approach relies on data quality that isn't necessarily guaranteed. For instance, if the two statuses were "Presentation" moving to "Approve", the
LAG()
method would catch it, but this approach would not.
Maybe so. As far as we know the table has no constraints and, as you noted, all the columns are NULL'able. Crap data could mess up all three queries imo. ORDER BY StatusID assumes the ID's are monotonic and sequential which seems a riskier premise than StatusDate (which is of type DATETIME) containing date and time. It would be nice of the OP to let us know
Happy Thanksgiving!
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
December 2, 2024 at 7:23 pm
Thanks for the response.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply