September 20, 2017 at 11:25 am
Ok, so here is my sample data I am testing against...
declare @t table (
Id int ,
Section int,
DateOccured DATE,
Missed int
);
insert into @t values
( 1 , 1 , '2014-01-01', 0),
( 2 , 1 , '2014-01-04', 1),
( 3 , 1 , '2014-01-06', 1),
( 4 , 1 , '2014-01-12', 0),
( 5 , 1 , '2014-02-05', 0),
( 6 , 1 , '2014-02-06', 1),
( 7 , 1 , '2014-02-20', 1),
( 8 , 1 , '2014-03-01', 1),
( 9 , 1 , '2014-03-05', 0),
( 10 , 1 , '2014-03-22', 0),
( 11 , 2 , '2014-02-06', 0),
( 12 , 2 , '2014-02-07', 1),
( 13 , 2 , '2014-02-08', 1),
( 14 , 2 , '2014-02-22', 0),
( 15 , 2 , '2014-02-24', 0),
( 16 , 2 , '2014-02-27', 0),
( 17 , 2 , '2014-02-28', 1),
( 18 , 2 , '2014-03-09', 0),
( 19 , 3 , '2014-03-11', 0),
( 20 , 3 , '2014-04-02', 0),
( 21 , 3 , '2014-04-13', 1),
( 22 , 3 , '2014-04-14', 0),
( 23 , 3 , '2014-05-01', 0),
( 24 , 3 , '2014-05-12', 1),
( 25 , 3 , '2014-05-14', 1),
( 26 , 3 , '2014-06-01', 1);
What I want to return is a list of sections that have 3 consecutive dates with the missed flags set to 1. Note consecutive dates are not 1st, 2nd, 3rd... It is 3 dates when ordered by date so for section 3, it should be returned as it had 3 records missed in a row occurring on 2014-05-12, 2014-05-14 and 2014-06-01. Section 2 wouldn't be returned as it only had 2 in a row, but section 1 would be as it had 3 consecutive missed slots.
Can anyone help? I'm thinking I might need to resort or a cursor and step through the data row by row.
This is as far as I had got...
SELECT TOP 1 WITH TIES
Id,
Section,
DateOccured
FROM @t
ORDER BY case when row_number() over(partition by Section order by DateOccured) <= 3 then 0 else 1 end;
September 20, 2017 at 11:52 am
This should do it - no CURSOR required.WITH Missed
AS
(
SELECT
t.Section
, t.Missed
, PrevMissed = LAG(t.Missed, 1) OVER (PARTITION BY t.Section
ORDER BY
t.Section
, t.DateOccured
)
, NextMissed = LEAD(t.Missed, 1) OVER (PARTITION BY t.Section
ORDER BY
t.Section
, t.DateOccured
)
FROM @t t
)
SELECT m.Section
FROM Missed m
WHERE
m.PrevMissed = 1
AND m.Missed = 1
AND m.NextMissed = 1
GROUP BY m.Section;
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
September 20, 2017 at 12:01 pm
Phil, you are a legend!
One further small request for help... If I wanted to make it so that if the last occurence was not missed, the record was not returned, (in the sample data, section 1 would be excluded since (even though there were 3 missed in a row, the last one was not missed), how would I achieve that?
Many Thanks for taking the time to help me.
September 20, 2017 at 12:11 pm
Charlottecb - Wednesday, September 20, 2017 12:01 PMPhil, you are a legend!One further small request for help... If I wanted to make it so that if the last occurence was not missed, the record was not returned, (in the sample data, section 1 would be excluded since (even though there were 3 missed in a row, the last one was not missed), how would I achieve that?
Many Thanks for taking the time to help me.
Here's a quick solution. It can almost certainly be improved on, if performance is not so good.WITH Missed
AS
(
SELECT
t.Section
, t.Missed
, PrevMissed = LAG(t.Missed, 1) OVER (PARTITION BY t.Section
ORDER BY
t.Section
, t.DateOccured
)
, NextMissed = LEAD(t.Missed, 1) OVER (PARTITION BY t.Section
ORDER BY
t.Section
, t.DateOccured
)
, LastMissed = lm.LastMissed
FROM
@t t
CROSS APPLY
(
SELECT TOP 1
LastMissed = t2.Missed
FROM @t t2
WHERE t.Section = t2.Section
ORDER BY t2.DateOccured DESC
) lm
)
SELECT m.Section
FROM Missed m
WHERE
m.PrevMissed = 1
AND m.Missed = 1
AND m.NextMissed = 1
AND m.LastMissed <> 0
GROUP BY m.Section;
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
September 20, 2017 at 12:19 pm
Perfect Phil. Many, many thanks for your help.
September 20, 2017 at 1:11 pm
Phil Parkin - Wednesday, September 20, 2017 12:11 PMCharlottecb - Wednesday, September 20, 2017 12:01 PMPhil, you are a legend!One further small request for help... If I wanted to make it so that if the last occurence was not missed, the record was not returned, (in the sample data, section 1 would be excluded since (even though there were 3 missed in a row, the last one was not missed), how would I achieve that?
Many Thanks for taking the time to help me.Here's a quick solution. It can almost certainly be improved on, if performance is not so good.
WITH Missed
AS
(
SELECT
t.Section
, t.Missed
, PrevMissed = LAG(t.Missed, 1) OVER (PARTITION BY t.Section
ORDER BY
t.Section
, t.DateOccured
)
, NextMissed = LEAD(t.Missed, 1) OVER (PARTITION BY t.Section
ORDER BY
t.Section
, t.DateOccured
)
, LastMissed = lm.LastMissed
FROM
@t t
CROSS APPLY
(
SELECT TOP 1
LastMissed = t2.Missed
FROM @t t2
WHERE t.Section = t2.Section
ORDER BY t2.DateOccured DESC
) lm
)
SELECT m.Section
FROM Missed m
WHERE
m.PrevMissed = 1
AND m.Missed = 1
AND m.NextMissed = 1
AND m.LastMissed <> 0
GROUP BY m.Section;
This method takes 27 scans of the table. The following only requires one.
;
WITH Missed AS
(
SELECT *,
SUM(Missed) OVER(PARTITION BY Section ORDER BY DateOccured ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS MissedDays,
ROW_NUMBER() OVER(PARTITION BY Section ORDER BY DateOccured DESC) AS rn
FROM @t
)
SELECT Section
FROM Missed
WHERE rn = 1
AND MissedDays = 3;
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
September 20, 2017 at 1:31 pm
I updated this to remove an expensive sort. The SUM was sorting ascending by the date, but the ROW_NUMBER was sorting descending, so two different sorts were required. I changed it so that both were sorting descending, and only one sort was required.
;
WITH Missed AS
(
SELECT *,
SUM(Missed) OVER(PARTITION BY Section ORDER BY DateOccured DESC ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING) AS MissedDays,
ROW_NUMBER() OVER(PARTITION BY Section ORDER BY DateOccured DESC) AS rn
FROM @t
)
SELECT Section
FROM Missed
WHERE rn = 1
AND MissedDays >= 3;
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
September 20, 2017 at 2:34 pm
Drew, that's a great solution. Many thanks.
September 21, 2017 at 12:48 pm
No, what you posted is crap. It is not a table because it has no key and all of the names of your columns are nullable so it can never have a key.
And then you wrote with a BIT flag! That's assembly language! In RDBMS. We discovered the state of a single being by not setting a flag, but by using predicates. Basically your approach to RDBMS and SQL is wrong not little wrong, but fundamentally wrong. Where is key in your postings??
CREATE TABLE Foobar
(something_section INTEGER NOT NULL,
occurence_date DATE NOT NULL,
PRIMARY KEY (something_section, occurence_date),
stupid_flg CHAR(1) NOT NULL
CHECK (stupid_flg IN ('0', '1')
);
Everything else you posted is total garbage, and makes me wonder why I spent writing books and all those decades of my life setting up SQL standards. Your table declaration has an extra column which, should not be there course, it looks like a sequential number for the physical row counts.
INSERT INTO Foobar VALUES
(1 , '2014-01-01', '0'),
(1 , '2014-01-04', '1')
(1 , '2014-01-06', '1'),
(1 , '2014-01-12', '0'),
(1 , '2014-02-05', '0'),
(1 , '2014-02-06', '1'),
(1 , '2014-02-20', '1'),
(1 , '2014-03-01', '1'),
(1 , '2014-03-05', '0'),
(1 , '2014-03-22', '0'),
(2 , '2014-02-06', '0'),
(2 , '2014-02-07', '1'),
(2 , '2014-02-08', '1'),
(2 , '2014-02-22', '0'),
(2 , '2014-02-24', '0'),
(2 , '2014-02-27', '0'),
(2 , '2014-02-28', '1'),
(2 , '2014-03-09', '0'),
(3 , '2014-03-11', '0'),
(3 , '2014-04-02', '0'),
(3 , '2014-04-13', '1'),
(3 , '2014-04-14', '0'),
(3 , '2014-05-01', '0'),
(3 , '2014-05-12', '1'),
(3 , '2014-05-14', '1'),
(3 , '2014-06-01', '1');
>> what I want to return is a list of something_sections that have 3 consecutive dates with the missed flags [sic] set to 1. Note consecutive dates are not 1st, 2nd, 3rd... It is 3 dates when ordered by date so for something_section 3, it should be returned as it had 3 records [sic: rows are not records] missed in a row occurring on 2014-05-12, 2014-05-14 and 2014-06-01. something_section 2 wouldn't be returned as it only had 2 in a row, but something_section 1 would be as it had 3 consecutive missed slots. <<
Why are you thinking of a consecutive sequential solution? Switch your mindset to sets and groups.. First we need a calendar table, then ordinal date number and it. Next, filter out the dates the don't have the nonrelational flag. If the minimum and the maximum dates are not three days, then they are out of your set. Suddenly your question is very, very easy; all that you to go ahead and finish writing it with the assumption of
Please post DDL and follow ANSI/ISO standards when asking for help.
September 21, 2017 at 1:54 pm
Joe, chill out man - no one died, it's just a bit of code... It's not even a real table or data - I only created it to highlight my issue and provide something which is quick and easy to test against.
But thanks for your "constructive criticism"... I actually have one of your books so I'll be sure to re-read it again to make sure I'm doing things right 🙂
September 30, 2017 at 5:28 am
drew.allen - Wednesday, September 20, 2017 1:31 PMI updated this to remove an expensive sort. The SUM was sorting ascending by the date, but the ROW_NUMBER was sorting descending, so two different sorts were required. I changed it so that both were sorting descending, and only one sort was required.
;
WITH Missed AS
(
SELECT *,
SUM(Missed) OVER(PARTITION BY Section ORDER BY DateOccured DESC ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING) AS MissedDays,
ROW_NUMBER() OVER(PARTITION BY Section ORDER BY DateOccured DESC) AS rn
FROM @t
)
SELECT Section
FROM Missed
WHERE rn = 1
AND MissedDays >= 3;Drew
Just wanted to confirm that this might also work ?
;
WITH Missed AS
(
SELECT *,
SUM(Missed) OVER(PARTITION BY Section ORDER BY DateOccured DESC ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING) AS MissedDays,
first_value(Missed) OVER(PARTITION BY Section ORDER BY DateOccured DESC) AS frstvalue
FROM @t
)
SELECT Section
FROM Missed
WHERE frstvalue = 1
AND MissedDays >= 3;
First solve the problem then write the code !
October 3, 2017 at 1:19 pm
TheCTEGuy - Saturday, September 30, 2017 5:28 AMdrew.allen - Wednesday, September 20, 2017 1:31 PMI updated this to remove an expensive sort. The SUM was sorting ascending by the date, but the ROW_NUMBER was sorting descending, so two different sorts were required. I changed it so that both were sorting descending, and only one sort was required.
;
WITH Missed AS
(
SELECT *,
SUM(Missed) OVER(PARTITION BY Section ORDER BY DateOccured DESC ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING) AS MissedDays,
ROW_NUMBER() OVER(PARTITION BY Section ORDER BY DateOccured DESC) AS rn
FROM @t
)
SELECT Section
FROM Missed
WHERE rn = 1
AND MissedDays >= 3;Drew
Just wanted to confirm that this might also work ?
;
WITH Missed AS
(
SELECT *,
SUM(Missed) OVER(PARTITION BY Section ORDER BY DateOccured DESC ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING) AS MissedDays,
first_value(Missed) OVER(PARTITION BY Section ORDER BY DateOccured DESC) AS frstvalue
FROM @t
)
SELECT Section
FROM Missed
WHERE frstvalue = 1
AND MissedDays >= 3;
I haven't tested, but I believe it would, but it would also be much less efficient, because FIRST_VALUE() requires a frame, and will use the default RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW if one is not specified whereas ROW_NUMBER() cannot use a frame. I know that specifying frames using ROWS rather than RANGE is much more efficient, because RANGE always writes out to a temp table, but ROWS will only write out to a temp table when the number of rows is over some amount (10000?) and for many aggregate functions there are only ever two rows (the current and the previous values).
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
October 3, 2017 at 1:31 pm
Okay. I did a quick test, and my intuition bore out:
Here are the results:Method duration
ROW_NUMBER 14002
FIRST_VALUE/ROWS 24000
FIRST_VALUE/RANGE(Unspecified) 54004
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply