September 28, 2021 at 6:16 pm
Hello,
i have a table called Events, the value of that table like as bellow :
create table Events (
id int,
code_events varchar(10),
Events varchar(10),
Events_start datetime,
Events_end datetime
)
insert into Events values(1,'AC_83','Event 1','2020-07-15','2020-07-30')
insert into Events values(2,'AC_84','Event 2','2019-06-01','2030-07-30')
insert into Events values(3,'AC_86','Event 3','2020-07-15','2022-12-15')
insert into Events values(4,'AC_83','Event 1','2020-09-15','2025-04-30')
insert into Events values(5,'AC_87','Event 4','2021-07-15','2022-12-15')
I want to return a range of missing dates from the first date of 2020 and only if Events_start>=2020
As bellow what i want to get :
code_events Events Events_start Events_end IS_Messing
AC_83 Event 1 2020-01-01 00:00:00.000 2020-07-15 00:00:00.000 1
AC_83 Event 1 2020-07-15 00:00:00.000 2020-07-30 00:00:00.000 0
AC_83 Event 1 2020-07-30 00:00:00.000 2020-09-15 00:00:00.000 1
AC_83 Event 1 2020-09-15 00:00:00.000 2025-04-30 00:00:00.000 0
AC_84 Event 2 2019-06-01 00:00:00.000 2030-07-30 00:00:00.000 0
AC_86 Event 3 2020-01-01 00:00:00.000 2020-07-15 00:00:00.000 1
AC_86 Event 3 2020-07-15 00:00:00.000 2022-12-15 00:00:00.000 0
AC_87 Event 4 2020-01-01 00:00:00.000 2021-07-15 00:00:00.000 1
AC_87 Event 4 2021-07-15 00:00:00.000 2022-12-15 00:00:00.000 0
Any idea how can i do that ?
Thanks for help !
September 28, 2021 at 8:20 pm
It's a one time only type query or it's intended to be run frequently? The id value in the last row of the sample data should be 5, no?
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
September 28, 2021 at 8:23 pm
WITH PrevEnds
AS
(
SELECT E.code_events, E.[Events], E.Events_start, E.Events_end
,LAG(E.Events_end) OVER (PARTITION BY E.code_events ORDER BY E.Events_start) AS Prev_Events_end
FROM Events E
)
,Results
AS
(
SELECT P.code_events, P.[Events]
,CASE
WHEN N.N = 1 AND X.Keep1 = 1
THEN COALESCE(P.Prev_Events_end, '20200101')
ELSE P.Events_start
END AS Events_start
,CASE
WHEN N.N = 1 AND X.Keep1 = 1
THEN P.Events_start
ELSE P.Events_end
END AS Events_end
,CASE
WHEN N.N = 1 AND X.Keep1 = 1
THEN 1
ELSE 0
END AS IsMissing
,CASE
WHEN N.N = 2 OR X.Keep1 = 1
THEN 1
ELSE 0
END AS KeepRow
FROM PrevEnds P
CROSS APPLY (VALUES (1),(2)) N(N)
CROSS APPLY
(
VALUES
(
CASE
WHEN P.Events_start <> COALESCE(P.Prev_Events_end, '19000101')
AND P.Events_start > '20200101'
THEN 1
ELSE 0
END
)
) X (Keep1)
)
SELECT code_events, [Events], Events_start, Events_end, IsMissing
FROM Results
WHERE KeepRow = 1
ORDER BY Code_events, Events_start;
September 28, 2021 at 10:01 pm
Ken McKelvey this looks familiar
https://www.sqlservercentral.com/forums/topic/need-to-create-gaps-in-a-table-of-date-ranges
Again one "brute force" way to do this would be to expand all the date ranges and use the set operator EXCEPT to remove the (also expanded) gaps. Then the results could be grouped by splitting the calculated dates into contiguous ranges. Reliable and simple (sort of) but maybe not the fastest way
with
unq_events_cte as (
select code_events, min(Events_end) min_end
from #Events
group by code_events
having year(min(Events_start))>=2020),
exp_cte as (
select code_events, v.dt
from unq_events_cte e
cross apply dbo.fnTally(0, datediff(day, '20200101', iif(e.min_end<'20201231', '20201231', e.min_end))) fn
cross apply (values (dateadd(day, fn.n, '20200101'))) v(dt)
except
select code_events, v.dt
from #Events e
cross apply dbo.fnTally(0, datediff(day, e.Events_start, e.Events_end)) fn
cross apply (values (dateadd(day, fn.n, e.Events_start))) v(dt)),
gap_cte as (
select *, case when datediff(day, dt, lag(dt, 1, 1)
over (partition by code_events
order by dt))<>-1
then 1 else 0 end gap
from exp_cte),
grp_cte as (
select *, sum(gap) over (partition by code_events order by dt) grp
from gap_cte)
select code_events,
iif(min(dt)='20200101', '20200101', dateadd(day, 1, min(dt))) Events_start,
iif(max(dt)='20201231', '20201231', dateadd(day, 1, max(dt))) Events_end,
1 Is_missing
from grp_cte
group by code_events, grp
union all
select code_events, Events_start, Events_end, 0
from #Events
order by code_events, Events_start;
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
September 29, 2021 at 1:38 am
>> I have a table called Events, the value of that table like as bellow :<<
Actually you don't have a table in your posting. By definition, a table must have a key and that means it's columns cannot all be NULL. You also don't know there's no such thing as a generic "_id" in RDBMS. It must be the identifier of something in particular, it can't be a numeric because it's always measured on a nominal scale.
You might also want to read the ISO 11179 standards and learn how to properly name columns. For example, since a column is made of rows, and each row is made of columns, you can't have plural column names. They are by their very nature, in the definition of RDBMS, scalar values.
CREAT TABLE Events
(event_code VARCHAR(10) NOT NULL
CHECK (event_code IN (..)),
event_name VARCHAR(10) NOT NULL,
event_start_date DATE NOT NULL,
event_end_date DATE,
CHECK (event_start_date <= event_end_date),
PRIMARY KEY (event_code, event_name, event_start_date)
);
Did you know we've had a DATE data type for several years now? There is no need to carry the time and in fact simply waste storage. You're still writing that old Sybase code!
Please note the use of a check clause to guarantee that your codes are drawn from a list. But it looks like what you're calling a code is not what we call a code. It looks more like the name of an event with the year attached to it. Think of an area code and phone numbe, as an example of what a code is. You also want to use a check constraint to be sure the start and end dates are in order. Notice the use of a null for the end date of an event; this means event is still ongoing and has not yet completed. You want to download a copy of temporal queries and SQL by Rick Snodgrass. It's a free PDF and the University of Arizona on their website.
Unfortunate this is still pretty bad design, as far as I can tell. This is why you wind up with a three column primary key when a unique event name would have allowed a single column. You will eventually discover that most of the work and SQL is done in the DDL, and not the DML you have to do elaborate things in the DML, it's usually because you screwed up the basic design and have to include everything.
We've had no need for the old-fashioned Sybase insertion one at a time. If you put in a table constructor, this will let the optimizer stand a chance of picking the best way to do this. If you do it as if you were still using punch cards, the SQL engine is obligated to treat them as punchcards one at a time. The semi-colon was also left out of Sybase in the early days, but now many of the SQL Server statements require it. Since it is ANSI and ISO standard. You might want to start actually writing real SQL and not dialect.
INSERT INTO Events
VALUES
('AC_83', 'Event 1', '2020-07-15', '2020-07-30'),
('AC_83', 'Event 1', '2020-09-15', '2025-04-30'),
('AC_84', 'Event 2', '2019-06-01', '2030-07-30'),
('AC_86', 'Event 3', '2020-07-15', '2022-12-15'),
('AC_87', 'Event 4', '2021-07-15', '2022-12-15');
>> I want to return a range of missing dates from the first date of 2020 and only if event_start>=2020 <<
Why are you writing SQL as if it was assembly language? We don't use bit flags in high-level languages, and in particular since oriented high-level languages. Again, this is designed failure will lead to elaborate queries and poor performance.
As below what I want to get :
code_events Events event_start event_end is_missing_flg
AC_83 Event 1 2020-07-15 2020-07-30 0
AC_83 Event 1 2020-07-30 2020-09-15 1
AC_83 Event 1 2020-09-15 2025-04-30 0
AC_84 Event 2 2019-06-01 2030-07-30 0
AC_86 Event 3 2020-01-01 2020-07-15 1
AC_86 Event 3 2020-07-15 2022-12-15 0
AC_87 Event 4 2020-01-01 2021-07-15 1
AC_87 Event 4 2021-07-15 2022-12-15 0
did you notice that the end of one of your intervals is the start of the following interval? So the event status at the intersection of these two intervals is ambiguous. Is it missing or not missing? Or is it a quantum thin so it is both?
Do you have a calendar table? Instead of actually materializing the missing data flags, you could compute them with a simple lookup. But right now your specs are too vague to make any judgments. Why don't you post specs that tell us what you're actually trying to do?.
Please post DDL and follow ANSI/ISO standards when asking for help.
September 29, 2021 at 10:27 am
this looks familiar
I think both windowed and brute force approaches have their place. I have just tidied up my code to try and make it more understandable:
WITH PrevEnds
AS
(
-- Get previous Events_end as will be needed to determine if there are missing rows
-- and also for the Events_start value of missing rows
SELECT E.code_events, E.[Events], E.Events_start, E.Events_end
,LAG(E.Events_end)
OVER (PARTITION BY E.code_events ORDER BY E.Events_start) AS Prev_Events_end
FROM Events E
)
SELECT P.code_events, P.[Events]
,CASE
WHEN X.IsMissing = 1
THEN COALESCE(P.Prev_Events_end, '2020')
ELSE P.Events_start
END AS Events_start
,CASE
WHEN X.IsMissing = 1
THEN P.Events_start
ELSE P.Events_end
END AS Events_end
,X.IsMissing
FROM PrevEnds P
-- The maximum number of missing rows = the number of original rows
CROSS APPLY (VALUES ('PossibleMissing'),('Original')) R(RowType)
-- Determine if the row is missing
CROSS APPLY
(
VALUES
(
CASE
WHEN R.RowType = 'PossibleMissing'
AND P.Events_start <> COALESCE(P.Prev_Events_end, '1900')
AND P.Events_start > '2020'
THEN 1
ELSE 0
END
)
) X (IsMissing)
WHERE R.RowType = 'Original' OR X.IsMissing = 1
ORDER BY Code_events, Events_start;
September 29, 2021 at 7:46 pm
Steve Collins wrote:this looks familiar
I think both windowed and brute force approaches have their place. I have just tidied up my code to try and make it more understandable:
It looks good. Next time I'm going to start with your code 🙂
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply