May 21, 2018 at 9:20 am
I know how to do this with a loop or cursor, but wondered if there was a brilliant mind out there that could suggest a set-based operation?
This is some example data:
I'm working with the first three columns in a CTE and I'm trying to get the result in the green column to return as part of the set.
Basically, where the code and type are the same on consecutive days, the first day is the first date of those days.
So, code 0242628 has SICK20 running consecutively from 3rd May to 5th May and so the first day is 3rd May. There is then a break (6th, 7th and 8th May), before another set runs from 9th May to 12th May, so first date is the 9th.
I cannot figure out a set-based way to do this.
May 21, 2018 at 9:40 am
mmcardle - Monday, May 21, 2018 9:20 AMI know how to do this with a loop or cursor, but wondered if there was a brilliant mind out there that could suggest a set-based operation?This is some example data:
I'm working with the first three columns in a CTE and I'm trying to get the result in the green column to return as part of the set.
Basically, where the code and type are the same on consecutive days, the first day is the first date of those days.
So, code 0242628 has SICK20 running consecutively from 3rd May to 5th May and so the first day is 3rd May. There is then a break (6th, 7th and 8th May), before another set runs from 9th May to 12th May, so first date is the 9th.
I cannot figure out a set-based way to do this.
You've been here long enough to know that you should present your sample data in a consumable format.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
May 21, 2018 at 9:53 am
looks like a gaps and islands problem to me.
You might find it easier with a Date Tally Table too.
You could RANK() the data and then take the RANK value away from the date. All records with the same calculated date are consecutive days. (NOTE this won't work if you are ignoring weekends/bank holidays)
you could LEFT join the data to itself using a DATEADD() offset to link today and yesterday, then RANK the joined dates to find the 1st rank dates.
May 21, 2018 at 10:02 am
Phil Parkin - Monday, May 21, 2018 9:40 AMYou've been here long enough to know that you should present your sample data in a consumable format.
Sorry Phil, I was rushing and the first attempt at copying and pasting from Excel went awry.
OK, the data can be had from this:
DECLARE @tabData TABLE
(
[Code] varchar(7),
[Date] date,
[Type] varchar(20)
);
INSERT INTO @tabData([Code], [Date], [Type])
SELECT '0242589', '2018-04-20', 'HOLHRSAPR'
UNION
SELECT '0242597', '2018-04-19', 'HOLHRSAPR'
UNION
SELECT '0242628', '2018-04-02', 'HOLHRSAPR'
UNION
SELECT '0242628', '2018-04-03', 'SICK20'
UNION
SELECT '0242628', '2018-04-04', 'SICK20'
UNION
SELECT '0242628', '2018-04-05', 'SICK20'
UNION
SELECT '0242628', '2018-04-09', 'SICK20'
UNION
SELECT '0242628', '2018-04-10', 'SICK20'
UNION
SELECT '0242628', '2018-04-11', 'SICK20'
UNION
SELECT '0242628', '2018-04-12', 'SICK20'
UNION
SELECT '0242640', '2018-04-03', 'HOLHRSAPR'
UNION
SELECT '0242640', '2018-04-09', 'HOLHRSAPR';
SELECT [Code], [Date], [Type], '?' AS [First Day]
FROM @tabData;
May 21, 2018 at 11:11 am
Hope this helps:
DECLARE @tabData TABLE
(
[Code] varchar(7),
[Date] date,
[Type] varchar(20)
);
INSERT INTO @tabData([Code], [Date], [Type])
SELECT '0242589', '2018-04-20', 'HOLHRSAPR'
UNION
SELECT '0242597', '2018-04-19', 'HOLHRSAPR'
UNION
SELECT '0242628', '2018-04-02', 'HOLHRSAPR'
UNION
SELECT '0242628', '2018-04-03', 'SICK20'
UNION
SELECT '0242628', '2018-04-04', 'SICK20'
UNION
SELECT '0242628', '2018-04-05', 'SICK20'
UNION
SELECT '0242628', '2018-04-09', 'SICK20'
UNION
SELECT '0242628', '2018-04-10', 'SICK20'
UNION
SELECT '0242628', '2018-04-11', 'SICK20'
UNION
SELECT '0242628', '2018-04-12', 'SICK20'
UNION
SELECT '0242640', '2018-04-03', 'HOLHRSAPR'
UNION
SELECT '0242640', '2018-04-09', 'HOLHRSAPR';
WITH base AS (
SELECT
.[Code]
,
.[Date]
,
.[Type]
, '?' AS [First Day]
, rn = ROW_NUMBER() OVER (PARTITION BY
.[Code],
.[Type] ORDER BY
.[Date])
-- , [td2].*
FROM
@tabData td
-- LEFT OUTER JOIN @tabData AS [td2]
-- ON [td2].[Code] =
.[Code] AND [td2].[Type] =
.[Type] AND DATEADD(DAY, 1,[td2].[Date]) =
.[Date]
)
SELECT
.[Code]
, .[Date]
, .[Type]
, [First Day] = MIN(.[Date]) OVER (PARTITION BY .[Code], .[Type], DATEADD(DAY,-.[rn],.[Date]))
, .[rn]
, [grpDate] = DATEADD(DAY,-.[rn],.[Date])
FROM
[base] ;
GO
May 21, 2018 at 11:20 am
If you look up dim date table, it can have this field in it or you can create it. This table can be created and used in a number of projects. it can be created in a few minutes and will hold all the dates you will ever need. And will emirate the need for most formulas, You will just query the table for where the date matches the date in this table.
May 21, 2018 at 11:42 am
itmasterw 60042 - Monday, May 21, 2018 11:20 AMIf you look up dim date table, it can have this field in it or you can create it. This table can be created and used in a number of projects. it can be created in a few minutes and will hold all the dates you will ever need. And will emirate the need for most formulas, You will just query the table for where the date matches the date in this table.
Care to demonstrate how this would work in this situation?
May 21, 2018 at 11:45 am
Lynn Pettis - Monday, May 21, 2018 11:42 AMitmasterw 60042 - Monday, May 21, 2018 11:20 AMIf you look up dim date table, it can have this field in it or you can create it. This table can be created and used in a number of projects. it can be created in a few minutes and will hold all the dates you will ever need. And will emirate the need for most formulas, You will just query the table for where the date matches the date in this table.Care to demonstrate how this would work in this situation?
+1, exactly what I was thinking.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
May 21, 2018 at 12:35 pm
Here's an example using just table joins:
;WITH TabData (Code, Date, Type)
AS
(
SELECT
Code, Date, type
FROM
@tabData
)
SELECT
Full_Data.Code,
Full_Data.Date,
Full_Data.type,
Max(Series_Start.Date) AS Key_Date
FROM
TabData Full_Data
Join
TabData Series_Start
LEFT JOIN
TabData Previous_Date
ON
Series_Start.Code = Previous_Date.Code
AND Series_Start.Type = Previous_Date.Type
AND Series_Start.Date = DateAdd(Day, 1, Previous_Date.Date)
ON
Full_Data.Code = Series_Start.Code
AND Full_Data.Type = Series_Start.Type
AND Full_Data.Date >= Series_Start.Date
AND Previous_Date.Date IS Null
GROUP BY
Full_Data.Code,
Full_Data.Date,
Full_Data.type
ORDER BY
Full_Data.Code,
Full_Data.Date
May 21, 2018 at 12:37 pm
sestell1 - Monday, May 21, 2018 12:35 PMHere's an example using just table joins:
;WITH TabData (Code, Date, Type)
AS
(
SELECT
Code, Date, type
FROM
@tabData
)
SELECT
Full_Data.Code,
Full_Data.Date,
Full_Data.type,
Max(Series_Start.Date) AS Key_Date
FROM
TabData Full_Data
Join
TabData Series_Start
LEFT JOIN
TabData Previous_Date
ON
Series_Start.Code = Previous_Date.Code
AND Series_Start.Type = Previous_Date.Type
AND Series_Start.Date = Previous_Date.Date + 1
ON
Full_Data.Code = Series_Start.Code
AND Full_Data.Type = Series_Start.Type
AND Full_Data.Date >= Series_Start.Date
AND Previous_Date.Date IS Null
GROUP BY
Full_Data.Code,
Full_Data.Date,
Full_Data.type
ORDER BY
Full_Data.Code,
Full_Data.Date
Your code does not run.
Msg 206, Level 16, State 2, Line 56
Operand type clash: date is incompatible with int
May 21, 2018 at 1:37 pm
Well with out putting this together it is hard to fully demonstrate. But unless I am misunderstanding the request, you should be able to do an inner join between his table and the dime data table and return the FirstDayOfMonth. So if his table DateTb, for example then
he would just do something like this:
SELECT FirstDayOfMonth
From DimDate dt INNER JOIN DateTB db
ON dt.date = db.Date
May 21, 2018 at 1:40 pm
itmasterw 60042 - Monday, May 21, 2018 1:37 PMWell with out putting this together it is hard to fully demonstrate. But unless I am misunderstanding the request, you should be able to do an inner join between his table and the dime data table and return the FirstDayOfMonth. So if his table DateTb, for example then
he would just do something like this:
SELECT FirstDayOfMonth
From DimDate dt INNER JOIN DateTB db
ON dt.date = db.Date
But that isn't what the OP is looking for in this case. Go back at look at the original post.
May 21, 2018 at 2:57 pm
Lynn Pettis - Monday, May 21, 2018 12:37 PMsestell1 - Monday, May 21, 2018 12:35 PMHere's an example using just table joins:
;WITH TabData (Code, Date, Type)
AS
(
SELECT
Code, Date, type
FROM
@tabData
)
SELECT
Full_Data.Code,
Full_Data.Date,
Full_Data.type,
Max(Series_Start.Date) AS Key_Date
FROM
TabData Full_Data
Join
TabData Series_Start
LEFT JOIN
TabData Previous_Date
ON
Series_Start.Code = Previous_Date.Code
AND Series_Start.Type = Previous_Date.Type
AND Series_Start.Date = Previous_Date.Date + 1
ON
Full_Data.Code = Series_Start.Code
AND Full_Data.Type = Series_Start.Type
AND Full_Data.Date >= Series_Start.Date
AND Previous_Date.Date IS Null
GROUP BY
Full_Data.Code,
Full_Data.Date,
Full_Data.type
ORDER BY
Full_Data.Code,
Full_Data.DateYour code does not run.
Msg 206, Level 16, State 2, Line 56
Operand type clash: date is incompatible with int
Oh sorry, I was lazy and changed the date column data type to DateTime so I could subtract 1 rather than use DateAdd. :Whistling:
I've updated the code above to use DateAdd.
May 22, 2018 at 3:41 am
The following 2 links are solution to the Packing Intervals Problem by Itzik Ben-Gan.
2011/04/13 ... http://blogs.solidq.com/en/sqlserver/packing-intervals/
2015/08/11 ... http://sqlmag.com/sql-server/new-solution-packing-intervals-problem
May 22, 2018 at 4:12 am
IF OBJECT_ID('tempdb..#tabData') IS NOT NULL DROP TABLE #tabData;
CREATE TABLE #tabData ([Code] varchar(7),[Date] date,[Type] varchar(20));
INSERT INTO #tabData ([Code], [Date], [Type])
VALUES
('0242589', '2018-04-20', 'HOLHRSAPR'),
('0242597', '2018-04-19', 'HOLHRSAPR'),
('0242628', '2018-04-02', 'HOLHRSAPR'),
('0242628', '2018-04-03', 'SICK20'),
('0242628', '2018-04-04', 'SICK20'),
('0242628', '2018-04-05', 'SICK20'),
('0242628', '2018-04-09', 'SICK20'),
('0242628', '2018-04-10', 'SICK20'),
('0242628', '2018-04-11', 'SICK20'),
('0242628', '2018-04-12', 'SICK20'),
('0242640', '2018-04-03', 'HOLHRSAPR'),
('0242640', '2018-04-09', 'HOLHRSAPR');
CREATE UNIQUE CLUSTERED INDEX ucx_Stuff ON #tabData ([Code], [Type], [Date]);
SELECT [Code], [Date], [Type],
[First Day] = MIN([Date]) OVER(PARTITION BY [Code], [Type], grp)
FROM (
SELECT [Code], [Date], [Type],
grp = DATEADD(DAY,1-ROW_NUMBER() OVER(PARTITION BY [Code], [Type] ORDER BY [Date]),[Date])
FROM #tabData
) d;
EDIT: Exactly the same method as Lynn ๐
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply