May 31, 2019 at 9:13 am
Good Morning,
Firstly my sample data
GO
/****** Object: Table [dbo].[ChargeDataTemp] Script Date: 29/05/2019 15:36:27 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[ChargeDataTemp](
[Id] [int] NOT NULL,
[AssetReference] [varchar](50) NOT NULL,
[RentType] [varchar](100) NOT NULL,
[EffectiveFromDate] [datetime] NOT NULL,
[EffectiveToDate] [datetime] NULL
) ON [PRIMARY]
GO
INSERT [dbo].[ChargeDataTemp] ([Id], [AssetReference], [RentType], [EffectiveFromDate], [EffectiveToDate]) VALUES (99062, N'16372', N'Social Rent', CAST(N'2004-08-09T00:00:00.000' AS DateTime), CAST(N'2013-03-31T00:00:00.000' AS DateTime))
INSERT [dbo].[ChargeDataTemp] ([Id], [AssetReference], [RentType], [EffectiveFromDate], [EffectiveToDate]) VALUES (103526, N'16372', N'Social Rent', CAST(N'2009-04-06T00:00:00.000' AS DateTime), CAST(N'2010-04-04T00:00:00.000' AS DateTime))
INSERT [dbo].[ChargeDataTemp] ([Id], [AssetReference], [RentType], [EffectiveFromDate], [EffectiveToDate]) VALUES (104491, N'16372', N'Social Rent', CAST(N'2018-04-02T00:00:00.000' AS DateTime), CAST(N'2019-03-31T00:00:00.000' AS DateTime))
INSERT [dbo].[ChargeDataTemp] ([Id], [AssetReference], [RentType], [EffectiveFromDate], [EffectiveToDate]) VALUES (1449185, N'16372', N'Affordable Rent', CAST(N'2018-04-02T00:00:00.000' AS DateTime), CAST(N'2019-03-31T00:00:00.000' AS DateTime))
INSERT [dbo].[ChargeDataTemp] ([Id], [AssetReference], [RentType], [EffectiveFromDate], [EffectiveToDate]) VALUES (1449190, N'16372', N'Social Rent', CAST(N'2018-06-18T00:00:00.000' AS DateTime), CAST(N'2019-03-31T00:00:00.000' AS DateTime))
What you have in the sample code and the picture below is effectively a list of charges against an asset (property). It also list the charge type.
You will see in the example below that you have one charge for Affordable and many for Social.
The issue I have is not that there are more than one charge for Social, but the fact that the majority of them tend to overlap on the EffectiveFrom and EffectiveTo dates. This is an error in some data I am being provided.
I want to highlight this to the person providing the data so using ID 104491 and 1449190 as an example for this one particular asset and charge one social rent starts on the 02/04/18 and ends on 31/03/19, but another one overlaps and starts on the 18/06/18 as well.
Only one charge can be active at any one time.
Is there a way to run a script to just return rows where there is overlap? I can not work it out.
So it needs to work per AssetReference and ChargeType
Many Thanks for your help.
May 31, 2019 at 9:45 am
You can try self join - the one I have written brings two overlapping ids
SELECT *
FROM #ChargeDataTemp AS cdt
INNER JOIN #ChargeDataTemp AS cdt2
ON cdt2.AssetReference = cdt.AssetReference
AND cdt2.RentType = cdt.RentType
WHERE cdt2.EffectiveFromDate BETWEEN cdt.EffectiveFromDate AND cdt.EffectiveToDate
AND cdt2.id <> cdt.id
May 31, 2019 at 7:33 pm
Kuznetsovโs History Table is a SQL idiom which builds a temporal chain from the current row to the previous row. This is easier to show with code:
CREATE TABLE Tasks
(task_id CHAR(5) NOT NULL,
task_score CHAR(1) NOT NULL,
previous_end_date DATE, -- null means first task
current_start_date DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,
CONSTRAINT previous_end_date_and_current_start_in_sequence
CHECK (prev_end_date <= current_start_date),
current_end_date DATE, -- null means unfinished current task
CONSTRAINT current_start_and_end_dates_in_sequence
CHECK (current_start_date <= current_end_date),
CONSTRAINT end_dates_in_sequence
CHECK (previous_end_date <> current_end_date)
PRIMARY KEY (task_id, current_start_date),
UNIQUE (task_id, previous_end_date), -- null first task
UNIQUE (task_id, current_end_date), -- one null current task
FOREIGN KEY (task_id, previous_end_date) -- self-reference
REFERENCES Tasks (task_id, current_end_date));
Well, that looks complicated! Letโs look at it column by column. Task_id explains itself. The previous_end_date will not have a value for the first task in the chain, so it is NULL-able. The current_start_date and current_end_date are the same data elements, temporal sequence and PRIMARY KEY constraints we had in the simple history table schema.
The two UNIQUE constraints will allow one NULL in their pairs of columns and prevent duplicates. Remember that UNIQUE is not like PRIMARY KEY, which implies UNIQUE NOT NULL.
Finally, the FOREIGN KEY is the real trick. Obviously, the previous task has to end when the current task started for them to abut., so there is another constraint. This constraint is a self-reference that makes sure this is true. Modifying data in this type of table is easy, but requires some thought.. Fortunately, Alex has written a Simple Talk article to explain in more detail how it is done.
Disabling Constraints
Just one little problem with that FOREIGN KEY constraint. It will not let you put the first task into the table. There is nothing for the constraint to reference. In Standard SQL, we can declare constraints to be DEFERABLE with some other options. The idea is that you can turn a constraint ON or OFF during a session so the database can be in state that would otherwise be illegal. But at the end of the session all constraints have to be TRUE. or UNKNOWN.
In SQL Server, you can disable constraints and then turn them back on. It actually is restricted to disabling FOREIGN KEY constraint, and CHECK constraints. PRIMARY KEY, UNIQUE, and DEFAULT constraints are always enforced. The syntax for this is part of the ALTER TABLE statement. The syntax is simple:
ALTER TABLE <table name> NOCHECK CONSTRAINT [<constraint name> | ALL];
This is why you want to name the constraints; without user given names, you have to look up what the system gave you and they are always long and messy.. The ALL option will disable all of the constraints in the entire schema. Be careful with it.
To re-enable, the syntax is similar and explains itself:
ALTER TABLE <table name> CHECK CONSTRAINT [<constraint name> | ALL];
When a disabled constraint is re-enabled, the database does not check to ensure any of the existing data meets the constraints. So for this table, The body of a procedure to get things started would look like this:
BEGIN
ALTER TABLE Tasks NOCHECK CONSTRAINT ALL;
INSERT INTO Tasks (task_id, task_score, current_start_date, current_end_date, previous_end_date)
VALUES (1, 'A', '2010-11-01', '2010-11-03', NULL);
ALTER TABLE Tasks CHECK CONSTRAINT ALL;
END;
Please post DDL and follow ANSI/ISO standards when asking for help.
May 31, 2019 at 8:41 pm
Perhaps this article will be of help too:
https://www.sqlservercentral.com/articles/linking-to-the-previous-row
--Vadim R.
May 31, 2019 at 9:01 pm
Since you are on SQL 2017, you can use LEAD()
or LAG()
.
SELECT *, CASE WHEN EffectiveFromDate < LAG(EffectiveToDate) OVER(PARTITION BY RentType ORDER BY EffectiveFromDate, EffectiveToDate) THEN 1 ELSE 0 END
FROM #ChargeDataTemp
ORDER BY RentType, EffectiveFromDate, EffectiveToDate
This only requires one scan of the table rather than the two that are required for a self-join.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
June 5, 2019 at 4:44 pm
How about the following, using LAG:
USE master;
GO
SET ANSI_NULLS ON;
GO
SET QUOTED_IDENTIFIER ON;
GO
CREATE TABLE #ChargeDataTemp (
Id int NOT NULL,
AssetReference varchar(50) NOT NULL,
RentType varchar(100) NOT NULL,
EffectiveFromDate datetime NOT NULL,
EffectiveToDate datetime NULL
);
INSERT INTO #ChargeDataTemp (Id, AssetReference, RentType, EffectiveFromDate, EffectiveToDate)
VALUES (99062, '16372', 'Social Rent', CAST('2004-08-09T00:00:00.000' AS datetime), CAST('2013-03-31T00:00:00.000' AS datetime)),
(103526, '16372', 'Social Rent', CAST('2009-04-06T00:00:00.000' AS datetime), CAST('2010-04-04T00:00:00.000' AS datetime)),
(104491, '16372', 'Social Rent', CAST('2018-04-02T00:00:00.000' AS datetime), CAST('2019-03-31T00:00:00.000' AS datetime)),
(1449185, '16372', 'Affordable Rent', CAST('2018-04-02T00:00:00.000' AS datetime), CAST('2019-03-31T00:00:00.000' AS datetime)),
(1449190, '16372', 'Social Rent', CAST('2018-06-18T00:00:00.000' AS datetime), CAST('2019-03-31T00:00:00.000' AS datetime));
WITH RAW_DATA AS (
SELECT
CDT.Id,
CDT.AssetReference,
CDT.RentType,
CDT.EffectiveFromDate,
CDT.EffectiveToDate,
LAG(CDT.EffectiveFromDate, 1) OVER(PARTITION BY CDT.AssetReference ORDER BY CDT.EffectiveFromDate, CDT.EffectiveToDate) AS LAG_FROM,
LAG(CDT.EffectiveToDate, 1) OVER(PARTITION BY CDT.AssetReference ORDER BY CDT.EffectiveFromDate, CDT.EffectiveToDate) AS LAG_TO
FROM #ChargeDataTemp AS CDT
)
SELECT
RD.Id,
RD.AssetReference,
RD.RentType,
RD.EffectiveFromDate,
RD.EffectiveToDate
FROM RAW_DATA AS RD
WHERE (
RD.LAG_FROM BETWEEN RD.EffectiveFromDate AND RD.EffectiveToDate
OR
RD.LAG_TO BETWEEN RD.EffectiveFromDate AND RD.EffectiveToDate
)
ORDER BY
RD.EffectiveFromDate,
RD.EffectiveToDate;
DROP TABLE #ChargeDataTemp;
Steve (aka sgmunson) ๐ ๐ ๐
Rent Servers for Income (picks and shovels strategy)
June 5, 2019 at 4:46 pm
USE master;
GO
SET ANSI_NULLS ON;
GO
SET QUOTED_IDENTIFIER ON;
GO
CREATE TABLE #ChargeDataTemp (
Id int NOT NULL,
AssetReference varchar(50) NOT NULL,
RentType varchar(100) NOT NULL,
EffectiveFromDate datetime NOT NULL,
EffectiveToDate datetime NULL
);
INSERT INTO #ChargeDataTemp (Id, AssetReference, RentType, EffectiveFromDate, EffectiveToDate)
VALUES(99062, '16372', 'Social Rent', CAST('2004-08-09T00:00:00.000' AS datetime), CAST('2013-03-31T00:00:00.000' AS datetime)),
(103526, '16372', 'Social Rent', CAST('2009-04-06T00:00:00.000' AS datetime), CAST('2010-04-04T00:00:00.000' AS datetime)),
(104491, '16372', 'Social Rent', CAST('2018-04-02T00:00:00.000' AS datetime), CAST('2019-03-31T00:00:00.000' AS datetime)),
(1449185, '16372', 'Affordable Rent', CAST('2018-04-02T00:00:00.000' AS datetime), CAST('2019-03-31T00:00:00.000' AS datetime)),
(1449190, '16372', 'Social Rent', CAST('2018-06-18T00:00:00.000' AS datetime), CAST('2019-03-31T00:00:00.000' AS datetime));
WITH RAW_DATA AS (
SELECT
CDT.Id,
CDT.AssetReference,
CDT.RentType,
CDT.EffectiveFromDate,
CDT.EffectiveToDate,
LAG(CDT.EffectiveFromDate, 1) OVER(PARTITION BY CDT.AssetReference ORDER BY CDT.EffectiveFromDate, CDT.EffectiveToDate) AS LAG_FROM,
LAG(CDT.EffectiveToDate, 1) OVER(PARTITION BY CDT.AssetReference ORDER BY CDT.EffectiveFromDate, CDT.EffectiveToDate) AS LAG_TO
FROM #ChargeDataTemp AS CDT
)
SELECT
RD.Id,
RD.AssetReference,
RD.RentType,
RD.EffectiveFromDate,
RD.EffectiveToDate
FROM RAW_DATA AS RD
WHERE (
RD.LAG_FROM BETWEEN RD.EffectiveFromDate AND RD.EffectiveToDate
OR
RD.LAG_TO BETWEEN RD.EffectiveFromDate AND RD.EffectiveToDate
)
ORDER BY
RD.EffectiveFromDate,
RD.EffectiveToDate;
DROP TABLE #ChargeDataTemp;
Looks like how one is supposed to insert code has changed with the forum software.ย ย Here's a much prettier version.
Steve (aka sgmunson) ๐ ๐ ๐
Rent Servers for Income (picks and shovels strategy)
June 7, 2019 at 10:52 am
Thanks to all that have help with this issue
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply