April 29, 2020 at 6:05 am
Hi all,
I need your help. This sounds like a simple problem but it's driving me round the bend.
I have two tables that are not technically joined but have a common data column (country) used across the two tables. Each table has a Start/End date.
I am trying to merge the data into a new table the values from the first two but importantly to revise the Start/End dates in the target table to reflect the valid combined data.
I really dont have any clue how to work out the valid new start/end dates in the target table from those in the original two. Hopefully the attached will bring it to life.
Apologies but I don't know where to start. Ideally it would be a stored procedure that could be run as required.
Any assistance greatly appreciated.
Mike
April 29, 2020 at 5:41 pm
I think my only question here is what do you consider a "valid" revised start/end dates? for example, if you had this:
start end
Dec 1,2020 Dec 10, 2020
Dec 12,2020 Dec 13,2020
what is the valid date? OR what if they overlap? Or what if they are DRASTICALLY different (like January 1st, 2020 and January 1st, 2000)?
If you just want the minimum and start and maximum end, then I'd be looking at the MIN and MAX functions. Depending on column names, you may need a CASE statement too. If it is just MIN and MAX, I'd first use a CTE (or nested select) to UNION the data (presuming columns match up or CAN match up) then pick the MIN and MAX dates as the start and end partitioned by Country.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
April 29, 2020 at 8:37 pm
First, most people are hesitant to open attached files from random strangers on the net. You should provide scripts using the {;} Insert/edit code sample
button just above the text box for your question. The scripts should CREATE TABLE scripts and scripts to INSERT data into said tables. Here is how your data should have been posted.
DROP TABLE IF EXISTS #COUNTRY_CURRENCY;
CREATE TABLE #COUNTRY_CURRENCY
(
id TINYINT NOT NULL PRIMARY KEY
,CountryCode CHAR(3) NOT NULL
,CurrencyCode CHAR(3) NOT NULL
,EffectiveDate DATE NOT NULL
,EndDate DATE NOT NULL
);
INSERT #COUNTRY_CURRENCY (id, CountryCode, CurrencyCode, EffectiveDate, EndDate)
VALUES
(1, 'UK', 'GBP', '1900-01-01', '9999-12-30')
,(2, 'FRA', 'FFR', '1900-01-01', '3/2/2003')
,(3, 'FRA', 'EUR', '2003-03-03', '9999-12-30')
,(4, 'USA', 'USD', '1990-01-01', '9999-12-30');
DROP TABLE IF EXISTS #COUNTRY_RULE;
CREATE TABLE #COUNTRY_RULE
(
id TINYINT NOT NULL PRIMARY KEY
,CountryCode CHAR(3) NOT NULL
,[Rule] CHAR(5) NOT NULL
,EffectiveDate DATE NOT NULL
,EndDate DATE NOT NULL
);
INSERT #COUNTRY_RULE (id, CountryCode, [Rule], EffectiveDate, EndDate)
VALUES
(1, 'UK', 'Rule1', '1900-01-01', '2018-06-14')
,(2, 'UK', 'Rule2', '2018-06-15', '9999-12-30')
,(3, 'FRA', 'Rule3', '1900-01-01', '2001-09-17')
,(4, 'FRA', 'Rule4', '2001-09-18', '9999-12-30')
,(5, 'USA', 'Rule5', '1990-01-01', '2006-07-21')
,(6, 'USA', 'Rule6', '2006-07-22', '9999-12-30');
DROP TABLE IF EXISTS #DESIRED_RESULTS;
CREATE TABLE #DESIRED_RESULTS
(
id TINYINT NOT NULL PRIMARY KEY
,CountryCode CHAR(3) NOT NULL
,CurrencyCode CHAR(3) NOT NULL
,[Rule] CHAR(5) NOT NULL
,EffectiveDate DATE NOT NULL
,EndDate DATE NOT NULL
);
INSERT #DESIRED_RESULTS (id, CountryCode, CurrencyCode, [Rule], EffectiveDate, EndDate)
VALUES
(1, 'UK', 'GBP', 'Rule1', '1900-01-01', '2018-06-14')
,(2, 'UK', 'GBP', 'Rule2', '2018-06-15', '9999-12-30')
,(3, 'FRA', 'FFR', 'Rule3', '1990-01-01', '2001-09-17')
,(4, 'FRA', 'FFR', 'Rule4', '2001-09-18', '2003-03-02')
,(5, 'FRA', 'EUR', 'Rule4', '2003-03-03', '9999-12-30')
,(6, 'USA', 'USD', 'Rule5', '1990-01-01', '2006-07-21')
,(7, 'USA', 'USD', 'Rule6', '2006-07-22', '9999-12-30');
Second, there are issues with the way that you have your data set up.
I used a variation on packing intervals to come up with a solution that matches your expected results. You may need to tweak it a bit, because I may not have adequately accounted for all of the issues with respect to converting your intervals to half-closed.
WITH COUNTRY_DATES AS
(
SELECT cc.CountryCode, d.dt
FROM #COUNTRY_CURRENCY AS cc
CROSS APPLY ( VALUEs(cc.EffectiveDate), (DATEADD(DAY, 1, cc.EndDate)) ) d(dt) -- convert closed intervals to half-closed
UNION
SELECT cr.CountryCode, d.dt
FROM #COUNTRY_RULE AS cr
CROSS APPLY ( VALUEs(cr.EffectiveDate), (DATEADD(DAY, 1, cr.EndDate)) ) d(dt) -- convert closed intervals to half-closed
)
, COUNTRY_INTERVALS AS
(
SELECT cd.CountryCode, cd.dt AS EffectiveDate, LEAD(cd.dt, 1) OVER(PARTITION BY cd.CountryCode ORDER BY cd.dt) AS EndDate
FROM COUNTRY_DATES cd
)
SELECT
ROW_NUMBER() OVER(ORDER BY cr.id, ci.EffectiveDate) AS id
,ci.CountryCode
,cc.CurrencyCode
,cr.[Rule]
,ci.EffectiveDate
,DATEADD(DAY, -1, ci.EndDate) AS EndDate -- convert half-closed intervals back to closed
FROM COUNTRY_INTERVALS ci
INNER JOIN #COUNTRY_CURRENCY AS cc
ON ci.CountryCode = cc.CountryCode
AND ci.EffectiveDate < cc.EndDate
AND cc.EffectiveDate < ci.EndDate
INNER JOIN #COUNTRY_RULE AS cr
ON ci.CountryCode = cr.CountryCode
AND ci.EffectiveDate < cr.EndDate
AND cr.EffectiveDate < ci.EndDate
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
April 30, 2020 at 6:51 am
Hi both,
Many thanks for your responses. A few apologies, I never thought of the concern about an embedded file so will provide a script in future. The data we have is sensitive so I mocked up an example but obviously got things a little wrong.
As for the null values, the data is actually stored as strings so does have null values so I can cast those to dates and set the nulls to a distant date as suggested.
I am playing about with the inner join script you provided and it is looking good.
Again, many thanks I really appreciate your help.
Mike
April 30, 2020 at 8:09 pm
>> When using intervals, you should NEVER use NULL values in the intervals, because it makes it much more complicated to do the calculations. I've replaced your NULL values with 9999-12-30.<<
I strongly disagree with this. You just have to remember that when you use a null value for the end of an interval (I am assuming using the ISO half open interval model of time), you have to coalesce it to the current timestamp. This reflects the truth as you know it at the moment. However your dummy date is very optimistic 🙂 did you really want to use it to compute an employee's time in the company?
Everyone should download a copy of Rick Snodgrass is book on temporal queries and SQL. It is a free PDF at the University of Arizona website.
Please post DDL and follow ANSI/ISO standards when asking for help.
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply