July 31, 2018 at 2:37 pm
My attempts failed. I need a SELECT STMT that will give me 2 rows as output. The 2 rows must have 2 spans as follows
--Desired Output for member 20160701 - 20171231
-- 20180105 - 29991231
You know what i am trying here.. I am trying to condense nearby spans that are continuous together. Hope this explains.. You may assume that the table has data for many members. But for sake of simplicity lets have data for just one buddy.
If object_id('tempdb..#DLT') IS NOT NULL DROP TABLE #DLT;
CREATE TABLE #DLT( emp VARCHAR(13), St VARCHAR(8), Et VARCHAR(8), RSN VARCHAR(100) );
INSERT INTO #DLT( emp, St, Et, RSN )
VALUES
('0010970115','20160701', '20170131', 'Member Eligibility Change'),
('0010970115','20170201', '20171130', 'Member Eligibility Change'),
('0010970115','20171201', '20171231', 'Member Eligibility Change'),
('0010970115','20180105', '20180531', 'Member Eligibility Change'),
('0010970115','20180601', '29991231', 'Member Eligibility Change')
;
With C as
(
Select emp, St, Et ,
LEAD(St) OVER(PARTITION BY Emp ORDER BY emp, St) AS NextSt
,DENSE_RANK() OVER (PARTITION BY emp ORDER BY St) AS Grp2
, DATEDIFF(D, Et , (LEAD(St) OVER(PARTITION BY Emp ORDER BY emp, St)) ) as DIFF
FROM
#DLT
)
Select *
,CASE WHEN DATEADD(D, 1, CAST( ET as DATETIME)) = CAST( NEXTST as DATETIME) THEN 'OK' ELSE NULL END as DEC
FROM
C
--Desired Output for member 20160701 - 20171231
-- 20180105 - 29991231
July 31, 2018 at 3:32 pm
There are 11 possible relationships between two intervals, requiring at least 22 rows to represent completely. You have provided less than a quarter of the minimum to adequately test anything.
Furthermore, you have another thread of this very subject where you posted just 59 minutes ago that "This is perfect". Did you read the article that I linked to in that thread? It walks through exactly what you are trying to do.
This is just another example of you moving the goal posts. Read the article that I mentioned above and post code that reflects the techniques in that article, and then I will help you if you are still having problems.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
July 31, 2018 at 3:38 pm
mw_sql_developer - Tuesday, July 31, 2018 2:37 PMMy attempts failed. I need a SELECT STMT that will give me 2 rows as output. The 2 rows must have 2 spans as follows
--Desired Output for member 20160701 - 20171231
-- 20180105 - 29991231You know what i am trying here.. I am trying to condense nearby spans that are continuous together. Hope this explains.. You may assume that the table has data for many members. But for sake of simplicity lets have data for just one buddy.
If object_id('tempdb..#DLT') IS NOT NULL DROP TABLE #DLT;
CREATE TABLE #DLT( emp VARCHAR(13), St VARCHAR(8), Et VARCHAR(8), RSN VARCHAR(100) );
INSERT INTO #DLT( emp, St, Et, RSN )
VALUES
('0010970115','20160701', '20170131', 'Member Eligibility Change'),
('0010970115','20170201', '20171130', 'Member Eligibility Change'),
('0010970115','20171201', '20171231', 'Member Eligibility Change'),('0010970115','20180105', '20180531', 'Member Eligibility Change'),
('0010970115','20180601', '29991231', 'Member Eligibility Change');
With C as
(
Select emp, St, Et ,
LEAD(St) OVER(PARTITION BY Emp ORDER BY emp, St) AS NextSt
,DENSE_RANK() OVER (PARTITION BY emp ORDER BY St) AS Grp2
, DATEDIFF(D, Et , (LEAD(St) OVER(PARTITION BY Emp ORDER BY emp, St)) ) as DIFF
FROM
#DLT
)
Select *
,CASE WHEN DATEADD(D, 1, CAST( ET as DATETIME)) = CAST( NEXTST as DATETIME) THEN 'OK' ELSE NULL END as DEC
FROM
C--Desired Output for member 20160701 - 20171231
-- 20180105 - 29991231
Does this question have anything to do with this thread: https://www.sqlservercentral.com/Forums/1977582/SQL-Syntax-HELP-on-merging-enrollment-spans#bm1979663
July 31, 2018 at 3:46 pm
A sample solution (not an efficient or "good" one but it works with the sample data you gave, but no promises it would work with "real" data, and I imagine there are better/faster ways to solve this) could be:
IF OBJECT_ID('tempdb..#DLT') IS NOT NULL
DROP TABLE [#DLT];
CREATE TABLE [#DLT]
(
[emp] VARCHAR(13)
, [St] VARCHAR(8)
, [Et] VARCHAR(8)
, [RSN] VARCHAR(100)
);
INSERT INTO [#DLT]
(
[emp]
, [St]
, [Et]
, [RSN]
)
VALUES
(
'0010970115'
, '20160701'
, '20170131'
, 'Member Eligibility Change'
)
, (
'0010970115'
, '20170201'
, '20171130'
, 'Member Eligibility Change'
)
, (
'0010970115'
, '20171201'
, '20171231'
, 'Member Eligibility Change'
)
, (
'0010970115'
, '20180105'
, '20180531'
, 'Member Eligibility Change'
)
, (
'0010970115'
, '20180601'
, '29991231'
, 'Member Eligibility Change'
);
WITH [C]
AS ( SELECT
[#DLT].[emp]
, [#DLT].[St]
, [#DLT].[Et]
, DATEDIFF(
D
, [#DLT].[Et]
, ( LEAD([#DLT].[St]) OVER ( PARTITION BY
[#DLT].[emp]
ORDER BY
[#DLT].[emp]
, [#DLT].[St]
)
)
) AS [DIFF]
, MIN([#DLT].[St]) OVER ( PARTITION BY
[#DLT].[emp]
ORDER BY
[#DLT].[St]
) AS [minStart]
FROM
[#DLT] )
, [D]
AS ( SELECT
[C].[emp]
, [C].[St]
, [C].[Et]
, [C].[DIFF]
, [C].[minStart]
, CASE
WHEN [C].[St] = [C].[minStart] THEN
[C].[St]
WHEN
(
[C].[DIFF] = 1
AND LAG([C].[DIFF]) OVER ( PARTITION BY
[C].[emp]
ORDER BY
[C].[St]
) > 1
) THEN
[C].[Et]
WHEN [C].[DIFF] = 1 THEN
NULL
WHEN [C].[DIFF] IS NULL THEN
[C].[Et]
ELSE
[C].[Et]
END AS [daterange]
FROM
[C] )
, [E]
AS ( SELECT
[D].[emp]
, [D].[daterange]
, ROW_NUMBER() OVER ( ORDER BY
[D].[emp]
) AS [rn]
, LEAD([D].[daterange]) OVER ( ORDER BY
[D].[emp]
) AS [nextDate]
FROM
[D]
WHERE
[D].[daterange] IS NOT NULL )
SELECT
[E].[emp]
, [E].[daterange] + ' - ' + [E].[nextDate]
FROM
[E]
WHERE
[E].[rn] % 2 != 0;
A lot of that is copied from the code you provided, I added the CTE D and E (which are both horribly named and I do not recommend using those names). This feels like a specialized "gap and island" problem, so D is used to determine what is a gap and what is an island while E is used to strip out the "middle" parts of the islands and get the end date (next Date should be the end date).
Now I am fairly confident that there are better more efficient ways to do this, but this method does work for the given data. There is some data that will break it (like if a range resides entirely in 1 row), but with the given sample data, it works.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
July 31, 2018 at 6:53 pm
Lynn Pettis - Tuesday, July 31, 2018 3:38 PMmw_sql_developer - Tuesday, July 31, 2018 2:37 PMMy attempts failed. I need a SELECT STMT that will give me 2 rows as output. The 2 rows must have 2 spans as follows
--Desired Output for member 20160701 - 20171231
-- 20180105 - 29991231You know what i am trying here.. I am trying to condense nearby spans that are continuous together. Hope this explains.. You may assume that the table has data for many members. But for sake of simplicity lets have data for just one buddy.
If object_id('tempdb..#DLT') IS NOT NULL DROP TABLE #DLT;
CREATE TABLE #DLT( emp VARCHAR(13), St VARCHAR(8), Et VARCHAR(8), RSN VARCHAR(100) );
INSERT INTO #DLT( emp, St, Et, RSN )
VALUES
('0010970115','20160701', '20170131', 'Member Eligibility Change'),
('0010970115','20170201', '20171130', 'Member Eligibility Change'),
('0010970115','20171201', '20171231', 'Member Eligibility Change'),('0010970115','20180105', '20180531', 'Member Eligibility Change'),
('0010970115','20180601', '29991231', 'Member Eligibility Change');
With C as
(
Select emp, St, Et ,
LEAD(St) OVER(PARTITION BY Emp ORDER BY emp, St) AS NextSt
,DENSE_RANK() OVER (PARTITION BY emp ORDER BY St) AS Grp2
, DATEDIFF(D, Et , (LEAD(St) OVER(PARTITION BY Emp ORDER BY emp, St)) ) as DIFF
FROM
#DLT
)
Select *
,CASE WHEN DATEADD(D, 1, CAST( ET as DATETIME)) = CAST( NEXTST as DATETIME) THEN 'OK' ELSE NULL END as DEC
FROM
C--Desired Output for member 20160701 - 20171231
-- 20180105 - 29991231Does this question have anything to do with this thread: https://www.sqlservercentral.com/Forums/1977582/SQL-Syntax-HELP-on-merging-enrollment-spans#bm1979663
Lynn: No, THat was actually about creating new spans. Here I am trying to make one span using 3 spans ( similar ) So if you have spans that are end to end then we can condense them to one span.
August 1, 2018 at 8:23 am
mw_sql_developer - Tuesday, July 31, 2018 6:53 PMLynn: No, THat was actually about creating new spans. Here I am trying to make one span using 3 spans ( similar ) So if you have spans that are end to end then we can condense them to one span.
I think they are the same. They both concerned with determining what makes a span a span. Knowing when a single span should be broken into multiple spans or vice versa are just different sides of the same coin.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
August 1, 2018 at 8:32 am
drew.allen - Wednesday, August 1, 2018 8:23 AMmw_sql_developer - Tuesday, July 31, 2018 6:53 PMLynn: No, THat was actually about creating new spans. Here I am trying to make one span using 3 spans ( similar ) So if you have spans that are end to end then we can condense them to one span.I think they are the same. They both concerned with determining what makes a span a span. Knowing when a single span should be broken into multiple spans or vice versa are just different sides of the same coin.
Drew
Drew: That is not true. Earlier we were taking multiple spans, looking at how they overlapped each other and came up with a new set of plans. So if you started with 3 spans you can end up generating six spans. This is different. if you have six spans and if they are end to end , then you condense it to one span.
Similar but not the same..
August 1, 2018 at 8:40 am
mw_sql_developer - Wednesday, August 1, 2018 8:32 AMdrew.allen - Wednesday, August 1, 2018 8:23 AMmw_sql_developer - Tuesday, July 31, 2018 6:53 PMLynn: No, THat was actually about creating new spans. Here I am trying to make one span using 3 spans ( similar ) So if you have spans that are end to end then we can condense them to one span.I think they are the same. They both concerned with determining what makes a span a span. Knowing when a single span should be broken into multiple spans or vice versa are just different sides of the same coin.
Drew
Drew: That is not true. Earlier we were taking multiple spans, looking at how they overlapped each other and came up with a new set of plans. So if you started with 3 spans you can end up generating six spans. This is different. if you have six spans and if they are end to end , then you condense it to one span.
Similar but not the same..
Actually, very similar to each other. Check the articles on this site regarding gaps and islands.
August 1, 2018 at 10:07 am
Lynn Pettis - Wednesday, August 1, 2018 8:40 AMmw_sql_developer - Wednesday, August 1, 2018 8:32 AMdrew.allen - Wednesday, August 1, 2018 8:23 AMmw_sql_developer - Tuesday, July 31, 2018 6:53 PMLynn: No, THat was actually about creating new spans. Here I am trying to make one span using 3 spans ( similar ) So if you have spans that are end to end then we can condense them to one span.I think they are the same. They both concerned with determining what makes a span a span. Knowing when a single span should be broken into multiple spans or vice versa are just different sides of the same coin.
Drew
Drew: That is not true. Earlier we were taking multiple spans, looking at how they overlapped each other and came up with a new set of plans. So if you started with 3 spans you can end up generating six spans. This is different. if you have six spans and if they are end to end , then you condense it to one span.
Similar but not the same..
Actually, very similar to each other. Check the articles on this site regarding gaps and islands.
Ok, I got it very close, I just need one little help writing the last SELECT stmt. I am sure it might just click you like that... Just run the SQL and you will know what I mean ...
Good Luck ...
If object_id('tempdb..#DLT') IS NOT NULL DROP TABLE #DLT;
CREATE TABLE #DLT( emp VARCHAR(13), St VARCHAR(8), Et VARCHAR(8), RSN VARCHAR(100) );
INSERT INTO #DLT( emp, St, Et, RSN )
VALUES
('0010970115','20160701', '20170131', 'Member Eligibility Change'),
('0010970115','20170201', '20171130', 'Member Eligibility Change'),
('0010970115','20171201', '20171231', 'Member Eligibility Change'),
('0010970115','20180105', '20180531', 'Member Eligibility Change'),
('0010970115','20180601', '29991231', 'Member Eligibility Change')
;
With Step1 as
(
Select emp, St, Et
,LEAD(St) OVER(PARTITION BY Emp ORDER BY emp, St) AS NextSt
,ROW_NUMBER() OVER (PARTITION BY emp ORDER BY St) AS THE_ORDER
,DATEDIFF(D, Et , (LEAD(St) OVER(PARTITION BY Emp ORDER BY emp, St)) ) as DIFF -- Gap in days
FROM
#DLT
)
,
Step2 as
(
Select *
,CASE WHEN DATEADD(D, 1, CAST( ET as DATETIME)) = CAST( NEXTST as DATETIME) THEN 'OK' ELSE NULL END as DEC
,LAG(DIFF) OVER(PARTITION BY Emp ORDER BY emp, St) AS PreDiff
,LEAD(DIFF) OVER(PARTITION BY Emp ORDER BY emp, St) AS PostDiff
FROM
Step1
)
,
Step3 as
(
Select
*
,
CASE WHEN (PreDiff is NULL ) OR (PreDiff <> 1 ) THEN
'START_OF_SPAN'
ELSE
CASE WHEN DEC IS NULL THEN 'END_OF_SPAN' ELSE
'YOU ARE INSIDE A SPAN'
END
END as FINAL_DECISION
FROM Step2
)
Select
*
FROM Step2
August 1, 2018 at 11:13 am
Was my solution no good? It seemed to work with the data you provided...
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
August 1, 2018 at 11:14 am
bmg002 - Wednesday, August 1, 2018 11:13 AMWas my solution no good? It seemed to work with the data you provided...
That was good, but now that I also learnt something I need to finish what i started, so some help will be great
August 1, 2018 at 11:27 am
mw_sql_developer - Wednesday, August 1, 2018 11:14 AMbmg002 - Wednesday, August 1, 2018 11:13 AMWas my solution no good? It seemed to work with the data you provided...That was good, but now that I also learnt something I need to finish what i started, so some help will be great
What was missing in that solution? I don't believe it was the most optimal solution, but it did provide you the resulting data you were looking for from the provided data set.
Was there something confusing in it or that you didn't understand why I did a thing? If so, I can try to explain.
There are some things to watch out for with my solution though. I would recommend building up all the possible edge cases that can come up and put them into your sample data. For example, if a customer only has 1 row of data, my solution breaks. Or if the "island" exists entirely in 1 row, the solution breaks. Or if the start date and end date columns can be NULL it'll create problems.
EDIT - whoops... just noticed my code had a typo... working on a fix... it should be taking the st value on the start of a new island not the et value.
EDIT 2 - corrected code (also handles the bug I mentioned above):
IF OBJECT_ID('tempdb..#DLT') IS NOT NULL
DROP TABLE [#DLT];
CREATE TABLE [#DLT]
(
[emp] VARCHAR(13)
, [St] VARCHAR(8)
, [Et] VARCHAR(8)
, [RSN] VARCHAR(100)
);
INSERT INTO [#DLT]
(
[emp]
, [St]
, [Et]
, [RSN]
)
VALUES
(
'0010970115'
, '20160701'
, '20170131'
, 'Member Eligibility Change'
)
, (
'0010970115'
, '20170201'
, '20171130'
, 'Member Eligibility Change'
)
, (
'0010970115'
, '20171201'
, '20171231'
, 'Member Eligibility Change'
)
, (
'0010970115'
, '20180105'
, '20180531'
, 'Member Eligibility Change'
)
, (
'0010970115'
, '20180601'
, '20991231'
, 'Member Eligibility Change'
)
, (
'0010970115'
, '22030101'
, '29991231'
, 'Member Eligibility Change'
);
WITH [C]
AS ( SELECT
[#DLT].[emp]
, [#DLT].[St]
, [#DLT].[Et]
, DATEDIFF(
D
, [#DLT].[Et]
, ( LEAD([#DLT].[St]) OVER ( PARTITION BY
[#DLT].[emp]
ORDER BY
[#DLT].[emp]
, [#DLT].[St]
)
)
) AS [DIFF]
, MIN([#DLT].[St]) OVER ( PARTITION BY
[#DLT].[emp]
ORDER BY
[#DLT].[St]
) AS [minStart]
FROM
[#DLT] )
, [D]
AS ( SELECT
[C].[emp]
, [C].[St]
, [C].[Et]
, [C].[DIFF]
, [C].[minStart]
, CASE
WHEN [C].[St] = [C].[minStart] THEN
[C].[St]
WHEN
(
[C].[DIFF] = 1
AND LAG([C].[DIFF]) OVER ( PARTITION BY
[C].[emp]
ORDER BY
[C].[St]
) > 1
) THEN
[C].[St]
WHEN [C].[DIFF] = 1 THEN
NULL
WHEN [C].[DIFF] IS NULL THEN
[C].[Et]
ELSE
[C].[Et]
END AS [daterange]
FROM
[C] )
, [E]
AS ( SELECT
[D].[St]
, [D].[emp]
, [D].[daterange]
, ROW_NUMBER() OVER ( ORDER BY
[D].[emp]
) AS [rn]
, LEAD([D].[daterange]) OVER ( ORDER BY
[D].[emp]
) AS [nextDate]
FROM
[D]
WHERE
[D].[daterange] IS NOT NULL )
SELECT
[E].[emp]
, ISNULL(
[E].[daterange] + ' - ' + [E].[nextDate]
, [E].[St] + ' - ' + [E].[daterange]
)
FROM
[E]
WHERE
[E].[rn] % 2 != 0;
I adjusted the data set to show that it can handle a single row island and fixed the bug where it was picking the wrong date for the start of a new island.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
August 1, 2018 at 11:57 am
Done! Finally!
I tested against single spans as well... Works beautifully
If object_id('tempdb..#DLT') IS NOT NULL DROP TABLE #DLT;
CREATE TABLE #DLT( emp VARCHAR(13), St VARCHAR(8), Et VARCHAR(8), RSN VARCHAR(100) );
INSERT INTO #DLT( emp, St, Et, RSN )
VALUES
('0010970116','20160701', '20170131', 'Member Eligibility Change'),
('0010970115','20160701', '20170131', 'Member Eligibility Change'),
('0010970115','20170201', '20171130', 'Member Eligibility Change'),
('0010970115','20171201', '20171231', 'Member Eligibility Change'),
('0010970115','20180105', '20180531', 'Member Eligibility Change'),
('0010970115','20180601', '29991231', 'Member Eligibility Change')
;
With Step1 as
(
Select emp, St, Et
,LEAD(St) OVER(PARTITION BY Emp ORDER BY emp, St) AS NextSt
,ROW_NUMBER() OVER (PARTITION BY emp ORDER BY St) AS THE_ORDER
,DATEDIFF(D, Et , (LEAD(St) OVER(PARTITION BY Emp ORDER BY emp, St)) ) as DIFF -- Gap in days
FROM
#DLT
)
,
Step2 as
(
Select *
,CASE WHEN DATEADD(D, 1, CAST( ET as DATETIME)) = CAST( NEXTST as DATETIME) THEN 'OK' ELSE NULL END as DEC
,LAG(DIFF) OVER(PARTITION BY Emp ORDER BY emp, St) AS PreDiff
,LEAD(DIFF) OVER(PARTITION BY Emp ORDER BY emp, St) AS PostDiff
FROM
Step1
)
,
Step3 as
(
Select
*
,
CASE WHEN (PreDiff is NULL ) OR (PreDiff <> 1 ) THEN
'START_OF_SPAN'
ELSE
CASE WHEN DEC IS NULL THEN 'END_OF_SPAN' ELSE
'YOU ARE INSIDE A SPAN'
END
END as FINAL_DECISION
FROM Step2
)
,
Step4 as
(
Select
*
FROM Step3 WHERE FINAL_DECISION = 'START_OF_SPAN'
)
,
Step5 as
(
Select
*
FROM Step3 WHERE FINAL_DECISION = 'END_OF_SPAN'
),
Step6 as
(
Select A.EMP, A.St,
CASE WHEN D.Et IS NULL THEN A.ET ELSE D.Et END as ENDDT
FROM Step4 A
OUTER APPLY
(
Select top 1 * FROM Step5
WHERE
emp = A.emp
AND
THE_ORDER >= A.THE_ORDER
ORDER BY THE_ORDER
)D
)
Select * FROM Step6
August 1, 2018 at 12:51 pm
mw_sql_developer - Wednesday, August 1, 2018 11:57 AMDone! Finally!
I tested against single spans as well... Works beautifully
If object_id('tempdb..#DLT') IS NOT NULL DROP TABLE #DLT;CREATE TABLE #DLT( emp VARCHAR(13), St VARCHAR(8), Et VARCHAR(8), RSN VARCHAR(100) );
INSERT INTO #DLT( emp, St, Et, RSN )
VALUES
('0010970116','20160701', '20170131', 'Member Eligibility Change'),
('0010970115','20160701', '20170131', 'Member Eligibility Change'),
('0010970115','20170201', '20171130', 'Member Eligibility Change'),
('0010970115','20171201', '20171231', 'Member Eligibility Change'),
('0010970115','20180105', '20180531', 'Member Eligibility Change'),
('0010970115','20180601', '29991231', 'Member Eligibility Change');
With Step1 as
(
Select emp, St, Et
,LEAD(St) OVER(PARTITION BY Emp ORDER BY emp, St) AS NextSt
,ROW_NUMBER() OVER (PARTITION BY emp ORDER BY St) AS THE_ORDER
,DATEDIFF(D, Et , (LEAD(St) OVER(PARTITION BY Emp ORDER BY emp, St)) ) as DIFF -- Gap in days
FROM
#DLT
)
,
Step2 as
(
Select *
,CASE WHEN DATEADD(D, 1, CAST( ET as DATETIME)) = CAST( NEXTST as DATETIME) THEN 'OK' ELSE NULL END as DEC
,LAG(DIFF) OVER(PARTITION BY Emp ORDER BY emp, St) AS PreDiff
,LEAD(DIFF) OVER(PARTITION BY Emp ORDER BY emp, St) AS PostDiff
FROM
Step1
)
,
Step3 as
(
Select
*
,
CASE WHEN (PreDiff is NULL ) OR (PreDiff <> 1 ) THEN
'START_OF_SPAN'
ELSE
CASE WHEN DEC IS NULL THEN 'END_OF_SPAN' ELSE
'YOU ARE INSIDE A SPAN'
END
END as FINAL_DECISIONFROM Step2
)
,
Step4 as
(
Select
*
FROM Step3 WHERE FINAL_DECISION = 'START_OF_SPAN'
)
,
Step5 as
(
Select
*
FROM Step3 WHERE FINAL_DECISION = 'END_OF_SPAN'
),
Step6 as
(
Select A.EMP, A.St,
CASE WHEN D.Et IS NULL THEN A.ET ELSE D.Et END as ENDDT
FROM Step4 A
OUTER APPLY
(
Select top 1 * FROM Step5
WHERE
emp = A.emp
AND
THE_ORDER >= A.THE_ORDER
ORDER BY THE_ORDER
)D)
Select * FROM Step6
Discovered a bug in my version of this - a single row customer doesn't work and my output was in the form "<start date> - <end date>" which is what I thought you wanted based on your original post.
Looking through your code, there is a little bit of tidying up you could do, no? Like are you using "PostDiff"? and you could likely remove steps 4, 5 and 6 if you change your final select to:
Select A.EMP, A.St,
CASE WHEN D.Et IS NULL THEN A.ET ELSE D.Et END as ENDDT
FROM Step3 A
OUTER APPLY
(
Select top 1 * FROM Step3 B
WHERE
emp = A.emp
AND
THE_ORDER >= A.THE_ORDER
AND .[Final_decision] = 'END_OF_SPAN'
ORDER BY THE_ORDER
)D
WHERE [A].[FINAL_DECISION] = 'START_OF_SPAN'
But now it is starting to feel like semantics without adding much to this conversation.
To make my code work with your new data set and give the output you have, CTE D needs to be changed to:
[D]
AS ( SELECT
[C].[emp]
, [C].[St]
, [C].[Et]
, [C].[DIFF]
, [C].[minStart]
, CASE
WHEN ([C].[St] = [C].[minStart] AND DIFF IS NULL) THEN
[C].[Et]
WHEN [C].[St] = [C].[minStart] THEN
[C].[St]
WHEN
(
[C].[DIFF] = 1
AND LAG([C].[DIFF]) OVER ( PARTITION BY
[C].[emp]
ORDER BY
[C].[St]
) > 1
) THEN
[C].[St]
WHEN [C].[DIFF] = 1 THEN
NULL
WHEN [C].[DIFF] IS NULL THEN
[C].[Et]
ELSE
[C].[Et]
END AS [daterange]
FROM
[C] )
and the final select needs to be changed to:
SELECT
[E].[emp]
, CASE
WHEN nextDate IS NULL
THEN [e].[St]
ELSE
[E].[daterange]
END AS st
, CASE WHEN nextDate IS NULL
THEN [E].daterange
ELSE
[E].[nextdate]
END AS enddt
FROM
[E]
WHERE
[E].[rn] % 2 != 0;
Not trying to pick apart your code; just seeing ways to make it a bit shorter. It just felt to me like the problem should be solvable with fewer CTEs. Even my version with 3 (poorly named) CTE's feels a bit large for this problem... Plus i have some unused columns in my CTE's. In CTE D, I only need to pull in St, and emp plus the calculated stuff. I don't need to pull in the other columns from C; they are just used for calculations. So my version could be optimized a bit as well.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
August 1, 2018 at 1:00 pm
bmg002 - Wednesday, August 1, 2018 12:51 PMmw_sql_developer - Wednesday, August 1, 2018 11:57 AMDone! Finally!
I tested against single spans as well... Works beautifully
If object_id('tempdb..#DLT') IS NOT NULL DROP TABLE #DLT;CREATE TABLE #DLT( emp VARCHAR(13), St VARCHAR(8), Et VARCHAR(8), RSN VARCHAR(100) );
INSERT INTO #DLT( emp, St, Et, RSN )
VALUES
('0010970116','20160701', '20170131', 'Member Eligibility Change'),
('0010970115','20160701', '20170131', 'Member Eligibility Change'),
('0010970115','20170201', '20171130', 'Member Eligibility Change'),
('0010970115','20171201', '20171231', 'Member Eligibility Change'),
('0010970115','20180105', '20180531', 'Member Eligibility Change'),
('0010970115','20180601', '29991231', 'Member Eligibility Change');
With Step1 as
(
Select emp, St, Et
,LEAD(St) OVER(PARTITION BY Emp ORDER BY emp, St) AS NextSt
,ROW_NUMBER() OVER (PARTITION BY emp ORDER BY St) AS THE_ORDER
,DATEDIFF(D, Et , (LEAD(St) OVER(PARTITION BY Emp ORDER BY emp, St)) ) as DIFF -- Gap in days
FROM
#DLT
)
,
Step2 as
(
Select *
,CASE WHEN DATEADD(D, 1, CAST( ET as DATETIME)) = CAST( NEXTST as DATETIME) THEN 'OK' ELSE NULL END as DEC
,LAG(DIFF) OVER(PARTITION BY Emp ORDER BY emp, St) AS PreDiff
,LEAD(DIFF) OVER(PARTITION BY Emp ORDER BY emp, St) AS PostDiff
FROM
Step1
)
,
Step3 as
(
Select
*
,
CASE WHEN (PreDiff is NULL ) OR (PreDiff <> 1 ) THEN
'START_OF_SPAN'
ELSE
CASE WHEN DEC IS NULL THEN 'END_OF_SPAN' ELSE
'YOU ARE INSIDE A SPAN'
END
END as FINAL_DECISIONFROM Step2
)
,
Step4 as
(
Select
*
FROM Step3 WHERE FINAL_DECISION = 'START_OF_SPAN'
)
,
Step5 as
(
Select
*
FROM Step3 WHERE FINAL_DECISION = 'END_OF_SPAN'
),
Step6 as
(
Select A.EMP, A.St,
CASE WHEN D.Et IS NULL THEN A.ET ELSE D.Et END as ENDDT
FROM Step4 A
OUTER APPLY
(
Select top 1 * FROM Step5
WHERE
emp = A.emp
AND
THE_ORDER >= A.THE_ORDER
ORDER BY THE_ORDER
)D)
Select * FROM Step6Discovered a bug in my version of this - a single row customer doesn't work and my output was in the form "<start date> - <end date>" which is what I thought you wanted based on your original post.
Looking through your code, there is a little bit of tidying up you could do, no? Like are you using "PostDiff"? and you could likely remove steps 4, 5 and 6 if you change your final select to:
Select A.EMP, A.St,
CASE WHEN D.Et IS NULL THEN A.ET ELSE D.Et END as ENDDT
FROM Step3 A
OUTER APPLY
(
Select top 1 * FROM Step3 B
WHERE
emp = A.emp
AND
THE_ORDER >= A.THE_ORDER
AND .[Final_decision] = 'END_OF_SPAN'
ORDER BY THE_ORDER
)D
WHERE [A].[FINAL_DECISION] = 'START_OF_SPAN'
But now it is starting to feel like semantics without adding much to this conversation.
To make my code work with your new data set and give the output you have, CTE D needs to be changed to:
[D]
AS ( SELECT
[C].[emp]
, [C].[St]
, [C].[Et]
, [C].[DIFF]
, [C].[minStart]
, CASE
WHEN ([C].[St] = [C].[minStart] AND DIFF IS NULL) THEN
[C].[Et]
WHEN [C].[St] = [C].[minStart] THEN
[C].[St]
WHEN
(
[C].[DIFF] = 1
AND LAG([C].[DIFF]) OVER ( PARTITION BY
[C].[emp]
ORDER BY
[C].[St]
) > 1
) THEN
[C].[St]
WHEN [C].[DIFF] = 1 THEN
NULL
WHEN [C].[DIFF] IS NULL THEN
[C].[Et]
ELSE
[C].[Et]
END AS [daterange]
FROM
[C] )and the final select needs to be changed to:
SELECT
[E].[emp]
, CASE
WHEN nextDate IS NULL
THEN [e].[St]
ELSE
[E].[daterange]
END AS st
, CASE WHEN nextDate IS NULL
THEN [E].daterange
ELSE
[E].[nextdate]
END AS enddt
FROM
[E]
WHERE
[E].[rn] % 2 != 0;Not trying to pick apart your code; just seeing ways to make it a bit shorter. It just felt to me like the problem should be solvable with fewer CTEs. Even my version with 3 (poorly named) CTE's feels a bit large for this problem... Plus i have some unused columns in my CTE's. In CTE D, I only need to pull in St, and emp plus the calculated stuff. I don't need to pull in the other columns from C; they are just used for calculations. So my version could be optimized a bit as well.
I purposely made it to have many CTEs. I could have done it in a few .. Just for clarity I like to have many CTEs. BTW - It does handle single spans
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply