January 14, 2014 at 11:22 am
I'm hoping that someone has some ninja date analysis skills that can lend some suggestions on this problem I'm working on. I have a table of employee assignments that I'm narrowing down to a specific group. Employees have multiple assignments (job positions essentially) and each has start and end dates. I need to identify the date range(s) that are covered by the assignments in my data set. Here's an example of data for one person ID. NULL in the end_date just means it is an active assignment. The pos_id column isn't necessary, but it helps define the data I'm looking at. The end result won't use it.
ID   COMPANY   POS_ID   START_DATE   END_DATE
999  1         1        9/2/2011     9/9/2012
999  1         1        9/10/2012    9/10/2012
999  1         1        9/11/2012    9/11/2012
999  1         1        9/12/2012    6/2/2013
999  1         1        6/3/2013     6/30/2013
999  1         1        7/1/2013     8/31/2013
999  1         1        9/1/2013     NULL
999  1         2        9/1/2012     9/10/2012
999  1         2        9/11/2012    9/11/2012
999  1         2        9/12/2012    6/30/2013
999  1         2        7/1/2013     8/31/2013
999  1         2        9/1/2013     9/1/2013
999  1         2        1/6/2014     NULL
999  1         3        9/2/2011     9/9/2012
999  1         3        9/10/2012    9/10/2012
999  1         3        9/11/2012    11/28/2012
999  1         3        11/29/2012   11/29/2012
999  1         3        9/1/2013     NULL
In this case I want the results to say that ID 999 has a range from 9/2/2011 to NULL. There are no gaps in the date range. To say it differently, there's always an assignment starting the day after an end_date, or an assignment that overlaps the end of one assignment and the start of another.
Here's another example where there is a gap in the ranges.
ID   COMPANY   POS_ID   START_DATE   END_DATE
333  1         1        2011-09-02   2012-08-31
333  1         1        2012-09-01   2012-09-10
333  1         1        2012-09-11   2012-09-11
333  1         1        2012-09-12   2013-01-06
333  1         1        2013-09-01   NULL
There would be 2 result rows for this: one with a range from 2011-09-02 to 2013-01-06, and a second from 2013-09-01 to NULL.
The end result would be a row per date range for every ID. I've written a script that will find the ranges, but it's a painful RBAR solution that takes forever to run. Every path I've gone down trying to solve it ends in evaluating row by row, grouped by the person ID. The assignments are not always contiguous, so I can't just take a MIN and MAX, and there may be 1 or more gaps in the dates to account for.
Originally, I was hoping to make this into a view but I'm not sure that will be possible. In case this is helpful this data will be accessed for reporting and other business processes.
Any ideas would be most welcome. Thanks!!
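The merge described above (coalesce any ranges that overlap or abut, keeping gaps as separate rows) can be sketched outside of SQL to pin down the expected output. This is a minimal Python illustration of that logic, not code from the thread; it treats a NULL end date as open-ended:

```python
from datetime import date, timedelta

def coalesce_ranges(ranges):
    """Merge overlapping or adjacent date ranges.
    An end of None means the assignment is still active (open-ended)."""
    merged = []
    # Sort by start date so each range only needs comparing to the last island.
    for start, end in sorted(ranges, key=lambda r: r[0]):
        if merged:
            cur_start, cur_end = merged[-1]
            # "Adjacent" means the next range starts the day after the current end.
            if cur_end is None or start <= cur_end + timedelta(days=1):
                if cur_end is not None and (end is None or end > cur_end):
                    merged[-1] = (cur_start, end)   # extend the current island
                continue
        merged.append((start, end))                 # a gap: start a new island
    return merged

# ID 333 from the second example: one gap, so two result rows.
rows_333 = [
    (date(2011, 9, 2), date(2012, 8, 31)),
    (date(2012, 9, 1), date(2012, 9, 10)),
    (date(2012, 9, 11), date(2012, 9, 11)),
    (date(2012, 9, 12), date(2013, 1, 6)),
    (date(2013, 9, 1), None),
]
print(coalesce_ranges(rows_333))
# [(date(2011, 9, 2), date(2013, 1, 6)), (date(2013, 9, 1), None)]
```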
January 14, 2014 at 11:29 am
January 14, 2014 at 11:42 am
Does this article help you?
January 14, 2014 at 11:44 am
Here it is.
CREATE TABLE #POSDATES (
	ID INT,
	COMPANY TINYINT,
	POS_ID TINYINT,
	[START_DATE] DATE,
	END_DATE DATE
)
INSERT INTO [#POSDATES]([ID],[COMPANY],[POS_ID],[START_DATE],[END_DATE])
VALUES
(333,1,1,CAST('20110902' as DATE),CAST('20120831' as DATE)),
(333,1,1,CAST('20120901' as DATE),CAST('20120910' as DATE)),
(333,1,1,CAST('20120911' as DATE),CAST('20120911' as DATE)),
(333,1,1,CAST('20120912' as DATE),CAST('20130106' as DATE)),
(333,1,1,CAST('20130901' as DATE),NULL),
(999,1,1,CAST('20110902' as DATE),CAST('20120909' as DATE)),
(999,1,1,CAST('20120910' as DATE),CAST('20120910' as DATE)),
(999,1,1,CAST('20120911' as DATE),CAST('20120911' as DATE)),
(999,1,1,CAST('20120912' as DATE),CAST('20130602' as DATE)),
(999,1,1,CAST('20130603' as DATE),CAST('20130630' as DATE)),
(999,1,1,CAST('20130701' as DATE),CAST('20130831' as DATE)),
(999,1,1,CAST('20130901' as DATE),NULL),
(999,1,2,CAST('20120901' as DATE),CAST('20120910' as DATE)),
(999,1,2,CAST('20120911' as DATE),CAST('20120911' as DATE)),
(999,1,2,CAST('20120912' as DATE),CAST('20130630' as DATE)),
(999,1,2,CAST('20130701' as DATE),CAST('20130831' as DATE)),
(999,1,2,CAST('20130901' as DATE),CAST('20130901' as DATE)),
(999,1,2,CAST('20140106' as DATE),NULL),
(999,1,3,CAST('20110902' as DATE),CAST('20120909' as DATE)),
(999,1,3,CAST('20120910' as DATE),CAST('20120910' as DATE)),
(999,1,3,CAST('20120911' as DATE),CAST('20121128' as DATE)),
(999,1,3,CAST('20121129' as DATE),CAST('20121129' as DATE)),
(999,1,3,CAST('20130901' as DATE),NULL)
January 14, 2014 at 1:44 pm
Luis Cazares (1/14/2014)
Does this article help you?
That may do it, but I'll have to fill in all the days in between the start & end. Then plug those dates into the solution in that article. Time to play with it!
January 14, 2014 at 3:41 pm
WOOHOO! That did it!
I unpivoted the start & end dates into one date column (I had to group them to eliminate duplicates for some records), joined up to a date table to fill in the ranges, and then used the code from that article.
Thanks!
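The expansion step described above (unpivot the start/end pairs, dedupe, and fill in every day in between via the date table) can be sketched like this. A rough Python illustration of the idea, not the poster's actual code; it assumes a NULL end date runs through "today":

```python
from datetime import date, timedelta

def expand_to_days(rows, today):
    """Unpivot each (start, end) pair into individual dates and fill every
    day in between -- the date-table join step from the post above."""
    days = set()                        # a set plays the GROUP BY dedupe role
    for start, end in rows:
        end = end or today              # active assignments run through today
        d = start
        while d <= end:
            days.add(d)
            d += timedelta(days=1)
    return sorted(days)

days = expand_to_days([(date(2013, 1, 1), date(2013, 1, 3)),
                       (date(2013, 1, 3), None)],
                      today=date(2013, 1, 5))
print(days)  # every day from 2013-01-01 through 2013-01-05, no duplicates
```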
January 14, 2014 at 3:53 pm
January 14, 2014 at 3:56 pm
That's great. Even if I only pointed out the article.
Under normal conditions I would have tried to give you a coded solution, but today I can't think straight.
January 14, 2014 at 8:50 pm
JackG (1/14/2014)
WOOHOO! That did it! I unpivoted the start & end date into one date column (had to group them to eliminate duplicates for some records). I joined up to a date table to fill in the ranges and then used the code from that article.
Thanks!
I hope you used the CROSS APPLY VALUES approach to UNPIVOT.
You can read about it in the first article in my signature links.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
January 14, 2014 at 8:57 pm
I hope you used the CROSS APPLY VALUES approach to UNPIVOT.
You can read about it in the first article in my signature links.
I did use UNPIVOT. I'll check out the CROSS APPLY approach. I need practice with CROSS APPLY; I've never been very good at it.
January 14, 2014 at 9:06 pm
JackG (1/14/2014)
I hope you used the CROSS APPLY VALUES approach to UNPIVOT.
You can read about it in the first article in my signature links.
I did use UNPIVOT. I'll check out the CROSS APPLY approach. I need practice with CROSS APPLY; I've never been very good at it.
If you need a better understanding of CROSS APPLY, these are the seminal articles on that subject:
Understanding and Using APPLY (Part 1)
Understanding and Using APPLY (Part 2)
January 15, 2014 at 9:12 am
Here is my solution.
;with cte_DateFill as (
SELECT xD.[Date]
,xpa.ID
,xpa.company
FROM #PosDates xPA
JOIN @Date xD ON xD.DATE BETWEEN xPA.START_DATE AND ISNULL(xPA.END_DATE, GETDATE())
GROUP BY xd.[Date], xPA.ID, xPA.COMPANY
)
,cte_Groups as (
SELECT ID
,Company
,[Date]
,DateGroup = DATEADD(dd, - ROW_NUMBER() OVER (Partition by ID, Company
ORDER BY ID, Company, [Date]), [Date])
FROM cte_DateFill
)
SELECT ID
,Company
,STARTDATE = MIN([Date])
,ENDDATE = IIF(MAX([Date]) = CONVERT(date,getdate()), NULL, MAX([Date]))
FROM cte_Groups
GROUP BY ID, DATEGROUP, Company
ORDER BY ID, STARTDATE
The @Date table variable is just a temporary stand-in for a date table. It is populated with days from the MIN(START_DATE) in the data through today. I'll create something a little more permanent with a wider range. In the final SELECT I replaced the end date with NULL if it equaled today's date, because a NULL end date means it is an active position in this data.
Thanks for the input! It helped a lot!
<edit> I forgot to mention that I changed some things so I didn't need to unpivot the data. 😀
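The DateGroup trick in the query above relies on the fact that within a run of consecutive days, the date minus its row number is a constant. That grouping step can be mirrored in a few lines of Python; a rough sketch of the technique, not a translation of the full query:

```python
from datetime import date, timedelta

def islands(days):
    """Group a sorted list of distinct dates into contiguous ranges.
    Mirrors DATEADD(dd, -ROW_NUMBER() OVER (...), [Date]): within a run
    of consecutive days, (day - row_number) yields the same anchor date."""
    groups = {}
    for rn, day in enumerate(days):
        anchor = day - timedelta(days=rn)       # constant per island
        groups.setdefault(anchor, []).append(day)
    # MIN and MAX per group give the start and end of each range.
    return [(g[0], g[-1]) for g in groups.values()]

days = [date(2013, 1, 1), date(2013, 1, 2), date(2013, 1, 3),
        date(2013, 2, 1), date(2013, 2, 2)]
print(islands(days))
# [(date(2013, 1, 1), date(2013, 1, 3)), (date(2013, 2, 1), date(2013, 2, 2))]
```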