December 29, 2009 at 1:06 pm
I am stuck on this one for a set based solution, and I think I have it almost working, something is escaping me.
I have a table with a group id, and a series of start and end dates, each span of dates being it's own record. The dates are not always consecutive, and there can be gaps of any length between one end date and another start date. There can not be overlaping date spans within a Group Id.
For the requirements, I need to arrive at one record for each group, with the latest end date the group has, and the earliest continously covered start date. A difference between one start date and the next end date of one day is considered continuous, while any other difference is considered a break.
So quick example, for below
Group_IDStart_DateEnd_Date
AAA2005010120050930
AAA2005100199991231
BBB2005010120061231
BBB2007010120070630
BBB2008010120091231
I need the results to be
Group_IDStart_DateEnd_Date
AAA2005010199991231
BBB2008010120091231
Attached are table defs, sample data, desired result data, and what I have reached so far. I can get a compare between first and second records for a given group, yet I have not managed to imagine how to then take that result forward some unknown number of times.
Any help would be greatly appreciated. :w00t:
December 29, 2009 at 7:18 pm
Look not for continuous spans, looks for breaks longer than 1 day.
The end of the latest of such breaks will be the beginning of the latest continuous span.
_____________
Code for TallyGenerator
December 30, 2009 at 1:19 am
Hi,
Take a look at my blog entry here.
If you are on 7 or 2000 , you will have to use an identity column but the theory is the same.
December 30, 2009 at 8:11 am
Thanks to both of you. I should get sometime to work on this again today, when I get this I am going to have to remember to write it down somewhere, strikes me as one of those things which will show up again someday.
December 31, 2009 at 5:42 am
Dave Ballantyne (12/30/2009)
Hi,Take a look at my blog entry here.
If you are on 7 or 2000 , you will have to use an identity column but the theory is the same.
Nice Dave. Thanks for that.
---------------------------------------------------------
How best to post your question[/url]
How to post performance problems[/url]
Tally Table:What it is and how it replaces a loop[/url]
"stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."
December 31, 2009 at 9:39 pm
David Lester (12/29/2009)
I am stuck on this one for a set based solution, and I think I have it almost working, something is escaping me.I have a table with a group id, and a series of start and end dates, each span of dates being it's own record. The dates are not always consecutive, and there can be gaps of any length between one end date and another start date. There can not be overlaping date spans within a Group Id.
For the requirements, I need to arrive at one record for each group, with the latest end date the group has, and the earliest continously covered start date. A difference between one start date and the next end date of one day is considered continuous, while any other difference is considered a break.
So quick example, for below
Group_IDStart_DateEnd_Date
AAA2005010120050930
AAA2005100199991231
BBB2005010120061231
BBB2007010120070630
BBB2008010120091231
I need the results to be
Group_IDStart_DateEnd_Date
AAA2005010199991231
BBB2008010120091231
Attached are table defs, sample data, desired result data, and what I have reached so far. I can get a compare between first and second records for a given group, yet I have not managed to imagine how to then take that result forward some unknown number of times.
Any help would be greatly appreciated. :w00t:
I agree... "any help" would be greatly appreciated. For better, quicker answers, please read and heed the information located at the article in the first link in my signature below. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
January 4, 2010 at 8:06 am
Hi Jeff, I see I forgot to mention in my post the attached file has the table defs, and data create statements etc.
Today I will be working on this, and I believe I have the right direction after reading the above blog entry. Now just to make sure I get it working the right way.
January 4, 2010 at 8:59 am
Try this, not particularly efficient though.
WITH Start_Dates AS(
SELECT s1.Group_ID,s1.Start_Date
FROM Source_Data s1
WHERE NOT EXISTS(SELECT * FROM Source_Data s2
WHERE s1.Group_ID=s2.Group_ID
AND s1.Start_Date > s2.Start_Date AND CAST(s1.Start_Date AS DATETIME) <= CAST(s2.End_Date AS DATETIME)+1)
),
End_Dates AS (
SELECT t1.Group_ID,t1.End_Date
FROM Source_Data t1
WHERE NOT EXISTS(SELECT * FROM Source_Data t2
WHERE t1.Group_ID=t2.Group_ID
AND CAST(t1.End_Date AS DATETIME) >= CAST(t2.Start_Date AS DATETIME)-1 AND t1.End_Date < t2.End_Date)
)
SELECT s1.Group_ID,s1.Start_Date,MIN(t1.End_Date) AS End_Date
FROM Start_Dates s1
INNER JOIN End_Dates t1 ON t1.Group_ID=s1.Group_ID
AND s1.Start_Date<t1.End_Date
GROUP BY s1.Group_ID,s1.Start_Date
ORDER BY s1.Group_ID,s1.Start_Date;
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537January 4, 2010 at 11:34 am
Mark-101232 (1/4/2010)
Try this, not particularly efficient though.
WITH Start_Dates AS(
SELECT s1.Group_ID,s1.Start_Date
FROM Source_Data s1
WHERE NOT EXISTS(SELECT * FROM Source_Data s2
WHERE s1.Group_ID=s2.Group_ID
AND s1.Start_Date > s2.Start_Date AND CAST(s1.Start_Date AS DATETIME) <= CAST(s2.End_Date AS DATETIME)+1)
),
End_Dates AS (
SELECT t1.Group_ID,t1.End_Date
FROM Source_Data t1
WHERE NOT EXISTS(SELECT * FROM Source_Data t2
WHERE t1.Group_ID=t2.Group_ID
AND CAST(t1.End_Date AS DATETIME) >= CAST(t2.Start_Date AS DATETIME)-1 AND t1.End_Date < t2.End_Date)
)
SELECT s1.Group_ID,s1.Start_Date,MIN(t1.End_Date) AS End_Date
FROM Start_Dates s1
INNER JOIN End_Dates t1 ON t1.Group_ID=s1.Group_ID
AND s1.Start_Date<t1.End_Date
GROUP BY s1.Group_ID,s1.Start_Date
ORDER BY s1.Group_ID,s1.Start_Date;
Oops, should be for SQL Server 2000
SELECT s1.Group_ID,s1.Start_Date,MIN(t1.End_Date) AS End_Date
FROM (
SELECT s1.Group_ID,s1.Start_Date
FROM Source_Data s1
WHERE NOT EXISTS(SELECT * FROM Source_Data s2
WHERE s1.Group_ID=s2.Group_ID
AND s1.Start_Date > s2.Start_Date AND CAST(s1.Start_Date AS DATETIME) <= CAST(s2.End_Date AS DATETIME)+1)
) s1
INNER JOIN (
SELECT t1.Group_ID,t1.End_Date
FROM Source_Data t1
WHERE NOT EXISTS(SELECT * FROM Source_Data t2
WHERE t1.Group_ID=t2.Group_ID
AND CAST(t1.End_Date AS DATETIME) >= CAST(t2.Start_Date AS DATETIME)-1 AND t1.End_Date < t2.End_Date)
) t1 ON t1.Group_ID=s1.Group_ID
AND s1.Start_Date<t1.End_Date
GROUP BY s1.Group_ID,s1.Start_Date
ORDER BY s1.Group_ID,s1.Start_Date
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537January 4, 2010 at 12:28 pm
Thank you Mark, this works well. It is what I was initially attempting to get to, I was not getting the sub-queries correctly.
I will have to file this one away as a reminder.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply