July 1, 2009 at 8:23 pm
Hi All,
I have an interesting problem where I need to find the longest period of consecutive days from a group of dates. So, say we had the following dates:
1 Jan 2009
3 Jan 2009
4 Jan 2009
5 Jan 2009
9 Jan 2009
10 Jan 2009
I'd like to end up with 3 result rows with the date ranges:
1 Jan 2009 - 1 Jan 2009 (i.e. 1 day),
3 Jan 2009 - 5 Jan 2009 (i.e. 3 days), and
9 Jan 2009 - 10 Jan 2009 (2 days)
I've already come up with a solution, but I'm not convinced it's the best way of doing it (see attached code).
SET NOCOUNT ON
CREATE TABLE test1 (id1 INT, DOS datetime)
CREATE TABLE test2 (id1 INT, StartDate DATETIME, EndDate datetime, Days int)
INSERT test1 VALUES (1, '1 Feb 2009')
INSERT test1 VALUES (1, '3 Feb 2009')
INSERT test1 VALUES (1, '4 Feb 2009')
INSERT test1 VALUES (1, '5 Feb 2009')
INSERT test1 VALUES (1, '9 Feb 2009')
INSERT test1 VALUES (1, '10 Feb 2009')
INSERT test1 VALUES (2, '7 Mar 2009')
INSERT test1 VALUES (2, '3 Feb 2009')
INSERT test1 VALUES (2, '4 Feb 2009')
INSERT test1 VALUES (2, '8 Mar 2009')
INSERT test1 VALUES (2, '9 Jun 2009')
INSERT test1 VALUES (2, '11 Jun 2009')
SET NOCOUNT OFF
SELECT t1.id1, DATEDIFF(d, t1.dos, t2.dos) + 1 AS Days, t1.dos AS StartDate, t2.dos AS EndDate
INTO #DayCalc
FROM test1 AS t1 INNER JOIN test1 AS t2
ON t1.id1 = t2.id1
AND t1.DOS <= t2.DOS
WHERE DATEDIFF(d, t1.dos, t2.dos) + 1 = (SELECT COUNT(*)
FROM test1 AS a
WHERE a.id1 = t1.id1
AND a.dos BETWEEN t1.dos AND t2.dos)
INSERT Test2
SELECT id1, StartDate, MAX(EndDate) AS EndDate, MAX(Days) AS Days
FROM #DayCalc
GROUP BY id1, StartDate
DROP TABLE #DayCalc
DELETE FROM t1
FROM test2 AS t1 INNER JOIN test2 AS t2
ON t1.id1 = t2.id1
AND t1.EndDate = t2.EndDate
AND t1.StartDate > t2.StartDate
SELECT *
FROM test2
ORDER BY id1, StartDate
DROP TABLE test1
DROP TABLE test2
I know I haven't created any indexes on the above tables, but with such a small dataset it doesn't really matter. The real tables will have appropriate indexes. And the equivalent table to test1 in my example will also have around 20K rows, which is the reason I want to make sure it's a good way of doing what I need.
So, my question is: can anyone see a better way of doing this?
Thanks for any help you may provide,
Frank
July 2, 2009 at 1:09 am
July 2, 2009 at 1:20 am
Hi Dave,
Yes, that's an elegant solution, however, I'm running this on SQL Server 2000, so...
Frank
July 2, 2009 at 1:25 am
In that case you could try inserting to a temp table with an identity column rather than using row_number(). Havent tried it myself but no reason why it shouldnt be a fairly straight forward substitution.
July 2, 2009 at 3:12 am
Try this
SELECT a.id1,
a.DOS AS StartDate,
MIN(c.DOS) AS EndDate,
DATEDIFF(d, a.DOS, MIN(c.DOS))+1 AS Days
FROM test1 a
INNER JOIN test1 c ON c.id1=a.id1 AND c.DOS>=a.DOS
AND NOT EXISTS (SELECT * FROM test1 d WHERE d.id1=c.id1 AND d.DOS=c.DOS+1)
WHERE NOT EXISTS (SELECT * FROM test1 b WHERE b.id1=a.id1 AND b.DOS=a.DOS-1)
GROUP BY a.id1,a.DOS
ORDER BY a.id1,a.DOS
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537July 2, 2009 at 3:42 am
I followed Dave blog and wrote this query.
#1)
create table Contig(Id integer)
go
insert into contig values(1)
insert into contig values(2)
insert into contig values(3)
insert into contig values(5)
insert into contig values(6)
insert into contig values(7)
insert into contig values(8)
Select Id,identity(int,1,1) as seq
into #contig
from Contig
order by Id desc
Select Min(Id),Max(Id) from #contig
group by Id+seq order by 1
#2)
Drop Table ContigDates
go
Create Table ContigDates
(
DateCol smalldatetime
)
go
insert into ContigDates(DateCol) values('01jan2009')
insert into ContigDates(DateCol) values('02jan2009')
insert into ContigDates(DateCol) values('03jan2009')
insert into ContigDates(DateCol) values('01feb2009')
insert into ContigDates(DateCol) values('02feb2009')
insert into ContigDates(DateCol) values('03feb2009')
insert into ContigDates(DateCol) values('10feb2009')
go
select DateCol, identity(int,1,1) as Datecol_Group
into #ContigDates
from ContigDates
order by 1 desc
go
Select Min(DateCol),Max(DateCol)
from (select DateCol,DateCol+ Datecol_Group as grouping
from #ContigDates)a
group by grouping
order by 1
go
karthik
July 2, 2009 at 11:20 pm
Thanks for your help guys - I'll try out some of your suggestions and see which option works out to be the most efficient.
Cheers,
Frank
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply