May 20, 2004 at 9:24 am
Greeting!
I have a table with trend data that is usually collected on 5 second intervals, and it is put into the database by a program that is run once a day. I wrote a stored procedure to sift through the data and find missing *days*, but it churns way too much for me (each day has about 3k rows of data, and all I want is a "did May 10 happen").
The report looks like this (200 second later):
Fairburn 2003-01-01 00:00:00.000 NULL
Fairburn 2004-02-25 00:00:00.000 1
Fairburn 2004-05-20 00:00:00.000 NULL
Here's the code:
create table #t1 (col int IDENTITY(1,1), dt datetime)
create table #t2 (col int IDENTITY(2,1), dt datetime)
create table #t3 (PGID int, StartDate datetime, DaysMissing int)
insert into #t1 (dt)
SELECT distinct cast(convert(varchar(10), point_data_time, 101) as datetime)
FROM point_data inner join point_definition on point_data.point_definitionID = point_definition.point_definitionID
WHERE point_groupID in (select * from dbo.sublevel(@TopPgid)) order by cast(convert(varchar(10), point_data_time, 101) as datetime)
insert into #t2 (dt)
SELECT dt from #t1 order by dt
insert into #t3
select @toppgid, min(#t2.dt) as StartDate, NULL as DaysMissing from #t2
UNION
select @toppgid, #t2.dt as startDate, datediff(dd,#t2.dt,#t1.dt)-1 as DaysMissing from #t1 left join #t2 on #t1.col = #t2.col where datediff(dd,#t2.dt,#t1.dt)-1>0
UNION
select @toppgid, max(#t2.dt) as StartDate, NULL as DaysMissing from #t2
delete from #t1
delete from #t2
Fetch next from cur into @toppgid
END
select Group_Name as Site, StartDate, DaysMissing from #T3 inner join point_groups on #T3.PGID=point_groups.point_groupid
drop table #t1
drop table #t2
drop table #t3
TIA,
Thor Johnson
May 21, 2004 at 4:53 am
Hello Thor,
I don't fully understand the structure and the way you want to display missing days, but I'll try. I see there is some cursor used to fetch rows... well, that in itself could slow down the process. I try to avoid cursors whenever possible, are you sure you need it here?
If the data in #t1 are ordered by dt ASC, it would be possible to skip the #t2, if you use a self join on #t1, something like that :
select #t1.* , datediff(dd,q.dt, #t1.dt)-1
from #t1
JOIN #t1 q ON q.col = #t1.col-1
where datediff(dd,q.dt, #t1.dt) > 1
Otherwise, I guess that it should be possible to find a better way... First thing I would try is to create a reference table (e.g. dbo.calendar) containing all possible dates, and then make a left join from it to #t1... or maybe directly to your source data table? That depends, try which is better for you.
SELECT cal.dt
FROM dbo.calendar cal
LEFT JOIN #t1 ON cal.dt = #t1.dt
WHERE #t1.col IS NULL
Hope that will help a bit, although I'm not sure I correctly understood everything in your description... I just tried to hint, in which direction to look, not to solve the problem - I would need more info even to attempt that . Cheers, Vladan
May 26, 2004 at 9:42 pm
Hi Thor.
Vladan makes a good point about not understanding your requirements here. I don't understand them either. It's helpful to all of us when you are posting a question like this to make your problem and requirements as clear as possible; and it's also helpful to include not only the table definitions related to the question, but also the corresponding inserts that load test data into the tables. That makes all of our jobs faster and easier.
Hope to see a clarification in reply so we can help you.
Eaton
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply