November 6, 2008 at 12:39 pm
Hello, I'm trying to figure out how to run a rowset operation to do the following:
I have the following table:
create table #temp
(
user_id_num int,
insert_date datetime,
)
I want to create a query to see if there is a 24 hour period in this table where any user has more than 50 records inserted. Note: 24 hours doesn't necessarily mean an individual day - it could be 11/3/2008 12:37 - 11/4/2008 12:37.
I want to avoid creating a subquery for every row in the table, and with all the new additions in T-SQL for 2005, I have to think there's a better way - I just can't figure it out. And of course, no cursors...
November 6, 2008 at 2:40 pm
Interesting question. I am not sure of the best way to do this either but I will give it some more thought. But my initial thought is just to do a self join on the table with a dateadd function and group the results. But the fact that the 24hr period is always moving means you would get results twice for example if someone logged in every minute for 51 minutes. You would get that person for the 24 hour period, and then the 24 hour period that started on minute after the first.
Sample code/illustration.
SET NOCOUNT ON
DECLARE @T AS TABLE (
User_Id_Num INT,
Insert_Date DATETIME
)
DECLARE @ctr AS INT
-- 50 Logins every minute so should show
SET @Ctr = 0
WHILE @Ctr < 50
BEGIN
INSERT INTO @T
VALUES (1, DATEADD(minute, @ctr, GETDATE()))
SET @Ctr = @Ctr + 1
END
-- 50 Logins in hours, so should not be there
SET @Ctr = 0
WHILE @Ctr < 50
BEGIN
INSERT INTO @T
VALUES (2, DATEADD(hour, @ctr, GETDATE()))
SET @Ctr = @Ctr + 1
END
-- 40 in 40 minutes so should not show up
SET @Ctr = 0
WHILE @Ctr < 40
BEGIN
INSERT INTO @T
VALUES (3, DATEADD(minute, @ctr, GETDATE()))
SET @Ctr = @Ctr + 1
END
-- If 51 logins, then three lists of 50 logins in 24 hrs
-- because the period is always moving.
SET @Ctr = 0
WHILE @Ctr < 51
BEGIN
INSERT INTO @T
VALUES (4, DATEADD(minute, @ctr, GETDATE()))
SET @Ctr = @Ctr + 1
END
SELECT T1.User_Id_Num,
T1.Insert_Date
FROM @T T1 INNER JOIN @T T2 ON
(T1.User_Id_Num = T2.User_Id_Num)
WHERE T2.Insert_Date BETWEEN
T1.Insert_Date AND DATEADD(HOUR, 24, T1.Insert_Date)
GROUP BY T1.User_Id_Num, T1.Insert_Date
HAVING COUNT(*) >= 50
So as I said, this works but probably a better way. If I think of something else I'll let you know.
November 7, 2008 at 6:47 am
I thought of a different way to do this, but I'd like to test it first.
Can you please post some sample data?
November 7, 2008 at 8:05 am
I wonder, would a SELF JOIN and use DATEDIFF or <= operator work?
it would be more SET BASED
SELECT T1.user_id_num
FROM #temp T1
INNER JOIN #temp T2 ON DATEDIFF(hh, T1.insert_date DATETIME, T2.insert_date DATETIME) <= 24
--DATEADD(hh,24,T1.insert_date datetime) <= T2.insert_date datetime
GROUP BY user_id_num
HAVING COUNT(1) > 50
November 7, 2008 at 9:46 am
Well I am allowed some margin of error with the time range, so this turns out to be the best solution I came up with:
First aggregate the data:
declare @time_divisor float
select @time_divisor = 96 --Set to group by every 15 minutes (1 day/96)
select @time_divisor = 1/@time_divisor
select user_id_num,
Cast(Floor(Cast(insert_date as float) / @time_divisor) * @time_divisor + .00000001 as smalldatetime) time_interval,
Count(*) total_requests
into #agg_data
from #temp
group by user_id_num,
Cast(Floor(Cast(insert_date as float) / @time_divisor) * @time_divisor + .00000001 as smalldatetime)
Then remove data that I know isn't eligible (which is half of the records)
create clustered index ix_tmp on #agg_data (user_id_num, time_interval)
delete from #agg_data
where user_id_num IN
(select user_id_num
from #temp
group by user_id_num having Sum(total_requests) < 50)
Finally get the results
select distinct t1.*,
Sum(t2.total_requests)
OVER (PARTITION BY t1.user_id_num, t1.time_interval) requests_in_last_24_hours
into #final_table
from #agg_data t1, #agg_data t2
where t1.user_id_num = t2.user_id_num
and t2.time_interval > t1.time_interval - 1
and t2.time_interval <= t1.time_interval
OPTION (MAXDOP 1)
select * from #final_table
where requests_in_last_24_hours >= 50
order by user_id_num, time_interval desc
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply