Finding open records over time without loop

  • Greetings SQL gurus-- I have a table structure that has starting and ending date columns and need to find out the number of records that were currently open by hour of the day over the past 3 or so years. So far I keep coming back to the brute force approach. Start at my initial time, count the open records at that point, use DATEADD to move to the next hour, and repeat about 1100 times.

    It seems like there must be a more elegant solution, but it hasn't come to me yet. Anyone out there have any other ideas?

    -Paul

  • Please post your table structure, sample data, expected results from the sample data, and any code you have put together so far so we can help you.


  • In addition to MrPolecat's request - define "currently open". How does one determine what "open" is, what "current" means? (No - I'm not channeling Bill clinton - just trying to get some actual specifics).

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Without specifics on the table, I can't give you a complete solution, but I can recommend that you look into using a Numbers table instead of brute force.

    (In case you don't know, a Numbers table is just a table of consecutive numbers.)

    Since there are 8766 hours per year (averaged for leap years), and you have 3 years of data, a Numbers table with 90,000 rows will do for what you need. Let's make it 100k, just because.

    create table dbo.Numbers(

    Number int identity(0,1) primary key,

    Placeholder tinyint)

    go

    insert into dbo.Numbers(placeholder)

    select top 100001 1

    from sys.all_objects

    cross join sys.all_objects

    Then you can do something like this:

    select dateadd(hour, number, '1/1/2005') as Entered, count(*)

    from dbo.Numbers

    inner join dbo.YourTable

    on CreatedDate >= dateadd(hour, number, '1/1/2005')

    and CreatedDate < dateadd(hour, number + 1, '1/1/2005')

    group by dateadd(hour, number, '1/1/2005')

    order by dateadd(hour, number, '1/1/2005')

    That will give you a list of hours and how many rows from YourTable have a CreatedDate value in that hour.

    You'll have to modify this to fit your particular scenario, but the basic idea should work.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Sorry about not including the details to start with. I've been lurking around here long enough to know better. Tables and sample data are below with code at the very bottom. The gist of what I'm trying to accomplish is find rows where at a given time (at the top of each hour) a visit was still in progress, meaning the StartDate is less than the date I'm checking for and the EndDate is later or is NULL. The final output would be the Sum of the visits at each hour of the day. My goal is to not have to check for each hour individually over the past 3 years, and was trying to come up with a way to use DATEPART to find all the rows that are open during hour 0, 1, 2, 3, etc. for any date.

    The actual data is ~500k rows.

    Let me know if any more details would help.

    Thanks,

    Paul

    --===== If the test table already exists, drop it

    IF OBJECT_ID('tempdb..#Visits','U') IS NOT NULL

    DROP TABLE #Visits

    IF OBJECT_ID('tempdb..#VisitCounts','U') IS NOT NULL

    DROP TABLE #VisitCounts

    --===== Create the test table with

    CREATE TABLE #Visits

    (

    VisitID INT PRIMARY KEY CLUSTERED,

    StartDate DATETIME,

    EndDate DATETIME

    )

    CREATE TABLE #VisitCounts

    (

    CensusDate DATETIME,

    NumVisits INT

    )

    --Sample Data

    SELECT '478488','Jan 3 2005 6:25AM','Jan 6 2005 1:15PM' UNION ALL

    SELECT '483511','Jan 3 2005 9:00AM','Jan 6 2005 1:15PM' UNION ALL

    SELECT '483832','Jan 1 2005 7:53AM','Jan 2 2005 11:15AM' UNION ALL

    SELECT '483866','Jan 3 2005 10:02AM','Jan 3 2005 2:58PM' UNION ALL

    SELECT '483880','Jan 2 2005 12:16PM','Jan 3 2005 9:00AM' UNION ALL

    SELECT '483918','Jan 2 2005 8:23PM','Jan 5 2005 12:45PM' UNION ALL

    SELECT '483975','Jan 3 2005 2:25PM','Jan 4 2005 10:00AM' UNION ALL

    SELECT '484026','Jan 3 2005 1:00PM','Jan 12 2005 1:00PM' UNION ALL

    SELECT '484027','Jan 3 2005 5:00PM','Jan 5 2005 11:00AM' UNION ALL

    SELECT '484126','Jan 4 2005 5:48AM','Jan 7 2005 2:14PM' UNION ALL

    SELECT '484275','Jan 3 2005 8:06PM','Jan 9 2005 6:30PM' UNION ALL

    SELECT '484326','Jan 4 2005 5:18PM','Jan 6 2005 10:20AM' UNION ALL

    SELECT '484351','Jan 5 2005 12:18AM','Jan 10 2005 10:30AM' UNION ALL

    SELECT '484352','Jan 4 2005 10:50PM','Jan 13 2005 3:20PM' UNION ALL

    SELECT '484355','Jan 5 2005 5:09AM','Jan 6 2005 10:40AM' UNION ALL

    SELECT '484471','Jan 5 2005 12:19PM','Jan 8 2005 2:40PM' UNION ALL

    SELECT '484594','Jan 12 2005 7:51AM','Jan 16 2005 1:35PM' UNION ALL

    SELECT '484690','Jan 12 2005 6:00AM','Jan 14 2005 3:40PM' UNION ALL

    SELECT '484732','Jan 6 2005 3:57PM','Jan 8 2005 1:10PM' UNION ALL

    SELECT '484743','Jan 6 2005 4:55PM','Jan 10 2005 12:30PM' UNION ALL

    SELECT '484763','Jan 6 2005 10:51PM','Jan 9 2005 2:20PM' UNION ALL

    SELECT '484771','Jan 6 2005 10:00PM','Jan 12 2005 12:30PM' UNION ALL

    SELECT '484770','Jan 6 2005 8:19PM','Jan 12 2005 12:30PM' UNION ALL

    SELECT '484773','Jan 6 2005 7:50PM','Jan 8 2005 1:10PM' UNION ALL

    SELECT '484775','Jan 6 2005 10:38PM','Jan 11 2005 6:45PM' UNION ALL

    SELECT '484984','Jan 8 2005 11:45PM','Jan 10 2005 4:50PM' UNION ALL

    SELECT '484986','Jan 9 2005 4:11AM','Jan 10 2005 5:45PM' UNION ALL

    SELECT '484988','Jan 9 2005 4:49AM','Jan 10 2005 1:38PM' UNION ALL

    SELECT '485007','Jan 9 2005 11:48AM','Jan 10 2005 1:20PM' UNION ALL

    SELECT '485019','Jan 9 2005 4:10PM','Jan 14 2005 11:30AM' UNION ALL

    SELECT '485024','Jan 9 2005 5:55PM','Jan 10 2005 12:30PM' UNION ALL

    SELECT '485028','Jan 9 2005 8:48PM','Jan 10 2005 6:40PM' UNION ALL

    SELECT '485035','Jan 9 2005 11:16PM','Jan 10 2005 6:40PM' UNION ALL

    SELECT '485036','Jan 10 2005 2:20AM','Jan 11 2005 11:15AM' UNION ALL

    SELECT '485045','Jan 10 2005 9:30AM','Jan 11 2005 1:50PM' UNION ALL

    SELECT '485150','Jan 10 2005 4:44PM','Jan 12 2005 9:30AM' UNION ALL

    SELECT '485189','Jan 10 2005 2:29PM','Jan 12 2005 8:46AM' UNION ALL

    SELECT '485227','Jan 10 2005 11:42PM','Jan 11 2005 11:30AM' UNION ALL

    SELECT '485228','Jan 10 2005 7:33PM','Jan 18 2005 1:50PM' UNION ALL

    SELECT '485232','Jan 11 2005 2:02AM','Jan 12 2005 4:20PM' UNION ALL

    SELECT '485326','Jan 13 2005 5:20PM','Jan 16 2005 2:00PM' UNION ALL

    SELECT '485338','Jan 13 2005 10:00AM','Jan 17 2005 12:30PM' UNION ALL

    SELECT '485349','Jan 19 2005 1:00PM','Jan 23 2005 1:30PM' UNION ALL

    SELECT '485373','Jan 19 2005 12:10PM','Jan 22 2005 12:00PM' UNION ALL

    SELECT '485435','Jan 11 2005 5:57PM','Jan 12 2005 5:00PM' UNION ALL

    SELECT '485443','Jan 11 2005 10:02PM','Jan 14 2005 10:45AM' UNION ALL

    SELECT '485622','Jan 12 2005 5:34PM','Jan 14 2005 2:00PM' UNION ALL

    SELECT '485628','Jan 12 2005 6:47PM','Jan 16 2005 1:50PM' UNION ALL

    SELECT '485638','Jan 13 2005 1:07AM','Jan 14 2005 10:13AM' UNION ALL

    SELECT '485639','Jan 13 2005 12:05AM','Jan 14 2005 4:45PM' UNION ALL

    SELECT '485675','Jan 17 2005 7:30AM','Jan 21 2005 10:15AM' UNION ALL

    SELECT '485761','Jan 13 2005 2:51PM','Jan 16 2005 5:30AM' UNION ALL

    SELECT '485772','Jan 14 2005 9:30PM','Jan 15 2005 12:30PM' UNION ALL

    SELECT '485914','Jan 18 2005 9:10AM','Jan 22 2005 2:20PM' UNION ALL

    SELECT '485931','Jan 2 2005 10:41AM','Jan 3 2005 2:15PM' UNION ALL

    SELECT '485953','Jan 14 2005 12:59PM','Jan 17 2005 10:10AM' UNION ALL

    SELECT '485962','Jan 19 2005 1:30PM','Jan 21 2005 11:00AM' UNION ALL

    SELECT '486024','Jan 14 2005 9:20PM','Jan 20 2005 9:15AM' UNION ALL

    SELECT '486025','Jan 14 2005 10:50PM','Jan 18 2005 12:15PM' UNION ALL

    SELECT '486084','Jan 16 2005 3:49AM','Jan 18 2005 10:15AM' UNION ALL

    SELECT '486089','Jan 16 2005 6:35AM','Jan 16 2005 1:15PM' UNION ALL

    SELECT '486100','Jan 16 2005 9:57AM','Jan 21 2005 10:15AM' UNION ALL

    SELECT '486109','Jan 16 2005 11:20AM','Jan 18 2005 12:05PM' UNION ALL

    SELECT '486117','Jan 16 2005 7:09PM','Jan 21 2005 2:45PM' UNION ALL

    SELECT '486122','Jan 16 2005 6:57PM','Jan 19 2005 10:25AM' UNION ALL

    SELECT '486237','Jan 31 2005 6:30AM','Feb 4 2005 6:20PM' UNION ALL

    SELECT '486320','Jan 17 2005 4:50PM','Jan 19 2005 1:00PM' UNION ALL

    SELECT '486350','Jan 17 2005 10:38PM','Jan 27 2005 4:30PM' UNION ALL

    SELECT '486420','Jan 18 2005 8:57AM','Jan 20 2005 9:15AM' UNION ALL

    SELECT '486423','Jan 19 2005 2:00PM','Jan 22 2005 2:36PM' UNION ALL

    SELECT '486533','Jan 20 2005 6:30AM','Jan 22 2005 11:17AM' UNION ALL

    SELECT '486544','Jan 18 2005 2:39PM','Jan 22 2005 1:00PM' UNION ALL

    SELECT '486577','Jan 18 2005 4:30PM','Jan 20 2005 3:20PM' UNION ALL

    SELECT '486609','Jan 18 2005 7:33PM','Jan 20 2005 9:30AM' UNION ALL

    SELECT '486673','Jan 19 2005 11:14AM','Jan 20 2005 7:00PM' UNION ALL

    SELECT '486733','Jan 19 2005 2:15PM','Jan 21 2005 8:30AM' UNION ALL

    SELECT '486757','Jan 27 2005 6:15AM','Jan 31 2005 2:00PM' UNION ALL

    SELECT '486800','Jan 19 2005 4:40PM','Jan 24 2005 1:48PM' UNION ALL

    SELECT '486815','Jan 19 2005 11:42PM','Jan 24 2005 2:25PM' UNION ALL

    SELECT '486824','Jan 20 2005 7:46AM','Jan 24 2005 11:30AM' UNION ALL

    SELECT '486964','Jan 25 2005 6:15AM','Jan 29 2005 12:15PM' UNION ALL

    SELECT '487062','Jan 21 2005 12:15AM','Jan 24 2005 11:46AM' UNION ALL

    SELECT '487066','Jan 21 2005 8:45AM','Jan 23 2005 12:15AM' UNION ALL

    SELECT '487157','Jan 21 2005 11:16AM','Jan 26 2005 12:00PM' UNION ALL

    SELECT '487269','Jan 21 2005 10:25PM','Jan 23 2005 5:10AM' UNION ALL

    SELECT '487273','Jan 21 2005 10:56PM','Jan 25 2005 10:30AM' UNION ALL

    SELECT '487284','Jan 22 2005 2:59AM','Jan 24 2005 8:15PM' UNION ALL

    SELECT '487314','Jan 22 2005 10:32AM','Jan 25 2005 11:45AM' UNION ALL

    SELECT '487329','Jan 22 2005 3:55PM','Jan 25 2005 11:40AM' UNION ALL

    SELECT '487346','Jan 22 2005 3:47PM','Jan 25 2005 1:20PM' UNION ALL

    SELECT '487348','Jan 22 2005 4:55PM','Jan 24 2005 6:15PM' UNION ALL

    SELECT '487360','Jan 23 2005 4:42AM','Jan 25 2005 11:00AM' UNION ALL

    SELECT '487392','Jan 23 2005 12:52PM','Jan 25 2005 1:20PM' UNION ALL

    SELECT '487399','Jan 23 2005 1:25PM','Jan 24 2005 6:15PM' UNION ALL

    SELECT '487406','Jan 23 2005 5:30PM','Jan 27 2005 10:20AM' UNION ALL

    SELECT '487411','Jan 24 2005 3:05AM','Jan 26 2005 11:00AM' UNION ALL

    SELECT '487412','Jan 24 2005 12:23AM','Jan 25 2005 11:30AM' UNION ALL

    SELECT '487454','Jan 24 2005 8:17AM',NULL UNION ALL

    SELECT '487589','Jan 26 2005 6:40AM',NULL UNION ALL

    DECLARE @censusdate DATETIME

    SET @censusdate = '20050101'

    WHILE @censusdate < GETDATE()

    BEGIN

    SELECT @censusdate, COUNT(*)

    INTO VisitCounts

    FROM Visits

    WHERE @censusdate > StartDate

    AND (@censusdate < EndDate or EndDate is NULL)

    SET @censusdate = DATEADD(hh, 1, @censusdate)

    CONTINUE

  • build your numbers table ( I call mine tally because Jeff Moden does:D. n is the field that hold the sequential numbers in Tally) like GSquared said.

    select count(visitid),dateadd(hh,n,'1/1/2005') from tally

    left join records on dateadd(hh,n,'1/1/2005') between startdate and coalesce(enddate,getdate())

    group by dateadd(hh,n,'1/1/2005')

    order by dateadd(hh,n,'1/1/2005')


  • Just one last question, based on the sample data provided, what is your expected output?

    😎

  • One of these days I'll actually recognize for myself when to use a tally table 😀 I have one built but didn't think to use it. Thanks for the help, I think I can run with it from here.

    -Paul

  • Lynn Pettis (4/7/2008)


    Just one last question, based on the sample data provided, what is your expected output?

    😎

    It would be along the lines of

    1/3/05 12:00:00AM - 4 open records

    1/3/05 1:00:00AM - 4 open records

    1/3/05 2:00:00AM - 5 open records

    .

    .

    .

    1/6/05 2:00:00AM - 2 open records

    Once I have the counts for each time of day, I will aggregate over the hour of the day for the past 3 years, i.e. might see something like 9000 open records at 2pm across all dates, 8500 open records at 11pm across all dates, etc.

  • mrpolecat (4/7/2008)


    build your numbers table ( I call mine tally because Jeff Moden does:D.

    Now, there's a heck of a compliment. 😀 Thanks, MrPoleCat!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • pauls2 (4/7/2008)


    One of these days I'll actually recognize for myself when to use a tally table 😀 I have one built but didn't think to use it. Thanks for the help, I think I can run with it from here.

    -Paul

    Outstanding... You're not alone... lots of other people don't think of using the Tally table for such a thing... if you get the chance, would you post the code you finally end up with so others can start to get the idea that a Tally table can be used for something other than "split" functions? Thanks a heap, Paul.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I tested mine with a million rows and it took a really long time to run. I only had a PK on the record number so an index on the dates may have helped. My tally table only had 11000 numbers but to handle 3 years worth of hours you will need about 30K numbers.


  • For what it's worth - the solutions advanced implicitly answer the question of "open" as "how many cases are currently not ended at this moment". Anything that would open and closed within the same hour would not be tallied anywhere.

    I'd think that if you used an "advanced tally", so as to incorporate the data calculation in it, you could do substantially better than joins on functions.

    Meaning -

    create table datestally (startdt datetime,

    enddt as dateadd(hour,1,startdt) persisted)

    create unique clustered index ix_datestally on datestally(startdt)

    insert datestally(startdt)

    select dateadd(hour,n,'1/1/1980') from tally where n<200000

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Good point Matt. In this particular case it doesn't matter because a) I only need to know what was open at the top of each hour, and b) it would be an exceedingly unusual case to have that happen with this data which represents hospital inpatient stays. But I'll keep that in mind for other similar reports which may have more granular data.

    Here's what I ended up doing.

    select count(VisitID) as Census,

    datepart(hh,dateadd(hh,n,'1/1/2005')) as CensusHour

    from tally

    left outer join Visits on dateadd(hh,n,'1/1/2005')

    between AdmitDateTime and coalesce(DischargeDateTime,getdate())

    where n < 30000

    group by datepart(hh,dateadd(hh,n,'1/1/2005'))

    order by datepart(hh,dateadd(hh,n,'1/1/2005'))

    My ultimate goal was to find the total number of patients who were currently admitted at the top of each hour of the day in order to run statistics that will help determine staffing levels. I slightly modified the suggested code from mrpolecat to include a DATEPART function and grouped the records by the hour of day in which they were open. I added the n<30000 to keep myself from needlessly checking for future records that wouldn't exist.

    Thanks to everyone for your help. My time spent reading these forums made me cringe at the idea of looping through the hours in a brute force manner, and the tally table suggestion is exactly the type of thing I was hoping to find.

    Next time I hope I'll be the one who can offer advice.

    Paul

  • I call mine "Numbers" because Robyn Page and Phil Factor used that in the article that got me using them. http://www.simple-talk.com/sql/t-sql-programming/the-helper-table-workbench/ has a bunch of good stuff on using these tables for things other than just string splitting.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply