Querying rows by Date with StartDate and EndDate fields (aka What would Jeff Moden do?)

  • I'm dealing with healthcare data where a table maintains a list of patient_ids, associated provider_ids, and relationship StartDate/EndDates. Therefore, each row reflects how long a given patient has been associated with a physician.

    When I query the table for a given date, I basically use a query that looks as follows:

    Select PatientID, ProviderId

    From tblPatientProvider

    Where @Date between StartDate and EndDate

    When I join tblPatientProvider to several other tables, joined by PatientID & ProviderID, I find myself wondering if query performance suffers from the "@Date between StartDate and EndDate" clause.

    Even though the data in tblPatientProvider contains separate StartDate/EndDate fields, I keep thinking that maybe there's a better way to query the data.

    Jeff Moden's various articles/posts on SSC warn against hidden triangular joins, RBAR and such... So, is the above example query "OK" or is there a way to optimize it?

    In general, performance is satisfactory, but I wouldn't mind ensuring that it won't hit the wall when several millions of rows are involved.

    Thanks in advance,

    Pete

  • WWJMD?

    Jeff would probably drive the moneylenders from the temple with pork chops flung freely.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • You are not wrapping the date columns themselves in a function, which is the usual performance killer.

    Using BETWEEN with dates can be perilous, for example if you have a @date of 1/10/2009 and date column contains date and time, then you will miss all the rows with a time greater than midnight on that date. (See below)

    Adjust @date (not the columns) to make sure you are prepared for that eventuality, even if it means abandoning BETWEEN in favor of startdate <= @date1 and endDate >= @date2.

    -- @date not BETWEEN

    select cast('1/10/2009' as datetime) as [@date]

    ,dateadd(second,1,cast('1/10/2009' as datetime)) as dateCol

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Heh... don't forget the false physicians. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Bob, thanks for the heads up on the date Between gotcha. Fortunately, we've got that covered -- we store all clinical related dates (e.g., admission, discharge, date of diagnosis) as YYYYMMDD -- no need to know the exact time for those situations.

    Again, the reason for my post is merely to hear feedback from good folks like you who might see a better way to code.

    ("WWJMD?" -- You're abbreviation makes for a big laugh! Speak of the devil, nice to see he paid a visit 😉

    --Pete

  • Dave, thanks for the link to the article on uses/abuses of datetime querying. The author certainly took the time to investigate a myriad of pitfalls. Surprisingly timely!

    --Pete

  • peterzeke (10/20/2009)


    Bob, thanks for the heads up on the date Between gotcha. Fortunately, we've got that covered -- we store all clinical related dates (e.g., admission, discharge, date of diagnosis) as YYYYMMDD -- no need to know the exact time for those situations.

    Again, the reason for my post is merely to hear feedback from good folks like you who might see a better way to code.

    ("WWJMD?" -- You're abbreviation makes for a big laugh! Speak of the devil, nice to see he paid a visit 😉

    --Pete

    Hold on - you are storing dates as characters? Is the data type of the column datetime or char(8)? If char(8) - and always formatted as YYYYMMDD, then I would make sure your variable @date is also char(8). Using between here would not be a problem...

    Either way - make sure your variable and columns are of the same data type or you will end up with implicit conversions that could end up blocking you from utilizing any indexes on those columns.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply