Opinions on handling date ranges with unknown limits?

  • In my system there are a lot of programs for people to belong to, a lot of periods of information, a good number of StartDate/EndDate.

    Right now a period record always starts when the StartDate (the date they begin a program, the start of a period of time, etc) and in most cases we don't know when this period ends.

    In this case we're using NULL to represent that this data is unknown. This has the following quirk:

    WHERE GETDATE() BETWEEN StartDate AND ISNULL(EndDate,'9/9/9999')

    OR

    WHERE ((GETDATE() >= StartDate) AND ((EndDate IS NULL) OR (GetDate() <= EndDate))

    We always have to account for a null value with an absurd out of range date.

    I've seen people just use that absurd out of range date as a default value. If you see 9/9/9999, you know that it's not happened yet... unless you're checking on a period status as of 9/9/9999. This eliminates the need for an ISNULL or complex enddate logic.

    In many cases I know the values will at the latest be as of a date, but likely they won't go that far. If I default that date you would just need to change the date if they leave a period sooner, but you honestly wouldn't know if that date was chosen or if a glitch caused it to remain. I'm not a fan of valid defaults that people can change if they like for that reason

    I know it's not exactly good DB design to be defaulting to fake data, but I'm not sure that leaving it null is always a better solution. Seems cleaner from a data standpoint, but not from a usability standpoint.

    Opinions are nice, if anyone has any performance/ease of use advice for this, then that would be great.

  • ShawnTherrien (5/30/2011)


    In my system there are a lot of programs for people to belong to, a lot of periods of information, a good number of StartDate/EndDate.

    Right now a period record always starts when the StartDate (the date they begin a program, the start of a period of time, etc) and in most cases we don't know when this period ends.

    In this case we're using NULL to represent that this data is unknown. This has the following quirk:

    WHERE GETDATE() BETWEEN StartDate AND ISNULL(EndDate,'9/9/9999')

    OR

    WHERE ((GETDATE() >= StartDate) AND ((EndDate IS NULL) OR (GetDate() <= EndDate))

    We always have to account for a null value with an absurd out of range date.

    I've seen people just use that absurd out of range date as a default value. If you see 9/9/9999, you know that it's not happened yet... unless you're checking on a period status as of 9/9/9999. This eliminates the need for an ISNULL or complex enddate logic.

    In many cases I know the values will at the latest be as of a date, but likely they won't go that far. If I default that date you would just need to change the date if they leave a period sooner, but you honestly wouldn't know if that date was chosen or if a glitch caused it to remain. I'm not a fan of valid defaults that people can change if they like for that reason

    I know it's not exactly good DB design to be defaulting to fake data, but I'm not sure that leaving it null is always a better solution. Seems cleaner from a data standpoint, but not from a usability standpoint.

    Opinions are nice, if anyone has any performance/ease of use advice for this, then that would be great.

    A "null" is not an actual value but a marker telling that the value of such attribute is either unknown or not yet assigned.

    In the particular case of EndDate, when value is not yet assigned I prefer to set by default a "custom made null marker" e.g. 12/31/9999 which will tell me EndDate value is not yet assigned for that particular row and will also easy the pain of querying for StartDate/EndDate ranges.

    Hope this helps.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • I'm not a fan of sentinel values for a lot of the reasons you mentioned. What about having a status column, along the lines of PeriodIsEnded, or something broader like PeriodStatus. This will allow you to infer that your end date is null. The column could participate in indexes and be useful in any predicates where today you're wrapping your end date column in a function call.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • PS The column does not have to be persisted either. A computed column would keep the footprint of the table down and still qualify for inclusion in an index.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply