Calendar table to transaction table join performance

  • Greetings, SQL Experts,
    I encountered a performance issue that caught me by surprise. I'm joining two tables. One is calendar table with one row per day, and an indexed datetime type DateKey column. The other is  a transaction table with a TransactionTime column that is unindexed smalldatetime. If it matters, the granularity of the TransactionTime column is minutes - the :ss portion of the smalldatetime field is always :00. 

    This query runs in a fraction of a second:

    Declare @startdate smalldatetime
    set @startdate = '2017-01-01'

    Select    [Date] = b.DateKey
            , [Plan Week] = b.PlanningWeekNo
            , [Plan Day of Week] = b.PlanningDayOfWeekNo
            , Transactions = count(a.TransactionKey)
    from MyDB..Transactions a
        join MyDB..Calendar b
        on cast(convert(varchar, a.TransactionTime, 101) as smalldatetime) = b.DateKey --trim hh:mm:ss from TransactionTime and compare to DateKey
        and a.TransactionTime >= @startdate --for transactions on or after startdate
    group by DateKey
            , b.PlanningWeekNo
            , b.PlanningDayOfWeekNo

    However, if I move the date filter to the calendar table like so: 
    Declare @startdate smalldatetime
    set @startdate = '2017-01-01'

    Select    [Date] = b.DateKey
            , [Plan Week] = b.PlanningWeekNo
            , [Plan Day of Week] = b.PlanningDayOfWeekNo
            , Transactions = count(a.TransactionKey)
    from MyDB..Transactions a
        join MyDB..Calendar b 
        on cast(convert(varchar, a.TransactionTime, 101) as smalldatetime) = b.DateKey --trim hh:mm:ss from TransactionTime and compare to DateKey
        and b.DateKey >= @startdate --for dates after startdate
    group by DateKey
            , b.PlanningWeekNo
            , b.PlanningDayOfWeekNo

    It takes about 18 seconds to run. Huh? I was really expecting the second one to perform as well or better, because the filter condition is applied to fewer rows in the calendar table.

    Looking forward to your insight. Thanks!

    G

  • Did you compare the execution plans of the two different queries?

    Sue

  • George W (no relation) - Thursday, November 30, 2017 4:17 PM

    joining on a column wrapped in CAST? Why would you do that?? That means the entire column must be converted! Create a new column that matches the datatype of the column in the other table, and index it, then do the join.

  • Looks like you are using SQL Server 2016, try this:
    DECLARE @startdate SMALLDATETIME;
    SET @startdate = '2017-01-01';

    SELECT
      [Date]               = .[DateKey]
      , [Plan Week]        = .[PlanningWeekNo]
      , [Plan Day of Week] = .[PlanningDayOfWeekNo]
      , [Transactions]     = COUNT([a].[TransactionKey])
    FROM
      [MyDB].[dbo].[Transactions]  [a] --SPECIFY YOUR SCHEMA!
      INNER JOIN [MyDB].[dbo].[Calendar] --SPECIFY YOUR SCHEMA!
        ON CAST([a].[TransactionTime] AS DATE) = .[DateKey] --trim hh:mm:ss from TransactionTime and compare to DateKey
           AND [a].[TransactionTime]          >= @startdate --for transactions on or after startdate
    GROUP BY
      [DateKey]
      , .[PlanningWeekNo]
      , .[PlanningDayOfWeekNo];

  • And this is the other query:
    DECLARE @startdate SMALLDATETIME;
    SET @startdate = '2017-01-01';

    SELECT
      [Date]               = .[DateKey]
      , [Plan Week]        = .[PlanningWeekNo]
      , [Plan Day of Week] = .[PlanningDayOfWeekNo]
      , [Transactions]     = COUNT([a].[TransactionKey])
    FROM
      [MyDB].[dbo].[Transactions]  [a]
      JOIN [MyDB].[dbo].[Calendar]
        ON CAST([a].[TransactionTime] AS DATE) = .[DateKey] --trim hh:mm:ss from TransactionTime and compare to DateKey
           AND .[DateKey]                  >= @startdate --for dates after startdate
    GROUP BY
      [DateKey]
      , .[PlanningWeekNo]
      , .[PlanningDayOfWeekNo];

    Would be interesting to see the actual execution plans of both. You can post them as .sqlplan files.

  • Make that join SARGable:

    --However, if I move the date filter to the calendar table like so:

    Declare @startdate smalldatetime

    set @startdate = '2017-01-01'

    SELECT [Date] = b.DateKey

    , [Plan Week] = b.PlanningWeekNo

    , [Plan Day of Week] = b.PlanningDayOfWeekNo

    , Transactions = count(a.TransactionKey)

    FROM MyDB..Transactions a

    JOIN MyDB..Calendar b

    ON CAST(a.TransactionTime AS DATE) = b.DateKey --trim hh:mm:ss from TransactionTime and compare to DateKey

    WHERE b.DateKey >= @startdate --for dates after startdate

    GROUP BY DateKey

    , b.PlanningWeekNo

    , b.PlanningDayOfWeekNo

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • try filtering in a WHERE not in the join?
    also, check for implicit conversion?

    ChrisM@Work + 1 for SARGable!

  • It can't hurt to apply the limiting condition to both tables:

    WHERE a.TransactionTime >= @startdate
    and b.DateKey >= @startdate
    and ...

    Maybe since the DateKey is datetime, we should cast to datetime rather than date, although SQL 2016 may compensate for that anyway and still be able to treat it as sargable.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher - Friday, December 1, 2017 7:46 AM

    It can't hurt to apply the limiting condition to both tables:

    WHERE a.TransactionTime >= @startdate
    and b.DateKey >= @startdate
    and ...

    Maybe since the DateKey is datetime, we should cast to datetime rather than date, although SQL 2016 may compensate for that anyway and still be able to treat it as sargable.

    This is SARGable as far back as SQL 2012, at least:

    CAST(a.TransactionTime AS DATE) = b.DateKey --trim hh:mm:ss from TransactionTime and compare to DateKey

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks to everyone who responded. Great info. Unfortunately, I apparently don't have "Show Plan" permissions on the Transaction table at the moment, or I would post the plan. But Lynn, Scott, ChrisM, and anyone else who suggested sargability was the issue seem to have the answer. Using "CAST(a.TransactionTime AS DATE)" in place of my original  "cast(convert(varchar, a.TransactionTime, 101) as smalldatetime)"  eliminated the difference in performance between the two queries. 

    pietlinden, I don't have permission to alter these tables. Fortunately, "CAST" meets my performance needs in this case, allowing me to group 150,000 transaction rows by week in under 1 second. 
    Thanks again for your expertise!

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply