What exactly are window functions and partitions?

  • Hi everyone,
    I've been reading about ranking functions in the T-SQL docs and I'm still unclear on the concepts of window functions and partitions in this context. This article suggests window functions are analytical functions, whereas I can't seem to be able to find a good explanation on partitioning as sectioning off a result set.
    Would you be as kind as to point me in the right direction?
    Thanks in advance.

  • You can think of some of the Windowing Aggregates as a "per row GROUP BY".  The object of the PARTITION BY is like the list of GROUP BY column(s).

    Here's an example of how they're used to simplify analysis compared to using GROUP BY and other things.  Then, let your imagination go wild with some of the things you can do with them.


    --===== If the test table already exists, drop it to make rerun in SSMS easier.
         If OBJECT_ID('tempdb..#TestTable','U') IS NOT NULL
       DROP TABLE #TestTable
    ;
    --===== Create and populate a test table on-the-fly.
     SELECT v.*
       INTO #TestTable
       FROM (VALUES
             (1,1),(1,2),(1,3),(1,4)
            ,(2,4),(2,4),(2,6),(2,6)
            )v(SomeID,SomeValue)
    ;
    --=====  For each row, calculate the percent of total and total by ID
         --  and calculate the overall rank and the rank by ID.
         --  The term "by ID" is an indication of where PARTITION BY should be used.
       WITH ctePreAggregate AS
    (
     SELECT  SomeID
            ,SomeValue
            ,GrandTotal = SUM(SomeValue) OVER ()
            ,TotalPerID = SUM(SomeValue) OVER (PARTITION BY SomeID)
       FROM #TestTable
    )
     SELECT  SomeID
            ,SomeValue
            ,GrandTotal
            ,PercentOfGrandTotal = SomeValue*1.0/GrandTotal
            ,OverallRank         = DENSE_RANK() OVER (ORDER BY (SomeValue*1.0/GrandTotal))
            ,TotalPerID
            ,PercentOfIdTotal    = SomeValue*1.0/TotalPerID
            ,RankById            = DENSE_RANK() OVER (PARTITION BY SomeID ORDER BY (SomeValue*1.0/GrandTotal))
       FROM ctePreAggregate
    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ed Wagner - Sunday, January 14, 2018 8:37 PM

    Itzik Ben-Gan wrote a great reference on window functions and I'd recommend it to anyone.  It's at https://www.amazon.com/Microsoft-High-Performance-Functions-Developer-Reference/dp/0735658366.

    The book that Ed referenced is well worth buying,.. That said Itzik also as a 3 part series he did back in early 2012 that covers the basics really well...
    http://www.itprotoday.com/microsoft-sql-server/how-use-microsoft-sql-server-2012s-window-functions-part-1
    http://www.itprotoday.com/microsoft-sql-server/microsoft-sql-server-2012-how-write-t-sql-window-functions-part-2
    http://www.itprotoday.com/microsoft-sql-server/sql-server-2012-how-write-t-sql-window-functions-part-3

  • Thank you everyone!
    Your answers have been very helpful!

  • In your data you may have sales where State is a column, along with storeName. If you want want to rank your best selling stores you can sum sales and group by State and storeName. Where a windowed function would help you is to add another layer of context to the results you have now. If you want to pull your top three best stores in each state, a windowed function can easily be used. Similar if you want to see your three least performing stores per state/territory. It looks at your group you defined (state, store) as how to partition , and then takes a measure  like sum of sales to order by asc or descending.

    Another great way to add context to your results is have a column to list some aggregate across that partition (as I defined above). One such aggregate would be averageSalesAmountInState. So every row in that group will have the same value for this column. Which can then be used to easily see those stores with above/below average sales. Again ... context to your already grouped results. Think of it all as adding a form of meta data to your result set.

    ----------------------------------------------------

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply