The SQL Server LAG Function

  • Comments posted to this topic are about the item The SQL Server LAG Function

    Mike Byrd

  • Nice read, Mike.   Short, clear, and concise.    Supported by execution plans and time/io stats.    This is what I like to see even though I've used lag() for years.   Good job!!

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Thanks very much for the article. The performance differences captured and quite impressive.

    Wondering whether we can "LAG to more than one row behind" (haven't looked at the syntax though).

    Br. Kenneth Igiri
    https://kennethigiri.com
    All nations come to my light, all kings to the brightness of my rising

  • Yes, the lag function can offset more than row.  Syntax is (from Microsoft):
    LAG (scalar_expression [,offset] [,default]) 
        OVER ( [ partition_by_clause ] order_by_clause )

    Arguments

    scalar_expression
    The value to be returned based on the specified offset. It is an expression of any type that returns a single (scalar) value. scalar_expression cannot be an analytic function.

    offset
    The number of rows back from the current row from which to obtain a value. If not specified, the default is 1. offset can be a column, subquery, or other expression that evaluates to a positive integer or can be implicitly converted to bigint. offset cannot be a negative value or an analytic function.

    default
    The value to return when scalar_expression at offset is NULL. If a default value is not specified, NULL is returned. default can be a column, subquery, or other expression, but it cannot be an analytic function. default must be type-compatible with scalar_expression.

    OVER ( [ partition_by_clause ] order_by_clause)
    partition_by_clause divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group. order_by_clause determines the order of the data before the function is applied. If partition_by_clause is specified, it determines the order of the data in the partition. The order_by_clause is required. For more information, see OVER Clause

    Mike Byrd

  • Mike Byrd - Monday, January 14, 2019 3:02 PM

    Yes, the lag function can offset more than row.  Syntax is (from Microsoft):
    LAG (scalar_expression [,offset] [,default]) 
        OVER ( [ partition_by_clause ] order_by_clause )

    And the syntax specified here is WRONG.  The correct syntax is
    LAG (scalar_expression [,offset [,default]]) 
        OVER ( [ partition_by_clause ] order_by_clause )

    The syntax on MS's site says that the offset and default are independent of each other, when they are not.  You can only specify the default if you have also specified the offset.

    Drew

    PS: If you having trouble seeing the difference, I moved the bracket after 'offset".

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Mike Byrd - Monday, January 14, 2019 3:02 PM

    Yes, the lag function can offset more than row.  Syntax is (from Microsoft):
    LAG (scalar_expression [,offset] [,default]) 
        OVER ( [ partition_by_clause ] order_by_clause )

    Arguments

    scalar_expression
    The value to be returned based on the specified offset. It is an expression of any type that returns a single (scalar) value. scalar_expression cannot be an analytic function.

    offset
    The number of rows back from the current row from which to obtain a value. If not specified, the default is 1. offset can be a column, subquery, or other expression that evaluates to a positive integer or can be implicitly converted to bigint. offset cannot be a negative value or an analytic function.

    default
    The value to return when scalar_expression at offset is NULL. If a default value is not specified, NULL is returned. default can be a column, subquery, or other expression, but it cannot be an analytic function. default must be type-compatible with scalar_expression.

    OVER ( [ partition_by_clause ] order_by_clause)
    partition_by_clause divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group. order_by_clause determines the order of the data before the function is applied. If partition_by_clause is specified, it determines the order of the data in the partition. The order_by_clause is required. For more information, see OVER Clause

    Thanks a lot, Mike. Will definitely explore it more.

    Br. Kenneth Igiri
    https://kennethigiri.com
    All nations come to my light, all kings to the brightness of my rising

  • this I have to investigate more thoroughly. I have a number of subqueries in a long running query where I am trying to retrieve the first, the last or the most recent. Your article with the link has led me to several other window functions that could be my solution to speed things up.

    Thank you thank you thank you.

    Luther

  • Useful article, thanks.

    • This reply was modified 5 years, 8 months ago by  Andy Hogg.
    • This reply was modified 4 years, 12 months ago by  Andy Hogg.
    • This reply was modified 4 years, 12 months ago by  Andy Hogg.
    • This reply was modified 4 years, 12 months ago by  Andy Hogg.
    • This reply was modified 4 years, 12 months ago by  Andy Hogg.
    • This reply was modified 4 years, 12 months ago by  Andy Hogg.
  • Thanks for taking the time to write a nice simple article on the subject of LAG.

    Shifting gears as a bit of a suggestion...
    While a graphical representation of code sure is pretty, it can't be copied and pasted as code to test. 
    You might want to consider sending the code to Steve so that he can attach it as a file.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Great article - nice and concise as others have noted. 
    We are using the Lag operator a lot in our work and find it not only improves performance, but readability as well once you are used to the syntax.

    Thanks

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply