The Growth of T-SQL

  • It's difficult to produce such a list of new functionality even when restricted to just those for T-SQL.

    One of my OMG! favorite improvements was when the came out with STRING_AGG() in 2017. I do a whole lot of testing and that has made it super easy to create multi-segment test examples.

    That and, as you mentioned, the proper operation of OVER() for aggregate functions.  That made running totals and "data smears" easily possible and in a reasonable performant manner that works well enough to no longer need the "Quirky Update" to do such things (although there have been some places where I still used it because it is STILL faster).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Comments posted to this topic are about the item The Growth of T-SQL

  • There are also many T-SQL language enhancements introduced in SQL Server 2022 (including Azure SQL Database) worthy of mentioning. Below are some from the SQL Server 2022 What's new page. DATETRUNC(), GENERATE_SERIES(), and SELECT ... WINDOW are the ones I find myself using the most so far.

    New T-SQL clauses:

    - SELECT ... WINDOW

    - IS [NOT] DISTINCT FROM

    New time series functions:

    - DATE_BUCKET ()

    - GENERATE_SERIES ()

    Enhanced Window functions:

    - FIRST_VALUE ()

    - LAST_VALUE ()

    New/enhanced JSON functions:

    - ISJSON ()

    - JSON_PATH_EXISTS ()

    - JSON_OBJECT ()

    - JSON_ARRAY ()

    New aggregate functions:

    - APPROX_PERCENTILE_CONT ()

    - APPROX_PERCENTILE_DISC ()

    New/enhanced T-SQL functions:

    - GREATEST ()

    - LEAST ()

    - STRING_SPLIT ()

    - DATETRUNC ()

    - LTRIM ()

    - RTRIM ()

    - TRIM ()

    New bit manipulation functions:

    - LEFT_SHIFT ()

    - RIGHT_SHIFT ()

    - BIT_COUNT ()

    - GET_BIT ()

    - SET_BIT ()

     

  • I started putting together a list of my favorite T-SQL changes by SQL Server version and found myself wandering deep into the rabbit hole.

    SQL Server 2005

    This is when SQL Server become less of a Scarcely Qualifies as a Language and more of a real enterprise level RDBMS. The game changers for me were:

    • The APPLY table operator
    • Window Functions
    • CLR (which are seldomly faster than the T-SQL way of doing the same thing, but have solved a few tricky issues for me over the years)
    • PIVOT/UNPIVOT also gets an honorable mention, though APPLY can be used for pivoting/unpivoting

    SQL Server 2008

    No game changers but I do like:

    • The VALUES table constructor (SELECT v.SomeThing FROM (VALUES(1),(2),(3)) AS v(SomeThing))
    • GROUPING SETS, ROLLUP, CUBE
    • MERGE

    SQL Server 2012

    2012 was the next big leap forward for T-SQL window functions (so much so that Itzik Ben-Gan wrote a book about them):

    • Window Function framing - Rows/Range
    • FIRST_VALUE, LAST_VALUE, OFFSET FETCH
    • LAG & LEAD

    Other notable mentions include: CONCAT, IIF and EOMONTH

    2012 did include FORMAT, however, which was a step backward. CHOOSE is cool and allows for cleaner code but I don't use it as, no matter what I do, I get an implicit conversion in the execution plan.

    SQL Server 2014

    Nothing special to see here. In-memory objects and columnstore indexes are game changers but don't count as "T-SQL enhancements.

    SQL Server 2016

    The game changer here was STRING_SPLIT. It's nasty fast but did not include an ordinal until 2022. I only used/use STRING_SPLIT for SSRS multi-select parameters passed as strings (where order doesn't matter.) When order matters a tally table splitter does the trick.  DROP IF EXISTs is more of a DDL thing but this was long overdue. STRING_ESCAPE is cool, but I never need.

    SQL Server 2017

    No game changers here but:

    STRING_AGG is cleaner than the FOR XML PATH concatenation trick but, under the hood, it's XML PATH with better handling of reserved XML characters.

    TRIM is cool, it's not just RTRIM+LTRIM as most people think, you can do a lot with it

    DDL - CREATE OR ALTER. This was also long overdue.

    My favorite 2017 new t-SQL was TRANSLATE; TRANSLATE is a straight up gangster function that doesn't get the respect it deserves. I'll expand on this in a moment.

    SQL Server 2019

    Nothing to see here. APPROX_COUNT_DISTINCT? Yahooooo!

    SQL Server 2022

    This guy includes two game changers:

    GENERATE_SERIES (AKA Python RANGE) is a tally table on mild steroids. it's the fastest way to count in SQL Server but and returns an ordered set. This is big. Sadly, there are bugs; more on that in a moment.

    STRING_SPLIT with an ordinal. A correctly developed built-in splitter. And it only took 25 years.

    Not game changers a couple personal favorites:

    • GREATEST and LEAST
    • The WINDOW (alias) clause

    More about TRANSLATE

    TRANSLATE is often confused with a nested replace, but it's not. Let's say I need to transform a DNA sequence to its compliment. I need A's to become T's, T's to become A's, G's to become C's and C's to become G's. For example, AAGGGGTGACTCTAGTTTAATATA needs to be TTCCCCACTGAGATCAAATTATAT. Here's a few attempts using REPLACE:

    DECLARE @SomeDNA VARCHAR(100) = 'AAGGGGTGACTCTAGTTTAATATA'

    SELECT REPLACE(REPLACE(REPLACE(REPLACE(@SomeDNA,'A','T'),'C','G'),'T','A'),'G','C');
    SELECT REPLACE(REPLACE(REPLACE(REPLACE(@SomeDNA,'A','T'),'T','A'),'C','G'),'G','C');
    SELECT REPLACE(REPLACE(REPLACE(REPLACE(@SomeDNA,'G','C'),'T','A'),'C','G'),'A','T');

    Each query returns a different result and they are all wrong. TRANSLATE to the rescue:

    DECLARE @SomeDNA VARCHAR(100) = 'AAGGGGTGACTCTAGTTTAATATA'

    SELECT TRANSLATE(@SomeDNA,'ATCG','TAGC');
    SELECT TRANSLATE(@SomeDNA,'GATC','CTAG');
    SELECT TRANSLATE(@SomeDNA,'CGAT','GCTA');

    Each of these, as expected, returns the correct result. Now if you're not into DNA transformations you can use TRANSLATE for string cleaning. Here I'm removing non-numeric characters  from a unique identifier.

    DECLARE @Somestring VARCHAR(36) = NEWID();

    SELECT REPLACE(TRANSLATE(@SomeString,'ABCDEF-',' '),' ','');

    GENERATE_SERIES

    based on my testing this guy is at least twice as fast as anything I've seen (e.g. fn_tally, dbo.GetNumsAlanCharlieItzikBatch) ...But there are a few glitches that need fixing. For example look what happens with window aggregate functions and variables; these should both return the same value but don't:

    DECLARE @low INT = 1, @high INT = 10;

    SELECT
    N = t.[value],
    SO = SUM(t.[value]) OVER (ORDER BY t.[value])
    FROM GENERATE_SERIES(10,1) AS t;

    SELECT
    N = t.[value],
    SO = SUM(t.[value]) OVER (ORDER BY t.[value])
    FROM GENERATE_SERIES(@high,@low) AS t;

    Without Variables:

    N           SO
    ----------- -----------
    1 1
    2 3
    3 6
    4 10
    5 15
    6 21
    7 28
    8 36
    9 45
    10 55

    With variables:

    N           SO
    ----------- -----------
    10 55

    There are a few more weird things like this I hope get fixed. GENERATE_SERIES is a game changer nonetheless.

    • This reply was modified 1 year, 10 months ago by  Bernie156.

    I'm not the guy

  • I just posted the above post twice because the first post was:

    This was removed by the editor as SPAM

    You all need to pull it together man!

    I'm not the guy

  • Bernie156 wrote:

    I just posted the above post twice because the first post was:

    This was removed by the editor as SPAM

    You all need to pull it together man!

    Thanks for the list. With the links in there and your low post count, the automated system likely saw this as potential SPAM. Always tricky to decide how to handle things like this. I've unmarked the first one and deleted the second to help train the system better.

    If you want to write an article on how you use the game changers, for 2022 or any version, we'd love to see some short, practical looks at how people implement new functions in code.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply