Can anyone explain the design decision here?

  • I've been asked to take over maintenance of a system (the previous dev has left).  I've been looking at the code and I've found this function:

    ALTER FUNCTION [DP].[GetYears](@StartYear AS INT) RETURNS TABLE

    AS

    RETURN

    WITH
    L0 AS(SELECT 1 AS c UNION ALL SELECT 1),
    L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
    L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
    L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
    L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
    L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
    Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5),
    StartOfMonth AS(
    SELECT
    DATEADD(YEAR,n-1, CAST(@StartYear as char(4)) + '01' + '01' ) AS StartDate
    FROM
    Nums
    WHERE
    n <= (1+year(getdate()) - @StartYear))

    SELECT
    SOM.StartDate,
    Year(SOM.StartDate) AS [Year]
    FROM
    StartOfMonth SOM;

    If you pass in 2017, it will return a table with the first date of the year in one column and the year in another column from 2017 to the current year.  But I have literally no idea of what the original designer had in mind with the L0...L5 and Nums CTEs.

    • This topic was modified 3 years, 9 months ago by  edwardwill.
  • It is an inline number/tally table which you can see if you SELECT * FROM Nums.

    Number/Tally tables are very useful and are well documented online.

    • This reply was modified 3 years, 9 months ago by  Ken McKelvey.
  • In general, agreed.  Although that specific implementation of an inline tally table is, frankly, horrible.  There's no need for anything remotely close to that number of rows (exceeding 2B for l5) for years, and the cte names are meaningless.

    Here's a more practical inline tally table for this function.


    WITH
    tally10 AS (
    SELECT * FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS numbers(number)
    ),
    tally1000 AS (
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) - 1 AS number
    FROM tally10 c1 CROSS JOIN tally10 c2 CROSS JOIN tally10 c3
    ),
    StartOfMonth AS (
    ...

     

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • The place to start imo is with this very awesome article from Jeff.  I agree the code is not efficient.  The key is to have a 'row goal' in the form of SELECT TOP(n).  Here's an attempt to refactor

    drop function if exists dbo.fnTest_GetYears;
    go
    create function dbo.fnTest_GetYears(
    @StartYear int)
    returns table
    as
    return
    with
    n(n) as (select null
    from (values (1),(2),(3),(4)) n(n))
    select top(year(getdate())-@StartYear+2)
    dateadd(year,(row_number() over(order by (select null)))-1, datefromparts(@StartYear, 1, 1)) AS StartDate
    from n na, n nb, n nc, n nd, n ne, n nf, n ng, n nh,
    n ni, n nj, n nk, n nl, n nm, n np, n nq, n nr;
    go

    select * from dbo.fnTest_GetYears(2010);
    StartDate
    2010-01-01
    2011-01-01
    2012-01-01
    2013-01-01
    2014-01-01
    2015-01-01
    2016-01-01
    2017-01-01
    2018-01-01
    2019-01-01
    2020-01-01
    2021-01-01
    2022-01-01

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • ScottPletcher wrote:

    There's no need for anything remotely close to that number of rows (exceeding 2B for l5) for years

     

    But what if I need to know the start of every year since dinosaurs roamed the earth?

  • ZZartin wrote:

    ScottPletcher wrote:

    There's no need for anything remotely close to that number of rows (exceeding 2B for l5) for years

    But what if I need to know the start of every year since dinosaurs roamed the earth?

    Ah, good point I guess.  Maybe 'cause we are doing bios of Pelosi and Biden? 😉

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Yes plenty of time.  Lots of time.  Eons and eons.  More than Ten Centuries 🙂  Enough to fill BIGINT.  It could be dialed back as appropriate.

    • This reply was modified 3 years, 9 months ago by  Steve Collins.

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • ScottPletcher wrote:

    ZZartin wrote:

    ScottPletcher wrote:

    There's no need for anything remotely close to that number of rows (exceeding 2B for l5) for years


    But what if I need to know the start of every year since dinosaurs roamed the earth?

    Ah, good point I guess.  Maybe 'cause we are doing bios of Pelosi and Biden? 😉

     

    Don't forget the tortoise that escaped the galapagos McConnell 😛

     

  • For cleaner code you can leverage fnTally. Since fnTally leverages TOP, you can exploit row goals as Steve Collins mentioned above.

    CREATE OR ALTER FUNCTION dbo.getYears(@year INT)
    RETURNS TABLE WITH SCHEMABINDING AS RETURN
    SELECT Yr = CAST(CONCAT(@year+tt.N,'0101') AS DATE)
    FROM dbo.fnTally(0, YEAR(GETDATE())-@year) AS tt;
    GO

    Itzik Ben-Gan is doing a series on the fastest number generator. Its worth a read since its a topic discussed in this thread.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • What about opportunity cost?  The tally is used here to gen years, so the number is inherently very small and will always be so.  It is really worth the time to ultra-tune something like that?  I don't think so.  I say spend the time analyzing your largest tables and their indexes, that would have a far better payback overall.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • edwardwill wrote:

    I've been asked to take over maintenance of a system (the previous dev has left).  I've been looking at the code and I've found this function:

    If you pass in 2017, it will return a table with the first date of the year in one column and the year in another column from 2017 to the current year.  But I have literally no idea of what the original designer had in mind with the L0...L5 and Nums CTEs.

    There are a couple of issues with that code...

    1. It'll be slower than necessary because of concatenation.
    2. It returns nothing if a year > the year of GETDATE() is provided.
    3. It's not documented, which is the reason for your original question and shouldn't be allowed to continue in such a fashion.

    Alan is correct that the use of fnTally can make things a whole lot easier to code.  The issues with his code is that it doesn't return both columns that your original code returned and it will also return the following error if @year has a value greater than the year found in GETDATE().

    Msg 3623, Level 16, State 1, Line 8

    An invalid floating point operation occurred.

    With that in mind, what do you want to be returned if @year has a value > the year of GETDATE()?

    p.s.  The error the code produces may actually be a very good thing because it will let you know you did something wrong with the data you passed to it.  I consider the nothing that the original code returned to be a "silent failure", which may or may not be a good thing and that's why I ask the question.

    • This reply was modified 3 years, 9 months ago by  Jeff Moden. Reason: Added the p.s. for clarification

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • DECLARE @StartYear SMALLINT = 2017;

    SELECT DATETIMEFROMPARTS(@StartYear + number, 1, 1, 0, 0, 0, 0) AS StartDate,
    @StartYear + number AS Year
    FROM master.dbo.spt_values
    WHERE type = 'P'
    AND number <= DATEPART(YEAR, GETDATE()) - @StartYear;

    Since you take a startyear and calculate every year up to current year, you will probably never set a date that is more than 2,000 years before current year...


    N 56°04'39.16"
    E 12°55'05.25"

  • SwePeso wrote:

    DECLARE @StartYear SMALLINT = 2017;

    SELECT DATETIMEFROMPARTS(@StartYear + number, 1, 1, 0, 0, 0, 0) AS StartDate,
    @StartYear + number AS Year
    FROM master.dbo.spt_values
    WHERE type = 'P'
    AND number <= DATEPART(YEAR, GETDATE()) - @StartYear;

    Since you take a startyear and calculate every year up to current year, you will probably never set a date that is more than 2,000 years before current year...

    But don't reference the master db.  That's lazy.  Yes, I know MS did it in their code, but that doesn't make it a good idea.  For example, what if they later remove the spt_values table?  Or, much worse, remove some rows so that not every number is present?  Then you'd suddenly get incorrect results that would be very difficult to debug.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • The point is not using exactly that table, it could be another table. The point was to demonstrate the logic in the original query, in a easier fashion.


    N 56°04'39.16"
    E 12°55'05.25"

  • To generate a sequence why not use an explicit virtual table of the numbers themselves?   Here's a .txt file with a sql script and it defines a tvf called dbo.fn1k which takes a smallint as input and contains a virtual table from which it selects the top 0-1000 rows.  Total row count is 1,001.  It was generated as a rectangle of text using a spreadsheet.

    Then similar to Alan's code except I like DATEFROMPARTS so I use that.

    SELECT Yr = CAST(CONCAT(@year+tt.N,'0101') AS DATE) 
    FROM dbo.fnTally(0, YEAR(GETDATE())-@year) AS tt;

    with

    select yr = datefromparts(@year + f.n, 1, 1)
    from dbo.fn1k(year(getdate())-@year+2) f;

    Then it returns the same result.  Is it a good way?  Idk

    create or alter function dbo.getYears(@year smallint)
    returns table with schemabinding as return
    select yr = datefromparts(@year + f.n, 1, 1)
    from dbo.fn1k(year(getdate())-@year+2) f;
    go

    select * from dbo.getYears(2018);
    yr
    2018-01-01
    2019-01-01
    2020-01-01
    2021-01-01
    2022-01-01
    Attachments:
    You must be logged in to view attached files.

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

Viewing 15 posts - 1 through 15 (of 36 total)

You must be logged in to reply to this topic. Login to reply