replace chain of function calls with better code

  • We have a scalar function that is used to add a specific number of business days to a date however to dev who wrote it has gone overkill on the use of it and crashed the ETL - it now never finishes.

    We have approx 75k records in the table in question and depending on certian values the function can be called anywhere from 12 to 28 times per record (remeber 75k records).

    I need to refactor this code ASAP and I am wondering about the most performant way to achieve  it.

    I am thinking possibly using cross apply but unsure if it would gove any performace boost.

    Does anyone have any ideas?

     

    Cheers,

    Dave

  • Can you show us the code for the function?

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Agreed, we need to see the code.

    And, do you have a table with non-workdays in it?  We'll need to know the DDL for that table to properly determine business days.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • To add to everyone else asking about the code, are you sure it is the function that "crashed the ETL - it now never finishes."?  I am wondering if MAYBE you have some blocking going on at the table level causing it to not finish.  Might not hurt to review the execution plan as well.

    Might not hurt to look at how that function is called... 12-28 times per record MIGHT make sense to have the data tossed into a persisted column rather than repeated calls to the function.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • david_h_edmonds wrote:

    We have a scalar function that is used to add a specific number of business days to a date however to dev who wrote it has gone overkill on the use of it and crashed the ETL - it now never finishes.

    We have approx 75k records in the table in question and depending on certian values the function can be called anywhere from 12 to 28 times per record (remeber 75k records).

    I need to refactor this code ASAP and I am wondering about the most performant way to achieve  it.

    I am thinking possibly using cross apply but unsure if it would gove any performace boost.

    Does anyone have any ideas?

    Cheers,

    Dave

    CROSS APPLY with a scalar function is like drinking to celebrate sobriety. 😀

    We're all asking to see the function because that will explain to us WHAT you need to do and we'll figure out a better way HOW to do it.

    If you're looking elsewhere for a solution and someone uses a recursive CTE to do it, forget it.  A While loop is faster than that.  Seriously... been there, done that.

    Also, let us know if you have a "Calendar" table and/or a "Holiday" table and, if you do, post the CREATE TABLE statement for the ones you have.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Morning all, sincere apologies for the radio silence.

    The function is as follows:

    create function 
    dbo.mydatefunction (@Date date, @DaysToAdd int)


    returns Date as
    begin

    declare @Dates table
    (
    Date Date,
    Holiday Int,
    Weekend Int
    )

    Insert into @Dates
    Select
    Date,
    Is_Holiday,
    Is_Weekend
    from
    (select
    Date,
    Is_Holiday,
    Is_Weekend,
    Row_Number() over (Order by Date Asc) as rn
    from
    dbo.D_Date
    where
    Date > @Date and Is_Holiday = 0 and Is_Weekend = 0
    ) as data
    where rn <= @DaysToAdd

    declare @ReturnDate Date = (Select MAX(Date) from @Dates);
    Return @ReturnDate
    End;

    Cheers,

    Dave

  • There are some very clever people in this forum who will certainly come up with great ideas to help you. But here is my first attempt at speeding things up.

    First, create a numbered VIEW over your date table:

    CREATE VIEW dbo.mydateview WITH SCHEMABINDING
    AS
    SELECT d.Date
    ,rn = ROW_NUMBER() OVER (ORDER BY d.Date ASC)
    FROM dbo.D_Date d
    WHERE d.Is_Holiday = 0
    AND d.Is_Weekend = 0;

    Next, change your function to be an iTVF which references this view. No need for any table variables nor inequalities in the WHERE clause:

    CREATE FUNCTION dbo.mydatefunction
    (
    @Date DATE
    ,@DaysToAdd INT
    )
    RETURNS TABLE
    AS
    RETURN
    (
    SELECT Date = LEAD(d.Date, @DaysToAdd) OVER (ORDER BY d.Date)
    FROM dbo.mydateview d
    WHERE d.Date = @Date
    );

     

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Make sure the D_Date table is uniquely clustered on Date (it should already be, but verify, just in case).  I'd stick with a scalar function for now, but get rid of all the wasted things being done in the code.

    SET ANSI_NULLS ON;
    SET QUOTED_IDENTIFIER ON;
    GO
    CREATE FUNCTION dbo.mydatefunction (
    @Date date,
    @DaysToAdd int
    )
    RETURNS date
    AS
    BEGIN
    RETURN (
    SELECT MAX(Date) AS Date
    FROM (
    SELECT TOP (@DaysToAdd) Date
    FROM dbo.D_Date
    WHERE Date > @Date AND
    Holiday = 0 AND
    Weekend = 0
    ORDER BY Date
    ) AS derived
    )
    /*end of function*/
    END

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • If you're using this function against multiple rows in a query, there's no way that I'd leave it as a scalar function.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I would definitely convert that scalar function to an inline-table valued function.  You don't actually need a separate view to implement @Phil's solution - which can be done using either a CTE or derived-table to number the rows.

    It would also help to see the code that is using this function - and determine why it would be called for multiple columns.  Do you really have 12 to 28 different dates in a single row where you need to calculate a future working date?

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • I echo what the others are saying.  Can you provide some sample data as well as how you are determining the @DaysToAdd  parameter for each date?

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • This was removed by the editor as SPAM

  • The horrible truth is that our calendar system is very irregular in our definition of the business day is even worse. Trying to do some temporal calculations like this with the function will not work. I strongly recommend that you do a calendar table that includes an ordinal day and an ordinal business day. The DATE data type is small and you can put a few centuries in a relatively small table.

    Just as an exercise, assume that your company does not use Easter as a business day. Have you ever seen the computations required to compute Easter? But wait, it gets worse (I have a personal horror story on this one); which Easter? Catholic or Orthodox? Some years they match up. Some years they do not match up.. And sometimes business is called and postponed for things like, oh I don't know,say 9/11 or holidays set by decree.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • I very strongly prefer a separate "nonwork_days" table.  The table needs nothing only the date and a tinyint reason code for why that date is a nonwork day.

    The big problem with a calendar table being used to determine nonwork days is that people rely on a host of flags to determine work days, viz:

    SELECT COUNT(*) AS workdays_count FROM dbo.calendars WHERE is_weekend = 0 AND is_holiday = 0 ...

    Now you have code in dozens or hundreds or more places each determining work days.  Bad idea.  Even more critically, what happens then if you need to work on a weekend day for some reason?  It's almost impossible to do cleanly.  What if you need to work on a holiday because of an emergency? (such as COVID or a fire or flood).  Truly nearly impossible to do when code in hundreds of places is relying on "is_holiday" to control work logic.

    You naturally can use a calendar table for other purposes, such as determining fiscal year, etc.  But a basic calendar table should not be used for determining work days.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • I second the "nonwork_days" table, but I would make sure to design the reason code table well.

    You would want to make sure that the reason code table has a reason code identifier, a reason code description, and a reason code location.  The location may need to be broken up into multiple columns to handle country, state/province, and potentially city.  This will allow you to quickly and easily account for holidays in the USA vs CANADA vs JAPAN vs CHINA vs etc.

     

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply