Scalar function too slow - Calculated field in column based on another columns

  • I have a big query, (really big, 800+columns - can't really make it smaller) that I'm trying to build in the most efficient way.

    I have some fields like this:

     


    SELECT distinct(td.EVENT_ID),
    td.MENU_HINT,
    td.EVENT_NAME,
    td.EVENT_DT AS Expr1003,
    COUNT(td.EVENT_NAME) AS [Total Runners],
    -- AS [WinnerPrice],
    -- COUNT(td.ID) AS [WinnerCount],
    SUM(CASE WHEN td.BSP >= 1 AND td.BSP < 2 THEN td.BSP END) AS Sum1to2,
    count(CASE WHEN td.BSP >= 1 AND td.BSP < 2 THEN td.BSP END) AS count1to2,
    avg(CASE WHEN td.BSP >= 1 AND td.BSP < 2 THEN td.BSP END) AS average1to2,
    dbo.FN_SumWinnerRunners(td.EVENT_ID,34,55,0,7,12) AS [TEST FUNCTION],
    SUM(CASE WHEN (td.BSP >= 2 AND td.BSP < 3) AND td.WIN_LOSE =1 THEN td.BSP END) AS SumWinnerPrice2to3,
    count(CASE WHEN (td.BSP >= 2 AND td.BSP < 3) AND td.WIN_LOSE =1 THEN td.BSP END) AS CountWinnerPrice2to3
    FROM tblData td
    GROUP BY td.EVENT_ID,
    td.MENU_HINT,
    td.EVENT_NAME,
    td.EVENT_DT
    ORDER BY [Total Runners] DESC;

    And here a graphical one (click to enlarge):

    2020-03-15 17_17_26-Window

    if you notice, I've tried, as I have several queries that are fairly similar, to use a function (dbo.FN_SumWinnerRunners(td.EVENT_ID,34,55,0,7,12)

    the definition for it below:

    CREATE FUNCTION [dbo].[FN_SumWinnerRunners](
    @event_id INT,
    @BSPFrom SMALLINT,
    @BSPTo SMALLINT,
    @Win_Lose SMALLINT,
    @RunnersFrom SMALLINT,
    @RunnersTo SMALLINT)
    RETURNS FLOAT
    AS
    BEGIN
    DECLARE @result FLOAT=
    (
    SELECT SUM(CASE
    WHEN(td.BSP >= @BSPFrom
    AND td.BSP < @BSPTo)
    AND td.WIN_LOSE = @Win_Lose
    THEN td.BSP
    END)
    FROM tblData td
    WHERE td.EVENT_ID = @event_id
    HAVING COUNT(td.EVENT_NAME) > @RunnersFrom
    AND COUNT(td.EVENT_NAME) <= @RunnersTo
    );
    RETURN @result;
    END;
    GO

    the thing is that apparently the function is killing my performance, the column that uses the function take the query from less than a second, to 3 minutes. but at the same time I can't figure out how to do it without one, as I need to do a sum from event_name only in cases in which the count of event name fulfills the query.

    so my question would be what am I breaking from the function that is making it super slow

    and here some execution plans: the one from above is without the function, the one below with the function (click to enlarge):

    2020-03-15 17_17_47-Window

    I thought about doing it with CTE, but the thing is that I need one column for BSP between 1 and 3, one column for BSP between 3 and 5, one column for bsp between 5 and 8 (and up to a 1000) I know this is awful but is the requirement, I also need some other combinations such as other columns like BSP that will have ranges like these

    any help?

  • Sometimes spreadsheets are useful to concatenate together parameterized sql code.

    ="sum(case when (td.BSP >="&B7&" and td.BSP <"&C7&") and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice"&B7&"to"&C7
    ="count(case when (td.BSP >="&B8&" and td.BSP <"&C8&") and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice"&B8&"to"&C8
    sum(case when (td.BSP >=3 and td.BSP <4) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice3to434
    count(case when (td.BSP >=3 and td.BSP <4) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice3to434
    sum(case when (td.BSP >=4 and td.BSP <5) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice4to545
    count(case when (td.BSP >=4 and td.BSP <5) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice4to545
    sum(case when (td.BSP >=5 and td.BSP <6) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice5to656
    count(case when (td.BSP >=5 and td.BSP <6) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice5to656
    sum(case when (td.BSP >=6 and td.BSP <7) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice6to767
    count(case when (td.BSP >=6 and td.BSP <7) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice6to767
    sum(case when (td.BSP >=7 and td.BSP <8) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice7to878
    count(case when (td.BSP >=7 and td.BSP <8) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice7to878
    sum(case when (td.BSP >=8 and td.BSP <9) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice8to989
    count(case when (td.BSP >=8 and td.BSP <9) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice8to989
    sum(case when (td.BSP >=9 and td.BSP <10) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice9to10910
    count(case when (td.BSP >=9 and td.BSP <10) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice9to10910
    sum(case when (td.BSP >=10 and td.BSP <11) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice10to111011
    count(case when (td.BSP >=10 and td.BSP <11) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice10to111011
    sum(case when (td.BSP >=11 and td.BSP <12) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice11to121112
    count(case when (td.BSP >=11 and td.BSP <12) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice11to121112
    sum(case when (td.BSP >=12 and td.BSP <13) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice12to131213
    count(case when (td.BSP >=12 and td.BSP <13) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice12to131213
    sum(case when (td.BSP >=13 and td.BSP <14) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice13to141314
    count(case when (td.BSP >=13 and td.BSP <14) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice13to141314
    sum(case when (td.BSP >=14 and td.BSP <15) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice14to151415
    count(case when (td.BSP >=14 and td.BSP <15) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice14to151415
    sum(case when (td.BSP >=15 and td.BSP <16) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice15to161516
    count(case when (td.BSP >=15 and td.BSP <16) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice15to161516
    sum(case when (td.BSP >=16 and td.BSP <17) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice16to171617
    count(case when (td.BSP >=16 and td.BSP <17) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice16to171617
    sum(case when (td.BSP >=17 and td.BSP <18) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice17to181718
    count(case when (td.BSP >=17 and td.BSP <18) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice17to181718
    sum(case when (td.BSP >=18 and td.BSP <19) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice18to191819
    count(case when (td.BSP >=18 and td.BSP <19) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice18to191819
    sum(case when (td.BSP >=19 and td.BSP <20) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice19to201920
    count(case when (td.BSP >=19 and td.BSP <20) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice19to201920
    sum(case when (td.BSP >=20 and td.BSP <21) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice20to212021
    count(case when (td.BSP >=20 and td.BSP <21) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice20to212021
    sum(case when (td.BSP >=21 and td.BSP <22) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice21to222122
    count(case when (td.BSP >=21 and td.BSP <22) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice21to222122
    sum(case when (td.BSP >=22 and td.BSP <23) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice22to232223
    count(case when (td.BSP >=22 and td.BSP <23) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice22to232223
    sum(case when (td.BSP >=23 and td.BSP <24) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice23to242324
    count(case when (td.BSP >=23 and td.BSP <24) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice23to242324
    sum(case when (td.BSP >=24 and td.BSP <25) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice24to252425
    count(case when (td.BSP >=24 and td.BSP <25) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice24to252425
    sum(case when (td.BSP >=25 and td.BSP <26) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice25to262526
    count(case when (td.BSP >=25 and td.BSP <26) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice25to262526
    sum(case when (td.BSP >=26 and td.BSP <27) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice26to272627
    count(case when (td.BSP >=26 and td.BSP <27) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice26to272627
    sum(case when (td.BSP >=27 and td.BSP <28) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice27to282728
    count(case when (td.BSP >=27 and td.BSP <28) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice27to282728
    sum(case when (td.BSP >=28 and td.BSP <29) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice28to292829
    count(case when (td.BSP >=28 and td.BSP <29) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice28to292829
    sum(case when (td.BSP >=29 and td.BSP <30) and td.WIN_LOSE=1 then td.BSP end) as SumWinnerPrice29to302930
    count(case when (td.BSP >=29 and td.BSP <30) and td.WIN_LOSE=1 then td.BSP end) as CountWinnerPrice29to302930

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Each time you use the function it is doing a query on the table. So if you use the function 10 times it will do 10 queries on the table for each row that's returned!

  • and what do you recommend to do?  I was looking into using a cross apply, but I think it will be the same

     

  • Baldie47 wrote:

    and what do you recommend to do?  I was looking into using a cross apply, but I think it will be the same

    The query isn't appropriate for a computed column because of  what it does.  I'm pretty sure it can't be persisted, either.

    I'm thinking what you need is an Indexed View.  Either that or a good stored procedure to use for reporting purposes.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • can a view receive parameters? no, right? I'm thinking about how would I do it if I need to for bringing all the columns I need, with all the different parameters, to at least try with a view

  • Baldie47 wrote:

    and what do you recommend to do?  I was looking into using a cross apply, but I think it will be the same 

    If you make it a table valued function and use it in a CROSS APPLY, it should be more efficient.

  • You cannot use Table Valued Functions in computed columns in a table, though.

    EDIT:  Sorry... I've obviously confused this with some other post I was working on.  The OP never mentioned computed columns on this thread.

    Jonathan is correct.  If you convert the function to an iTVF (inline table valued function) and use CROSS APPLY to reference it, things may speed up a fair bit.

    You can easily tell if your function is an iTVF or not... If it has the word BEGIN in it, it's n0t and iTVF.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I'm confused by your query. Why have you got distinct in "SELECT distinct(td.EVENT_ID)"?

    You have EVENT_ID in the group by. Distinct applies to the entire line, not just what you put in brackets after it. So it is entirely unnecessary/redundant.

  • yes, I'm now working on setting it up with ITVF.

    I have a small doubt about the query:

    ALTER FUNCTION [dbo].[FN_WinnerRunners](@event_id INT, 
    @BSPFrom SMALLINT,
    @BSPTo SMALLINT,
    @Win_Lose SMALLINT,
    @RunnersFrom SMALLINT,
    @RunnersTo SMALLINT)
    RETURNS TABLE
    AS
    RETURN
    (

    SELECT CASE WHEN (SELECT * FROM dbo.getCountRunners(144705336)) = 10 THEN
    (

    select SUM(CASE
    WHEN tdi.BSP >= 89
    AND tdi.BSP < 1000
    AND tdi.WIN_LOSE = 0
    THEN tdi.BSP
    ELSE 0
    END) AS [sum],
    SUM(CASE
    WHEN tdi.BSP >= 89
    AND tdi.BSP < 1000
    AND tdi.WIN_LOSE = 0
    THEN 1
    ELSE 0
    END) AS [count]
    FROM tblData tdi
    WHERE tdi.EVENT_ID = 144705336
    )
    end

    I',m trying to wrap the "select SUM(CASE

    WHEN tdi.BSP >= 89"  in a sum case when the result of "dbo.getCountRunners(144705336)) = 10" is 10.

     

    dbo.getCountRunners(144705336) is a function to count the amount of event_id that is passed to it (144705336)  in this case is 10.

    (later I'll pass variable to it) is there any way that I can do this and have it on the ITVF? as I need to perform the whole logic only if the count of even_it that is passed is 10 (there is a bit more of logic, but with this example works)

  • I think you will get better results using a CTE than writing any function with a query in it.

    This should be faster, but I think you could get it even faster if you put your mind to it, you could probably get all the results with a single table scan.

    DECLARE @BSPFrom     smallint=34,
    @BSPTo smallint=55,
    @Win_Lose smallint=0,
    @RunnersFrom smallint=7,
    @RunnersTo smallint=12

    ;WITH CTE AS
    (
    SELECT td.EVENT_ID AS EVENT_ID,
    SUM(CASE WHEN(td.BSP >= @BSPFrom AND td.BSP < @BSPTo) AND td.WIN_LOSE = @Win_Lose THEN td.BSP END) [TEST FUNCTION]
    FROM tblData td
    GROUP BY td.EVENT_ID
    HAVING COUNT(td.EVENT_NAME) > @RunnersFrom
    AND COUNT(td.EVENT_NAME) <= @RunnersTo
    )
    SELECT td.EVENT_ID,
    td.MENU_HINT,
    td.EVENT_NAME,
    td.EVENT_DT AS Expr1003,
    COUNT(td.EVENT_NAME) AS [Total Runners],
    -- AS [WinnerPrice],
    -- COUNT(td.ID) AS [WinnerCount],
    SUM(CASE WHEN td.BSP >= 1 AND td.BSP < 2 THEN td.BSP END) AS Sum1to2,
    COUNT(CASE WHEN td.BSP >= 1 AND td.BSP < 2 THEN td.BSP END) AS count1to2,
    AVG(CASE WHEN td.BSP >= 1 AND td.BSP < 2 THEN td.BSP END) AS average1to2,
    --dbo.FN_SumWinnerRunners(td.EVENT_ID, 34, 55, 0, 7, 12) AS [TEST FUNCTION],
    MAX(x.[TEST FUNCTION]) [TEST FUNCTION],
    SUM(CASE WHEN(td.BSP >= 2 AND td.BSP < 3) AND td.WIN_LOSE = 1 THEN td.BSP END) AS SumWinnerPrice2to3,
    COUNT(CASE WHEN(td.BSP >= 2 AND td.BSP < 3) AND td.WIN_LOSE = 1 THEN td.BSP END) AS CountWinnerPrice2to3
    FROM tblData td
    OUTER APPLY(SELECT [TEST FUNCTION] FROM CTE WHERE CTE.EVENT_ID = td.EVENT_ID) x
    GROUP BY td.EVENT_ID, td.MENU_HINT, td.EVENT_NAME, td.EVENT_DT
    ORDER BY [Total Runners] DESC;
  • Baldie47 wrote:

    I need one column for BSP between 1 and 3, one column for BSP between 3 and 5, one column for bsp between 5 and 8 (and up to a 1000) I know this is awful but is the requirement, I also need some other combinations such as other columns like BSP that will have ranges like these...

    Is there a consistent rule for the size of the buckets?

    between 1 and 3 (count 3)

    between 3 and 5 (count 3)

    between 5 and 8 (count 4)

    If the pattern is consistent then the bucket ranges could be calculated as columns in a GROUP BY and then PIVOT (or a WHILE loop) could be applied to the results to create the needed columns.

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • I would think you could summarize the data just once, for all BSP ranges and WIN/LOSS combos, then join to that, if needed, to get the final result.  I can only give you general guideline coed, since there aren't enough details in your posts to go further, but something like this should be close.


    IF OBJECT_ID('tempdb.dbo.#summary') IS NOT NULL
    DROP TABLE #summary

    CREATE TABLE dbo.BSP_ranges (
    BSPFrom smallint NOT NULL,
    BSPTo smallint NOT NULL
    CONSTRAINT BSP_ranges__CK_BSPTo CHECK(BSPTo > BSPFrom),
    Win_Lose smallint NOT NULL
    )
    INSERT INTO dbo.BSP_ranges VALUES
    (1, 3, 0), (3, 5, 0), (5, 8, 0),
    /*...*/
    (34, 55, 0) /*, ...*/
    /* add additional logic to verify that the ranges are contiguous,
    at least I think that's your data requirement */
    INSERT INTO dbo.BSP_ranges
    SELECT BSPFROM, BSPTo, 1 AS Win_Lose
    FROM dbo.BSP_ranges

    SELECT TOP (0)
    td.EVENT_ID, br.BSPFrom, br.BSPTo,
    SUM(td.BSP) AS BSP_sum, COUNT(*) AS runners_count
    INTO #summary
    FROM dbo.tblData td
    INNER JOIN dbo.BSP_ranges br ON
    td.BSP >= br.BSPFrom AND td.BSP < br.BSPTo AND
    td.WIN_LOSE = br.Win_Lose
    WHERE td.EVENT_ID = @EVENT_ID
    GROUP BY td.EVENT_ID, br.BSPFrom, br.BSPTo

    CREATE CLUSTERED INDEX summary__CL ON #summary ( EVENT_ID, BSPFROM, BSPTo ) WITH ( FILLFACTOR = 100 );

    INSERT INTO #summary
    SELECT
    td.EVENT_ID, br.BSPFrom, br.BSPTo,
    SUM(td.BSP) AS BSP_sum, COUNT(*) AS runners_count
    FROM dbo.tblData td
    INNER JOIN dbo.BSP_ranges br ON
    td.BSP >= br.BSPFrom AND td.BSP < br.BSPTo AND
    td.WIN_LOSE = br.Win_Lose
    WHERE td.EVENT_ID = @EVENT_ID
    GROUP BY td.EVENT_ID, br.BSPFrom, br.BSPTo

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply