Can I make this any faster?

  • Here is the function I normally use for splitting delimited strings.

    CREATE FUNCTION dbo.udfStringtoTable
    (
        @String    NVARCHAR(100),
        @Delimiter CHAR(1)
    )
    RETURNS TABLE
    AS
    RETURN (
        SELECT SUBSTRING(@String + @Delimiter, n,
                         CHARINDEX(@Delimiter, @String + @Delimiter, n) - n) AS [String]
        FROM tally
        WHERE n <= LEN(@String)
          AND SUBSTRING(@Delimiter + @String, n, 1) = @Delimiter
    )
    GO

    It uses a tally table; a minimal setup is sketched below if you don't already have one.
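
    A tally table is just a table of sequential integers. The table name tally and column name n are what the function above expects; the row count here is arbitrary:

    CREATE TABLE dbo.tally (n INT NOT NULL PRIMARY KEY CLUSTERED)
    GO

    -- Populate 1 through 10,000; any row source with enough rows will do
    INSERT INTO dbo.tally (n)
    SELECT TOP (10000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns a CROSS JOIN sys.all_columns b
    GO

    -- Example call:
    SELECT [String] FROM dbo.udfStringtoTable(N'ABC,DEF,GHI', ',')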

    Hope this helps

    Thanks

    Chris

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    Posting Best Practices
    Numbers / Tally Tables

    SQL-4-Life
  • One thing worth noting: those costs are estimates, even in the actual execution plan. They can accurately show which parts of the query are most costly, but not always. In the case of multi-statement UDFs, for example, since they have no statistics, they're treated as a single row by the optimizer. The cost of manipulating a single row is pretty low, but in reality there may be more rows, lots more, and that cost is not reflected in the execution plan. A sketch of what I mean follows.
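
    For contrast, here is the split logic from the earlier post rewritten as a hypothetical multi-statement function. The @Result it returns is a table variable, which has no statistics, so the optimizer costs it as one row no matter how many rows it really produces:

    CREATE FUNCTION dbo.udfStringtoTable_Multi
    (
        @String    NVARCHAR(100),
        @Delimiter CHAR(1)
    )
    RETURNS @Result TABLE ([String] NVARCHAR(100))  -- table variable: no stats, estimated at 1 row
    AS
    BEGIN
        INSERT INTO @Result ([String])
        SELECT SUBSTRING(@String + @Delimiter, n,
                         CHARINDEX(@Delimiter, @String + @Delimiter, n) - n)
        FROM tally
        WHERE n <= LEN(@String)
          AND SUBSTRING(@Delimiter + @String, n, 1) = @Delimiter
        RETURN
    END
    GO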

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Hi Chris,

    Thanks for your code. Where does this tally table come into play? My variables for the delimiters come from a drop-down box in Reporting Services.

    Thanks

  • Hi,

    The function I am using does something like this:

    http://www.sqlservercentral.com/scripts/Miscellaneous/31913/

    Actually, it was copied from this site. How can I put an index against this table to prevent the table scans?

    Chris

  • The table scan itself may not be bad if the table is small, but these aren't really table scans; they're multiple executions of that function. It's hard to say whether the function is quick or slow, but it would be good to test. Typically, when we're doing something like what you're doing, we execute the function once at the beginning, put the output into a #temp table, and use that in the joins within the main part of the query, as sketched below. If the #temp table is large, you might index it for further performance. It all depends on how many reads are coming against it.
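
    Something like this, schematically, using Chris's splitter from above as the example function (the #Companies name, the column alias, and the @Company parameter are all illustrative):

    SELECT [String] AS company_code
    INTO #Companies
    FROM dbo.udfStringtoTable(@Company, ',')

    -- Index it if the row count warrants:
    CREATE CLUSTERED INDEX IX_Companies ON #Companies (company_code)

    -- ...then join #Companies in the main query instead of calling the function inline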

    This may not be the problem but I would like to see if that helps.

    David

    @SQLTentmaker

    “He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • I just got done doing one like this a couple of weeks ago. The problem is the aggregation... you're trying to aggregate in the presence of way too many joins. You need to preaggregate the data from the Temp_Glbbm table and include the columns used in the joins in both the select list and the GROUP BY. Be sure to include the "filters" for that table that you currently have in the WHERE clause. The target of your preaggregation should be a temp table, something like the sketch below. Divide'n'Conquer.
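
    Schematically (gl_company_code and @mth are from the code you posted; the other column names are made up):

    SELECT   glbbm.gl_company_code,              -- join column: in the select list AND the GROUP BY
             glbbm.gl_account_code,              -- join column: in the select list AND the GROUP BY
             SUM(glbbm.amount) AS total_amount   -- the aggregate
    INTO     #Glbbm_Agg
    FROM     Temp_Glbbm glbbm
    WHERE    glbbm.fiscal_period = @mth          -- keep the filters you already have for this table
    GROUP BY glbbm.gl_company_code, glbbm.gl_account_code

    -- ...then join #Glbbm_Agg into the main query in place of Temp_Glbbm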

    The query I changed at work using such a method brought the query down from 30+ minutes to several seconds. And, no... preaggregating in a derived table will not accomplish the same thing although it will make an improvement.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
        Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Do you have statistics on the indexes you defined? Are the statistics up to date?
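
    For example (using the Temp_Glbbm table name from an earlier post):

    -- When were the statistics last updated?
    SELECT name, STATS_DATE(object_id, stats_id) AS last_updated
    FROM sys.stats
    WHERE object_id = OBJECT_ID('dbo.Temp_Glbbm')

    -- Refresh them:
    UPDATE STATISTICS dbo.Temp_Glbbm WITH FULLSCAN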

  • Ya gotta trust me on this... perfectly up to date stats and indexes just aren't gonna help this one. Try the pre-aggregation I spoke of... you'll be amazed.

    --Jeff Moden



  • Jeff Moden (3/10/2009)


    Ya gotta trust me on this... perfectly up to date stats and indexes just aren't gonna help this one. Try the pre-aggregation I spoke of... you'll be amazed.

    I am totally intrigued by this, Jeff. I had, sadly, never considered that before, and I have some queries that could potentially benefit from it. I'll have to try this, and soon. 🙂

    Thanks!

    David


  • Just remember... you must include the columns you intend to join on in the GROUP BY. If you return a heck of a lot of rows in the temp table, add the appropriate index... try clustered and non-clustered, depending on what you're trying to join on. For example:
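
    Carrying on with the illustrative names from the earlier sketch:

    CREATE CLUSTERED INDEX IX_Glbbm_Agg
        ON #Glbbm_Agg (gl_company_code, gl_account_code)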

    --Jeff Moden



  • Hi,

    Correct me if I am wrong here, but what you are suggesting is that I aggregate that temp table I have and dump it into a #temp table before joining it onto the other tables, is that correct? That means I create the aggregated table in my query and then join it to the other tables to get my real query?

    Thanks

  • From a programmer's point of view, a superficial look at your code suggests that you're asking SQL Server to do lots of function calls for each of your million rows.

    And the function calls use the same values each time. I don't know how clever SQL Server is at optimising such things but it will do no harm to offer it some help.

    So, at the beginning:

    DECLARE @MonthEnd datetime  -- or a more meaningful name
    SET @MonthEnd = DATEADD(yy, -1, DATEADD(mm, DATEDIFF(m, 0, @mth) + 1, -1))

    That works out to the last day of @mth's month, one year earlier. Then, obviously, use that variable in your comparisons.

    By the way, m and mm do mean the same thing for DATEADD and DATEPART, but it's much better to be consistent, in a single statement at least.

    Next, replace lines such as

    AND glbbm.gl_company_code IN (SELECT * FROM fn_rs_string_to_varchar_table(@Company, ','))

    with

    INNER JOIN fn_rs_string_to_varchar_table(@Company, ',') Companies ON glbbm.gl_company_code = Companies.xxxx

    where xxxx is the name of the column returned by your function.

  • Have you used the Database Tuning Advisor?

  • Hi dmw,

    Using an inner join doesn't always give you better performance!

    Sometimes using an IN or EXISTS in the WHERE clause causes SQL Server to use a semi-join, which performs much better than an INNER JOIN. Compare the two forms below.

    This needs to be tested, of course...
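
    For example (table and column names as in the earlier posts; xxxx stands for whatever column the function returns):

    -- Semi-join form: EXISTS in the WHERE clause
    SELECT glbbm.gl_company_code
    FROM Temp_Glbbm glbbm
    WHERE EXISTS (SELECT 1
                  FROM fn_rs_string_to_varchar_table(@Company, ',') c
                  WHERE c.xxxx = glbbm.gl_company_code)

    -- Join form, as suggested above
    SELECT glbbm.gl_company_code
    FROM Temp_Glbbm glbbm
    INNER JOIN fn_rs_string_to_varchar_table(@Company, ',') c
            ON glbbm.gl_company_code = c.xxxx

    Note that the join form can return duplicate rows if the function returns duplicate values; the EXISTS form cannot.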

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley


    SQL-4-Life
  • My gut feel is that the function is, if not the root of the problem, obscuring the root of the problem.

    Can we see the code for the function? We may be able to suggest an alternative. The tables used by multi-statement user defined functions are table variables. As such, they cannot be indexed and they don't have statistics. That lack of stats is throwing all the cost estimations way, way off.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
