Performance Tuning Question

  • In trying to get a better grip on performance tuning I have a following example which I was wondering if people would comment on.

    The code is as follows:

    Select 'All' as OpID , 'All Ops' as 'Op Name' , 1

    union all

    Select Distinct LEFT(Oprid,4), Name + ' (' + LEFT(Oprid,4) + ')', 2

    from Dynamicsv5Realtime.dbo.RouteOprTable

    where dataareaid ='AJB'

    and Oprid between '3000' and '4000'

    and LEFT (NAME,3) <> 'DNU'

    Order by 3,2

    I have also attached the plan, and the index is below.

    USE [DynamicsV5Realtime]

    GO

    /****** Object: Index [I_354OPRIDX] Script Date: 03/08/2013 10:41:13 ******/

    ALTER TABLE [dbo].[ROUTEOPRTABLE] ADD CONSTRAINT [I_354OPRIDX] PRIMARY KEY CLUSTERED

    (

    [DATAAREAID] ASC,

    [OPRID] ASC

    )WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]

    Is there anything that can be done to improve the query. What about the sorts and merge joins, can they be improved upon? How does the DISTINCT and LEFT function impact things, and should the index have an include for NAME?

    Thanks for any tips

  • It's not a bad idea to post the table definition, so it's possible to see what datatypes you select against and also how many rows you have in total and how it's distributed.

    The queryplan didn't look too bad in my eyes, but i guess it depends on how many rows you work with in actual real life.

    You use BETWEEN with strings, which might not yield what you actually want to achive, depending on how your data look. Also perhaps you don't need DISTINCT, but that's also depending on how your data looks 🙂

  • When you say how the rows are distributed what do you mean? I usually check in plans for estimated rows vs actual rows and if they are not close then look to update the statistics for that table.

  • To answer your questions:

    1. Merge Join - OK. For more info, check out this great article[/url] by Gail Shaw

    2. The distinct forces SQL to go through all your results, then omitting the duplicate records, using a CTE in this situation might help with performance on this and will accomplish the same thing.

    3. The LEFT function in your where clause causes a non-sargable predicate, typically increasing the odds of an index scan rather than a seek (or may even choose a different index altogether)

    4. Yes, you should include NAME in a non-clustered index (see Below)CREATE NONCLUSTERED INDEX [idx_DATAAREAID] ON dbo.RouteOprTable

    (

    [DATAAREAID] ASC,

    [OPRID] ASC

    ) INCLUDE (Name) WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY] ;

    Try something like this (Disclaimer: may contain syntax errors :-D);WITH DATA AS (

    SELECT

    'All' [OpID],

    'All Ops' [Op Name],

    1 [Number],

    1 AS Cnt

    UNION ALL

    SELECT

    LEFT(Oprid, 4) [OpID],

    Name + ' (' + LEFT(Oprid, 4) + ')' [Op Name],

    2 [Number],

    ROW_NUMBER() OVER (PARTITION BY [OpID], [Op Name] ORDER BY [OpID], [Op Name]) AS Cnt

    FROM Dynamicsv5Realtime.dbo.RouteOprTable

    WHERE

    dataareaid = 'AJB'

    AND Oprid BETWEEN '3000' AND '4000'

    AND [OpName] LIKE 'DNU%'

    ORDER BY

    3, 2

    )

    SELECT OpID, [Op Name], [Number]

    FROM DATA

    WHERE Cnt = 1

    1. Biggest hitter in your plan is the DISTINCT SORT, you can get rid of this by using a CTE

    2. Clustered Index Seek is good, but you should create a non-clustered index on all columns needed to satisfy the query (see above)

    3. Change (AND LEFT(NAME, 3) <> 'DNU') to (AND [OpName] LIKE 'DNU%') or consider moving it out of the main select and use it in the CTE, this will allow for better odds getting a seek in the main query

    4. What data type is Oprid? Integer or varchar?

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • For example if you have something like:

    SELECT x, y FROM T WHERE A = 5

    and index on column A, it might look like you're doing fine.

    But if table has million rows where column A is 5, this index might not do you that much good.

    But at least according to your QP, you had less than a couple of hundreds of rows, so it should be that bad?

  • Hi MyDoggie thanks for that some food for thought. To answer your question OPRID is nvarchar(10)

  • Okay. FYI the reason I asked about the data type for that column is because if it were a numeric data type and you placed quotes around it (effectively making it a string) it would cause IMPLICIT CONVERSIONS...which, if you have a lot of data in a table, will result in performance issues - best of luck!

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • MyDoggieJessie (3/8/2013)


    Try something like this (Disclaimer: may contain syntax errors :-D);WITH DATA AS (

    SELECT

    'All' [OpID],

    'All Ops' [Op Name],

    1 [Number],

    1 AS Cnt

    UNION ALL

    SELECT

    LEFT(Oprid, 4) [OpID],

    Name + ' (' + LEFT(Oprid, 4) + ')' [Op Name],

    2 [Number],

    ROW_NUMBER() OVER (PARTITION BY [OpID], [Op Name] ORDER BY [OpID], [Op Name]) AS Cnt

    FROM Dynamicsv5Realtime.dbo.RouteOprTable

    WHERE

    dataareaid = 'AJB'

    AND Oprid BETWEEN '3000' AND '4000'

    AND [OpName] LIKE 'DNU%'

    ORDER BY

    3, 2

    )

    SELECT OpID, [Op Name], [Number]

    FROM DATA

    WHERE Cnt = 1

    Original was LEFT(OPName,3) != 'DNU'. The equivalent like should be OPName NOT LIKE 'DNU%'

    1. Biggest hitter in your plan is the DISTINCT SORT, you can get rid of this by using a CTE

    Except you've replaced the sort from the distinct with a sort from the windowing function. Better? Can't say without testing. Maybe. Maybe not.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks for the clarification Gail, I missed the != when I was editing it 🙂

    Regarding the DISTINCT, yep, always test. As we all know what works great for a table with a few thousand rows isn't always the best choice for those tables with millions!

    Thanks for going easy on me today!

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply