Further optimizing a production query

  • Hi guys, I have been optimizing queries for quite a bit now, and I thought i had most answers, but I'm caught there:

    The development team asked for a query to fill their drop down list, and I cannot happen to lower the query time below 1500ms. The users complain it's too slow to load the screen because this query is slow.

    But it is understandable because there are only 800 product categories, yes, but they want only the categories that have products, and when I add a distinct operator, it slows it down.

    I'll post my query, and my plan, but I cannot post sample data, since the query searches a few tables.

    USE GICSPF

    GO

    DBCC dropcleanbuffers

    DBCC FREEPROCCACHE

    GO

    DECLARE @date DATETIME

    SELECT @date = GETDATE()

    SELECT DISTINCT

    vd.CategoryID,

    vd.CategoryName,

    vm.MfrID

    FROM GICSPF.dbo.GICSPFVendorCategoryHeader vh

    INNER JOIN GICSPF.dbo.GICSPFVendorCategoryDetail vd ON vh.CompanyID = vd.CompanyID

    AND vh.DivisionID = vd.DivisionID

    AND vh.DepartmentID = vd.DepartmentID

    AND vh.VendorCategoryID = vd.VendorCategoryID

    INNER JOIN GICSPF.dbo.GICSPFproduct p ON vd.CompanyID = p.CompanyID

    AND vd.DivisionID = p.DivisionID

    AND vd.DepartmentID = p.DepartmentID

    AND vd.CategoryID = p.VndCategory

    INNER JOIN GICSPF.dbo.GICSPFVendorToMfr vm ON p.CompanyID = vm.CompanyID

    AND p.DivisionID = vm.DivisionID

    AND p.DepartmentID = vm.DepartmentID

    AND p.MfrVendorID = vm.MfrVendorID

    WHERE vh.VendorID = 14

    SELECT DATEDIFF(ms, @date, GETDATE())

    There are about 800 concerned categories (for this vendor)

    There are about 35 000 products (for this vendor).

    So it needs to match every of the 35K products to see if the categories is used within these products, and then further match the product Manufacturer (which is unique by Category).

    Looking at the query plan, I don't see anything to optimize, all there is are nested loops (which I think are the faster) and index seek (my indexes work great).

    What can we do to further optimize that query?

    Thanks in advance,

    J-F

    Cheers,

    J-F

  • I would alter the distinct to a group by on all columns.

    Group by may take advantage of existing indexes.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Hi,

    Thanks for the reply, but it has the same exact execution time, and it used the same query plan (still says it selects distinct records). I think the query gets rewritten by the compiler and says it's a distinct keyword.

    Anything else I can try?

    thanks,

    J-F

    Cheers,

    J-F

  • Not really sure, but do you have indexes like these on the table in the query? The names used for the indexes are mine.

    create nonclustered index IX_vgh_ProductsQry1

    on dbo.GICSPFVendorCategoryHeader (

    CompanyID,

    DivisionID,

    DepartmentID,

    VendorCategoryID

    ) on [default];

    create nonclustered index IX_vcd_ProductsQry1

    on dbo.GICSPFVendorCategoryDetail (

    CompanyID,

    DivisionID,

    DepartmentID,

    VendorCategoryID

    ) include (

    CategoryID,

    CategoryName

    )on [default];

    create nonclustered index IX_p_ProductsQry1

    on dbo.GICSPFproduct (

    CompanyID,

    DivisionID,

    DepartmentID,

    VndCategory

    ) on [default];

    create nonclustered index IX_vtm_ProductsQry1

    on dbo.GICSPFVendorToMfr (

    CompanyID,

    DivisionID,

    DepartmentID,

    MfrVendorID

    ) include (

    MfrID

    ) on [default];

    I don't know if these would help or not, but the indexes have all the columns defined that are used in the query. All I can say at this point is it is worth a try.

  • Try this

    SELECTvd.CategoryID,

    vd.CategoryName,

    vm.MfrID

    FROM(

    SELECTCompanyID,

    DivisionID,

    DepartmentID,

    VendorCategoryID

    FROMGICSPF.dbo.GICSPFVendorCategoryHeader

    WHEREVendorID = 14

    ) AS vh

    INNER JOIN(

    SELECTCompanyID,

    DivisionID,

    DepartmentID,

    VendorCategoryID,

    CategoryID,

    CategoryName

    FROMGICSPF.dbo.GICSPFVendorCategoryDetail

    ) AS vd ON vd.CompanyID = vh.CompanyID

    AND vd.DivisionID = vh.DivisionID

    AND vd.DepartmentID = vh.DepartmentID

    AND vd.VendorCategoryID = vh.VendorCategoryID

    INNER JOIN(

    SELECTCompanyID,

    DivisionID,

    DepartmentID,

    VndCategory,

    MfrVendorID

    FROMGICSPF.dbo.GICSPFproduct

    ) AS p ON p.CompanyID = vd.CompanyID

    AND p.DivisionID = vd.DivisionID

    AND p.DepartmentID = vd.DepartmentID

    AND p.VndCategory = vd.CategoryID

    INNER JOIN(

    SELECTCompanyID,

    DivisionID,

    DepartmentID,

    MfrVendorID

    FROMGICSPF.dbo.GICSPFVendorToMfr

    ) AS vm ON vm.CompanyID = p.CompanyID

    AND vm.DivisionID = p.DivisionID

    AND vm.DepartmentID = p.DepartmentID

    AND vm.MfrVendorID = p.MfrVendorID

    GROPU BYvd.CategoryID,

    vd.CategoryName,

    vm.MfrID

    OPTION(FORCE ORDER)

    To make this work as expected, you should rearrange the derived tables in the order of amount of record in the table with least amount in the FROM part, and the most amount as last derived table.


    N 56°04'39.16"
    E 12°55'05.25"

  • Hey, thanks for your input all of you,

    The index you proposed are about the same as mine, I already got some covering indexes there. But thanks!

    For the "Force Order", well it helped, the thing I hate is, I do not know why it went from about 1.5secs to about just under a second, 0.9 secs, approximately. It sure helps. I did not know it could really help.

    I guess this is the further we can go, it still will not please the development team, but I'll try to push it even further another time. Anyone has a concrete explanation of the "option (force order)" keyword?

    Thanks in advance,

    J-F

    Cheers,

    J-F

  • btw

    did you check the plan using a @variable in stead of the hard coded

    WHERE vh.VendorID = 14

    Maybe vendorid 14 just generates a less optimal (non representative) response time.

    declare @theId integer

    set @TheId = 14

    Select ....

    ...

    WHERE vh.VendorID = @TheId

    :Whistling:

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Hey, I had already declared the variable that way, it does not help the query, thanks for the try!

    J-F

    Cheers,

    J-F

  • >>but they want only the categories that have products,

    Doesn't that indicate the need for an EXISTS clause, not a direct join?? Exists would be orders of magnitude faster I would think.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • The estimated stats are not representative of the actuals ... the topmost nested loop join estimated 1 row but actually returned 36773. Outdated stats or parameter sniffing are the most likely culprits.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply