Extra join is killing query

  • Hi Guys

    I have a query that takes around 1 second to run until I join another table then it takes around 6 minutes.

    There is nothing unusual about the table I am joining, it joins without drama in other queries and I have indexed the buggery out of it. The query is below with the problem join to the OwnershipRoleDiscipline table commented out at the end. I have the query plans but cannot see how to attach them to a post (I am new here). One thing I notice from looking at them is that the Key Lookup for the ClientProductProfile table uses 25% of resources for the slow query and 3% for the fast query, but it is not actually even involved in the problem join. Somehow the join seems to have thrown the whole execution plan haywire and I do not know how to fix it. Any ideas would be appreciated.

    Regards

    David

    SELECT cpa.ClientID, p2.Description AS Product, ppc2.ProfileColour

    FROM profile.ProfileValueColour ppc

    JOIN Profile.ClientProductProfile cpa

    ON cpa.ProfileValue = ppc.Actual

    AND cpa.ProfileTypeID = 13

    AND ppc.ProductID = cpa.ProductID

    JOIN Profile.ClientProductProfile cpp

    ON cpp.ProductID = cpa.ProductId

    AND cpp.ProfileValue = ppc.Potential

    AND cpp.ProductID = ppc.ProductID

    AND cpa.clientid = cpp.ClientID

    AND cpp.ProfileTypeID = 14

    JOIN dbo.Product p

    ON p.ProductID = ppc.ProductID

    join Profile.ProfileProductRule ppr

    on ppc.ownershiproleid = ppr.ownershiproleid

    and ppc.productid = ppr.productid

    AND ppc.ForDisplay = 1

    AND ppc.ownershiproleid = 1

    JOIN Profile.ClientProductProfile cpa2

    ON cpa.clientid = cpa2.ClientID

    AND cpa2.ProfileTypeID = 13

    JOIN Profile.ClientProductProfile cpp2

    ON cpa2.clientid = cpp2.ClientID

    AND cpp2.ProfileTypeID = 14

    JOIN profile.ProfileValueColour ppc2

    ON ppc2.ProductID = cpa2.ProductID

    AND ppc2.ProductID = cpp2.ProductID

    AND cpp2.ProfileValue = ppc2.Potential

    AND cpa2.ProfileValue = ppc2.Actual

    AND cpp2.ProductID = ppc2.ProductID

    AND ppc2.OwnershipRoleID = 1

    JOIN product p2

    ON ppc2.ProductID = p2.ProductID

    JOIN dbo.OwnershipPosition op

    ON op.OwnershipRoleID = ppc.OwnershipRoleID

    JOIN dbo.CompanyPosition cp

    ON cp.CompanyPositionID = op.CompanyPositionID

    AND op.OwnershipPositionID = 92

    JOIN dbo.OwnershipRole owr

    ON owr.OwnershipRoleID = op.OwnershipRoleID

    JOIN dbo.Geography g1

    ON g1.GeographyID = cp.GeographyID

    AND g1.GeographyTypeID = 6

    JOIN dbo.GeographyHierarchy gh1

    ON g1.GeographyID = gh1.ParentGeographyID

    AND gh1.GeographyHierarchyTypeID = owr.GeographyHierarchyTypeID

    JOIN dbo.Geography g2

    ON g2.GeographyID = gh1.ChildGeographyID

    AND g2.GeographyTypeID = 7

    JOIN dbo.GeographyHierarchy gh2

    ON g2.GeographyID = gh2.ParentGeographyID

    AND gh2.GeographyHierarchyTypeID = owr.GeographyHierarchyTypeID

    JOIN dbo.Geography g3

    ON g3.GeographyID = gh2.ChildGeographyID

    AND g3.GeographyTypeID = 8

    JOIN dbo.Address a

    ON a.GeographyBrickReferenceID = g3.ReferenceID

    JOIN client c

    ON c.PrimaryAddressId = a.AddressID

    AND a.AddressStatusID > 0

    AND c.ClientStatusID > 0

    AND cpa.clientid = c.ClientID

    JOIN dbo.Person per

    ON per.PersonID = c.ClientID

    AND per.StatusID > 0

    --JOIN dbo.OwnershipRoleDiscipline ord

    --ON ord.DisciplineID = c.DisciplineID

    --AND ord.OwnershipRoleID = owr.OwnershipRoleID

    --AND ord.SRAGeographyID = g2.GeographyID

  • You're going to have to learn how to add attachments so that you can add the query plans (good and bad).

    When you are creating a post, look at the bottom right hand side of your screen for a button that says 'Edit Attachments' – once you find that, the rest should be easy πŸ™‚

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • probably a join algorithim issue might be fixed using a query hint. but without looking at the execution plan its impossible to say. also how big is the dbo.OwnershipRoleDiscipline does it have stats for all the columns used in the query ?

    Jayanth Kurup[/url]

  • The original query appears to be "For each client, get all the products and all their colours". The other tables in the query could filter the output by geography but lets start as simple as possible. Retaining only those tables which are required for the output columns and joins thereof, you get this:

    SELECT

    cpa.ClientID,

    p2.Description AS Product,

    ppc2.ProfileColour

    FROM Profile.ClientProductProfile cpa -- 1

    INNER JOIN Profile.ClientProductProfile cpa2 -- 2

    ON cpa2.ClientID = cpa.clientid

    AND cpa2.ProfileTypeID = 13

    INNER JOIN Profile.ClientProductProfile cpp2 -- 2

    ON cpp2.ClientID = cpa.clientid

    AND cpp2.ProfileTypeID = 14

    INNER JOIN profile.ProfileValueColour ppc -- 2

    ON ppc.Actual = cpa.ProfileValue

    AND ppc.ProductID = cpa.ProductID

    INNER JOIN profile.ProfileValueColour ppc2 -- 1

    ON ppc2.ProductID = cpa2.ProductID

    AND ppc2.Actual = cpa2.ProfileValue

    AND ppc2.ProductID = cpp2.ProductID

    AND ppc2.Potential = cpp2.ProfileValue

    AND ppc2.OwnershipRoleID = 1

    INNER JOIN dbo.product p2 -- 1

    ON p2.ProductID = ppc2.ProductID

    WHERE cpa.ProfileTypeID = 13

    which looks to me to be hopelessly complicated for such a simple task. I'd recommend you revisit the ERD and confirm that what you've got is correct before adding further tables to the mix.

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • I would add that you are mixing join conditions with criteria. When joining tables, you almost always want to use only (implicit) foreign keys, because the joins should reflect how the tables are related. This makes it much easier for other people to understand your queries, because it separates out the relationships from the criteria. This is the whole reason that the old style joins were deprecated, but you've gone to the opposite extreme.

    This can be relaxed for OUTER JOINS when you want a filter to be applied before the join, but you are only using INNER JOINs here.

    There is no way that cpa.ProfileTypeID = 13 constitutes part of a foreign key.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Here is what would help, sample data for the tables that is representative of the problem domain and the expected results of the query based on the sample data.

  • Hi Guys

    thanks for the replies.

    @Phil - found it. Fast and Slow query plans are now attached, hopefully someone smarter than me can discern something useful from them.

    @jayanth - I will research query hints, it sounds promising. The OwnershipRoleDiscipline table had 3500 records and I have created all stats and indexes suggested by the index tuning wizard (as well as several more out of desperation)

    @chrism-2 - Yes, it does look complicated and there may be a couple of unecessary joins, but it works fine without the addition of the OwnershipRoleDiscipline table. I should also point out that the actual query is a dynamic one, this query is the output of a section of it that after much pain I have narrowed down to be the problem.

    @drew - Sorry, not sure I follow. I am using the inner joins as the filter. The full query is actually dynamic and this is the output of part of that.

    @Lynne - Yes, it would, but I was hoping not to unless completely desperate as there are many tables here and some are very large. The performance issue may not be apparent for a sample 50 record table, but appear for the full 100000 record table, so I was hoping the (now attached) query plans would suffice.

    Thanks again for the feedback everyone, I really appreciate it.

  • DavidDroog (7/22/2015)


    Hi Guys

    thanks for the replies.

    @Phil - found it. Fast and Slow query plans are now attached, hopefully someone smarter than me can discern something useful from them.

    @jayanth - I will research query hints, it sounds promising. The OwnershipRoleDiscipline table had 3500 records and I have created all stats and indexes suggested by the index tuning wizard (as well as several more out of desperation)

    @chrism-2 - Yes, it does look complicated and there may be a couple of unecessary joins, but it works fine without the addition of the OwnershipRoleDiscipline table. I should also point out that the actual query is a dynamic one, this query is the output of a section of it that after much pain I have narrowed down to be the problem.

    @drew - Sorry, not sure I follow. I am using the inner joins as the filter. The full query is actually dynamic and this is the output of part of that.

    @Lynne - Yes, it would, but I was hoping not to unless completely desperate as there are many tables here and some are very large. The performance issue may not be apparent for a sample 50 record table, but appear for the full 100000 record table, so I was hoping the (now attached) query plans would suffice.

    Thanks again for the feedback everyone, I really appreciate it.

    I disagree on not having sample data and expected results. It would help with looking at your code and determining if there are alternative ways the query can be written that may provide a more robust and scalable than the your current code.

  • DavidDroog (7/22/2015)


    Hi Guys

    thanks for the replies.

    @drew - Sorry, not sure I follow. I am using the inner joins as the filter. The full query is actually dynamic and this is the output of part of that.

    JOIN dbo.address a

    ON a.geographybrickreferenceid = g3.referenceid

    JOIN client c

    ON c.primaryaddressid = a.addressid

    AND a.addressstatusid > 0

    AND c.clientstatusid > 0

    AND cpa.clientid = c.clientid Why did you put addressstatusid and clientstatusid conditions in the ON clause? They supposed to be in WHERE:

    JOIN dbo.address a

    ON a.geographybrickreferenceid = g3.referenceid

    JOIN client c

    ON c.primaryaddressid = a.addressid

    AND cpa.clientid = c.clientid

    WHERE a.addressstatusid > 0

    AND c.clientstatusid > 0

    DavidDroog (7/22/2015)


    @ChrisM - Yes, it does look complicated and there may be a couple of unecessary joins, but it works fine without the addition of the OwnershipRoleDiscipline table. I should also point out that the actual query is a dynamic one, this query is the output of a section of it that after much pain I have narrowed down to be the problem.

    Your OwnershipRoleDiscipline table is joined to 3 tables which is not very good handled by SQL Server. Because it decided to use this table to eliminate records on early stages, and simply makes a cardinality mistake. Even if you patch the query it may work for a while but then will break somewhere else. So I suggest you to rewrite the whole thing and eliminate unnecessary joins.

    Also instead of INNER JOIN you can try to do a LEFT JOIN with dbo.OwnershipRoleDiscipline. See if it helps.

    And sorry, but "my code is dynamic" is not an excuse for writing bad and not readable code.


    Alex Suprun

  • Hi Alex

    thanks for the reply.

    Are you saying SQL cannot handle 3 way joins? Are you also saying that a where clause is more efficient than a a join on clause?

    Sorry but I have never had any issues with either of these things.

    My inner join to OwnershipRoleDiscipline is required because that is how I am filtering the results, a left outer join defeats the purpose of the join.

    How is my code bad or unreadable?

    Regards

    David

  • DavidDroog (7/22/2015)


    Hi Alex

    thanks for the reply.

    Are you saying SQL cannot handle 3 way joins? Are you also saying that a where clause is more efficient than a a join on clause?

    Sorry but I have never had any issues with either of these things.

    My inner join to OwnershipRoleDiscipline is required because that is how I am filtering the results, a left outer join defeats the purpose of the join.

    How is my code bad or unreadable?

    Regards

    David

    Provide us with sample data for the tables (some should meet the join/where criteria and some should not), expected results based on the sample data.

    There may be a better way to write the query but without sample data and expected results we have no way to test any code we may write.

  • Hi Guys

    I have figured this out. My theory is that adding the extra table somehow screwed up SQL Servers execution plan so badly no amount of indexing etc was going to help, I just had to force it to reshuffle its execution plan ( I have actually seen this sort of thing before, but not to this degree). To to this I have utilised a temp table and then joined the problematic OwnershipRoleDiscipline table to the temp table. It is completely non-sensical, "bad" and "unreadable", but the query is down to under a second now.

    Lynn - sorry, but the logistics of scripting out all those tables and proving sets of usable, non sensitive data was more than I had time for.

    SELECT cpa.ClientID, p2.Description AS Product, ppc2.ProfileColour, c.DisciplineID, owr.OwnershipRoleID, g2.GeographyID INTO #tmp

    FROM profile.ProfileValueColour ppc

    JOIN Profile.ClientProductProfile cpa

    ON cpa.ProfileValue = ppc.Actual

    AND cpa.ProfileTypeID = 13

    AND ppc.ProductID = cpa.ProductID

    JOIN Profile.ClientProductProfile cpp

    ON cpp.ProductID = cpa.ProductId

    AND cpp.ProfileValue = ppc.Potential

    AND cpp.ProductID = ppc.ProductID

    AND cpa.clientid = cpp.ClientID

    AND cpp.ProfileTypeID = 14

    JOIN dbo.Product p

    ON p.ProductID = ppc.ProductID

    join Profile.ProfileProductRule ppr

    on ppc.ownershiproleid = ppr.ownershiproleid

    and ppc.productid = ppr.productid

    AND ppc.ForDisplay = 1

    AND ppc.ownershiproleid = 1

    JOIN Profile.ClientProductProfile cpa2

    ON cpa.clientid = cpa2.ClientID

    AND cpa2.ProfileTypeID = 13

    JOIN Profile.ClientProductProfile cpp2

    ON cpa2.clientid = cpp2.ClientID

    AND cpp2.ProfileTypeID = 14

    JOIN profile.ProfileValueColour ppc2

    ON ppc2.ProductID = cpa2.ProductID

    AND ppc2.ProductID = cpp2.ProductID

    AND cpp2.ProfileValue = ppc2.Potential

    AND cpa2.ProfileValue = ppc2.Actual

    AND cpp2.ProductID = ppc2.ProductID

    AND ppc2.OwnershipRoleID = 1

    JOIN product p2

    ON ppc2.ProductID = p2.ProductID

    JOIN dbo.OwnershipPosition op

    ON op.OwnershipRoleID = ppc.OwnershipRoleID

    JOIN dbo.CompanyPosition cp

    ON cp.CompanyPositionID = op.CompanyPositionID

    AND op.OwnershipPositionID = 92

    JOIN dbo.OwnershipRole owr

    ON owr.OwnershipRoleID = op.OwnershipRoleID

    JOIN dbo.Geography g1

    ON g1.GeographyID = cp.GeographyID

    AND g1.GeographyTypeID = 6

    JOIN dbo.GeographyHierarchy gh1

    ON g1.GeographyID = gh1.ParentGeographyID

    AND gh1.GeographyHierarchyTypeID = owr.GeographyHierarchyTypeID

    JOIN dbo.Geography g2

    ON g2.GeographyID = gh1.ChildGeographyID

    AND g2.GeographyTypeID = 7

    JOIN dbo.GeographyHierarchy gh2

    ON g2.GeographyID = gh2.ParentGeographyID

    AND gh2.GeographyHierarchyTypeID = owr.GeographyHierarchyTypeID

    JOIN dbo.Geography g3

    ON g3.GeographyID = gh2.ChildGeographyID

    AND g3.GeographyTypeID = 8

    JOIN dbo.Address a

    ON a.GeographyBrickReferenceID = g3.ReferenceID

    JOIN client c

    ON c.PrimaryAddressId = a.AddressID

    AND a.AddressStatusID > 0

    AND c.ClientStatusID > 0

    AND cpa.clientid = c.ClientID

    JOIN dbo.Person per

    ON per.PersonID = c.ClientID

    AND per.StatusID > 0

    SELECT ClientID, Product, ProfileColour

    FROM #tmp a

    JOIN dbo.OwnershipRoleDiscipline ord

    ON ord.DisciplineID = a.DisciplineID

    AND ord.OwnershipRoleID = a.OwnershipRoleID

    AND ord.SRAGeographyID = a.GeographyID

  • DavidDroog (7/22/2015)


    Are you also saying that a where clause is more efficient than a a join on clause?

    As far as SQL is concerned, ON clauses and WHERE clauses are equivalent for INNER JOINs. Humans, on the other hand, have problems when you jam everything into one place. The old style joins put everything in the WHERE clauses. You have gone the opposite extreme and placed everything in ON clauses. Both extremes are equally hard for humans to understand--and therefore update and maintain, because they obscure the difference between table relationships and criteria.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Hi Drew

    I am a human and I disagree. I find it much easier to follow when the clauses are with the table joins.

    To each their own...

    Regards

    David

  • DavidDroog (7/22/2015)


    Hi Drew

    I am a human and I disagree. I find it much easier to follow when the clauses are with the table joins.

    To each their own...

    Regards

    David

    Seriously? Look at your code again. How on earth did you put this condition for address table (a.addressstatusid > 0) after JOIN with the client table? What's the logic?

    JOIN dbo.address a

    ON a.geographybrickreferenceid = g3.referenceid

    JOIN client c

    ON c.primaryaddressid = a.addressid

    AND a.addressstatusid > 0

    AND c.clientstatusid > 0

    AND cpa.clientid = c.clientid

    You find it much easier to follow simply because you wrote this code. For sure it's easy for you, but you simply don't care if somebody else might have problems trying to read it.


    Alex Suprun

Viewing 15 posts - 1 through 15 (of 38 total)

You must be logged in to reply to this topic. Login to reply