March 16, 2016 at 1:47 pm
I am continually plagued by performance problems. When I display the estimated execution plan, the estimated cost and number of rows are usually way off, the plans it chooses are often really poor, and it often suggests new indexes that don't help.
Right now I have a specific recurring issue I'd like to throw out there. I have a 500,000 row table with a two-column clustered primary key, let's call them A and B. I have some queries that join to this table on those two columns and they take an hour to run, returning about 25000 rows. SQL Server suggests I need a new nonclustered index on column B, and include column A. When I drop the primary key entirely, and don't create a new index, the query runs in a few seconds.
First, why would it suggest that index, and second, why does a primary key kill the query performance? I've already tried rebuilding indexes and updating stats with no improvement. I also tried creating that suggested index and it didn't help; it just came up with another suggested index that doesn't help either.
We're on 2008 R2. Thanks.
March 16, 2016 at 1:49 pm
Have you tried updating the statistics on that table?
March 16, 2016 at 1:54 pm
i'd like to see the actual query and an execution plan;
you might have a non-Sargable query, functions in the Where clause, a catch all query containing a lot of OR statements, or something else that is preventing a decent plan and performance.
i have lots of queries joining multiple tables with tens of millions of rows, and they don't perform poorly.
if you can provide more details, i'm willing to bet we can offer some solid guidance.
Lowell
March 16, 2016 at 5:36 pm
Thanks, yes, I updated statistics. In the interest of being able to show it conveniently here I reduced the query down to a very simple example. The original was more complicated and had a where clause of just a single column equal to a constant that has no effect on the issue.
I've attached two plan images. The first one is without a primary key; you will see a full table scan on the EKPO table. The second with a primary key (on columns EBELN and EBELP) has a clustered index scan. From what I see here I don't think the access of the EKPO table is the issue; you will see that the access to the MRP_RESULT_FACT table changed. This is what I see all the time -- a change is made to a table or query and triggers the optimizer to choose a completely different plan, changing things that had nothing to do with the actual change that was made.
SELECT
PURCH_DOC_FACT.PURCH_DOC_ID
,PURCH_DOC_FACT.PURCH_LINE_ITEM_ID
,PURCH_DOC_FACT.MATERIAL_ID
,PURCH_DOC_FACT.PURCH_DOC_DATE
,EKPO.ADPRI
,MRP_RESULT_FACT.MRP_DATE
FROM
PURCH_DOC_FACT
INNER JOIN EKPO ON (PURCH_DOC_FACT.PURCH_DOC_ID = EKPO.EBELN
AND PURCH_DOC_FACT.PURCH_LINE_ITEM_ID = EKPO.EBELP)
LEFT OUTER JOIN MRP_RESULT_FACT ON (PURCH_DOC_FACT.PURCH_DOC_ID = MRP_RESULT_FACT.MRP_ELEM_NUM
AND PURCH_DOC_FACT.PURCH_LINE_ITEM_ID = MRP_RESULT_FACT.MRP_ELEM_LINE_NUM)
March 16, 2016 at 8:23 pm
smithhead (3/16/2016)
Thanks, yes, I updated statistics. In the interest of being able to show it conveniently here I reduced the query down to a very simple example. The original was more complicated and had a where clause of just a single column equal to a constant that has no effect on the issue.I've attached two plan images. The first one is without a primary key; you will see a full table scan on the EKPO table. The second with a primary key (on columns EBELN and EBELP) has a clustered index scan. From what I see here I don't think the access of the EKPO table is the issue; you will see that the access to the MRP_RESULT_FACT table changed. This is what I see all the time -- a change is made to a table or query and triggers the optimizer to choose a completely different plan, changing things that had nothing to do with the actual change that was made.
SELECT
PURCH_DOC_FACT.PURCH_DOC_ID
,PURCH_DOC_FACT.PURCH_LINE_ITEM_ID
,PURCH_DOC_FACT.MATERIAL_ID
,PURCH_DOC_FACT.PURCH_DOC_DATE
,EKPO.ADPRI
,MRP_RESULT_FACT.MRP_DATE
FROM
PURCH_DOC_FACT
INNER JOIN EKPO ON (PURCH_DOC_FACT.PURCH_DOC_ID = EKPO.EBELN
AND PURCH_DOC_FACT.PURCH_LINE_ITEM_ID = EKPO.EBELP)
LEFT OUTER JOIN MRP_RESULT_FACT ON (PURCH_DOC_FACT.PURCH_DOC_ID = MRP_RESULT_FACT.MRP_ELEM_NUM
AND PURCH_DOC_FACT.PURCH_LINE_ITEM_ID = MRP_RESULT_FACT.MRP_ELEM_LINE_NUM)
Any chance of you posting the actual execution plans by saving the actual plans to file instead of snipping pictures? It would be interesting to see how many scans are actually being done.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 16, 2016 at 9:32 pm
What about the other tables in the query?
What indexes do they have?
Are there any FK constraints between these tables?
_____________
Code for TallyGenerator
March 17, 2016 at 2:50 am
There's no warning on the result operator to back this up (if the warning string is too long to display in the "hover over" property sheet, it can still be seen in the right-click property sheet), but that pattern of compute scalars, one for each table, between read and join, from a query which has eccentric performance characteristics, is often synonymous with implicit conversions.
What are the data types of the join columns in each table?
Use table aliasing to reduce noise in your queries:
SELECT
pd.PURCH_DOC_ID
,pd.PURCH_LINE_ITEM_ID
,pd.MATERIAL_ID
,pd.PURCH_DOC_DATE
,EKPO.ADPRI
,r.MRP_DATE
FROM PURCH_DOC_FACT pd
INNER JOIN EKPO
ON (pd.PURCH_DOC_ID = EKPO.EBELN
AND pd.PURCH_LINE_ITEM_ID = EKPO.EBELP)
LEFT OUTER JOIN MRP_RESULT_FACT r
ON (pd.PURCH_DOC_ID = r.MRP_ELEM_NUM
AND pd.PURCH_LINE_ITEM_ID = r.MRP_ELEM_LINE_NUM)
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 17, 2016 at 9:36 am
Thanks everyone. I can post more details later, but here are some quick comments/questions:
Good to know about compute scalars, I was wondering what that meant.
What do you mean by "warning on the result operator"?
There are no foreign keys in this database. (I didn't create this database.)
I generally use table aliasing; this query was auto-generated by a BI tool (but with a little cleanup by myself).
At any rate, I very much appreciate your expertise and will spend some more time looking at those things, but my main goal with this post is more general. I've used other databases for over 20 years and have some tuning experience, but I am relatively new to SQL Server and have never been so frustrated. It continually makes horrible plan choices, and query or table changes often trigger a strategy change that has nothing whatsoever to do with the change that was made. Its estimates are often way off, but supposedly we rebuild indexes and update statistics every day (I am not the DBA). Could there be something wrong with our configuration or management, or is the optimizer for 2008R2 just really bad? (We are on a path to upgrade in a few months so maybe that will help some?)
Thanks.
March 17, 2016 at 9:41 am
smithhead (3/17/2016)
Thanks everyone. I can post more details later, but here are some quick comments/questions:Good to know about compute scalars, I was wondering what that meant.
What do you mean by "warning on the result operator"?
There are no foreign keys in this database. (I didn't create this database.)
I generally use table aliasing; this query was auto-generated by a BI tool (but with a little cleanup by myself).
At any rate, I very much appreciate your expertise and will spend some more time looking at those things, but my main goal with this post is more general. I've used other databases for over 20 years and have some tuning experience, but I am relatively new to SQL Server and have never been so frustrated. It continually makes horrible plan choices, and query or table changes often trigger a strategy change that has nothing whatsoever to do with the change that was made. Its estimates are often way off, but supposedly we rebuild indexes and update statistics every day (I am not the DBA). Could there be something wrong with our configuration or management, or is the optimizer for 2008R2 just really bad? (We are on a path to upgrade in a few months so maybe that will help some?)
Thanks.
Any chance you could post up one of those execution plans? As a .sqlplan attachment? There are plenty of questions here which could be answered with just a quick peek...
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 17, 2016 at 9:56 am
What is the data type of the keys?
March 17, 2016 at 11:56 am
You need to review the indexes from scratch. That is, analyze SQL's index usage stats, missing index stats and index operational stats. The longer time period you have these stats for the better, as they will better reflect the actual usage of the table and its indexes.
Hopefully you have another copy of the db, either live or as a backup, where you haven't dropped the existing pk yet, so you can still see all the stats on that index too. When an index is dropped, the corresponding stats are also dropped (except perhaps missing index stats, that one depends).
By analyzing the stats you can determine:
1) the best clustered index on every table -- this is the most critical factor in performance.
2) any nonclus index(es) you still need, with included column(s).
There is an iterative process. But it gets better as it goes along.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
March 17, 2016 at 2:51 pm
smithhead (3/16/2016)
Thanks, yes, I updated statistics. In the interest of being able to show it conveniently here I reduced the query down to a very simple example. The original was more complicated and had a where clause of just a single column equal to a constant that has no effect on the issue.I've attached two plan images. The first one is without a primary key; you will see a full table scan on the EKPO table. The second with a primary key (on columns EBELN and EBELP) has a clustered index scan. From what I see here I don't think the access of the EKPO table is the issue; you will see that the access to the MRP_RESULT_FACT table changed. This is what I see all the time -- a change is made to a table or query and triggers the optimizer to choose a completely different plan, changing things that had nothing to do with the actual change that was made.
SELECT
PURCH_DOC_FACT.PURCH_DOC_ID
,PURCH_DOC_FACT.PURCH_LINE_ITEM_ID
,PURCH_DOC_FACT.MATERIAL_ID
,PURCH_DOC_FACT.PURCH_DOC_DATE
,EKPO.ADPRI
,MRP_RESULT_FACT.MRP_DATE
FROM
PURCH_DOC_FACT
INNER JOIN EKPO ON (PURCH_DOC_FACT.PURCH_DOC_ID = EKPO.EBELN
AND PURCH_DOC_FACT.PURCH_LINE_ITEM_ID = EKPO.EBELP)
LEFT OUTER JOIN MRP_RESULT_FACT ON (PURCH_DOC_FACT.PURCH_DOC_ID = MRP_RESULT_FACT.MRP_ELEM_NUM
AND PURCH_DOC_FACT.PURCH_LINE_ITEM_ID = MRP_RESULT_FACT.MRP_ELEM_LINE_NUM)
Just an FYI, A table scan on a heap (no clustered index) and a clustered index scan are actually the same thing, a table scan.
March 17, 2016 at 3:18 pm
Lynn Pettis (3/17/2016)
smithhead (3/16/2016)
Thanks, yes, I updated statistics. In the interest of being able to show it conveniently here I reduced the query down to a very simple example. The original was more complicated and had a where clause of just a single column equal to a constant that has no effect on the issue.I've attached two plan images. The first one is without a primary key; you will see a full table scan on the EKPO table. The second with a primary key (on columns EBELN and EBELP) has a clustered index scan. From what I see here I don't think the access of the EKPO table is the issue; you will see that the access to the MRP_RESULT_FACT table changed. This is what I see all the time -- a change is made to a table or query and triggers the optimizer to choose a completely different plan, changing things that had nothing to do with the actual change that was made.
SELECT
PURCH_DOC_FACT.PURCH_DOC_ID
,PURCH_DOC_FACT.PURCH_LINE_ITEM_ID
,PURCH_DOC_FACT.MATERIAL_ID
,PURCH_DOC_FACT.PURCH_DOC_DATE
,EKPO.ADPRI
,MRP_RESULT_FACT.MRP_DATE
FROM
PURCH_DOC_FACT
INNER JOIN EKPO ON (PURCH_DOC_FACT.PURCH_DOC_ID = EKPO.EBELN
AND PURCH_DOC_FACT.PURCH_LINE_ITEM_ID = EKPO.EBELP)
LEFT OUTER JOIN MRP_RESULT_FACT ON (PURCH_DOC_FACT.PURCH_DOC_ID = MRP_RESULT_FACT.MRP_ELEM_NUM
AND PURCH_DOC_FACT.PURCH_LINE_ITEM_ID = MRP_RESULT_FACT.MRP_ELEM_LINE_NUM)
Just an FYI, A table scan on a heap (no clustered index) and a clustered index scan are actually the same thing, a table scan.
Almost, except you won't get a merge join from a table scan without a sort first.
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
March 18, 2016 at 4:54 pm
That's not really a lot of records. I'm in the data type mismatch of your join fields camp, but also, how much memory does your server have and what else is it doing? Could your instance be starved for memory?
-----
[font="Arial"]Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson[/font]
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply