November 6, 2011 at 10:01 am
This is a very wierd, major performance issue so any help/pointers, grately appreciated.
We migrating from SQL 2005 to SQL 2008R2 SP1 64-bit, and use Navision, so we don't have full control over the code.
For all but 1 table, when an sp_cursoropen is run it is very quick, but for 1 table, it takes 50 seconds. over 3 million writes and several thousand reads.
On another server with the same SQL version and database it runs this exact same statement (as per SQL profiler) with only 70 reads. The table has over 500k records and is over 1.5GB in total size, so I guess teh sp_cursoropen defines the cursor rather than runs the query, as given the huge reads/writes I'm thinking it is running the query, but on the other servers it is not doing this.
All SQLs are the same with same options, the only difference is the slow server is Windows 2008 R2 Standard SP1 and the fast ones are Windows 2008 R2 Enterprise SP1.
So the main question is, how would I investigate further an sp_cursoropen performing many reads when it doesn't perform many reads on other servers.
The execution plans are identical and look correct.
We have rebuilt the indexes and updated statistics and even read the entire table into cache first, but I can see that only 2.7GB of SQL data cache is used, so its not a RAM issue as such.
Thanks in Adance.
November 6, 2011 at 2:55 pm
sotn (11/6/2011)
This is a very wierd, major performance issue so any help/pointers, grately appreciated.We migrating from SQL 2005 to SQL 2008R2 SP1 64-bit, and use Navision, so we don't have full control over the code.
For all but 1 table, when an sp_cursoropen is run it is very quick, but for 1 table, it takes 50 seconds. over 3 million writes and several thousand reads.
On another server with the same SQL version and database it runs this exact same statement (as per SQL profiler) with only 70 reads. The table has over 500k records and is over 1.5GB in total size, so I guess teh sp_cursoropen defines the cursor rather than runs the query, as given the huge reads/writes I'm thinking it is running the query, but on the other servers it is not doing this.
All SQLs are the same with same options, the only difference is the slow server is Windows 2008 R2 Standard SP1 and the fast ones are Windows 2008 R2 Enterprise SP1.
So the main question is, how would I investigate further an sp_cursoropen performing many reads when it doesn't perform many reads on other servers.
The execution plans are identical and look correct.
We have rebuilt the indexes and updated statistics and even read the entire table into cache first, but I can see that only 2.7GB of SQL data cache is used, so its not a RAM issue as such.
As far as I remember sp_cursoropen defines the cursor and also includes the code that populates it therefore the cursor gets populated during the call.
Have you traced the actual sql statement?
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.November 7, 2011 at 6:12 am
Ok I've managed to find further info.
There is a slight difference in the execution plans, which becomes obvious when I include the showplan xml statistics profile and not just the showplan xml in sql profiler.
The actual query begin performed is this
declare @p1 int
set @p1=180150113
declare @p3 int
set @p3=16
declare @p4 int
set @p4=1
declare @p5 int
set @p5=6
exec sp_cursoropen @p1 output,N'SELECT *,DATALENGTH("Picture") FROM "Database1"."dbo"."Item" WHERE "No_">=@P1 ORDER BY "No_" ',@p3 output,@p4 output,@p5 output,N'@P1 varchar(20)',''
select @p1, @p3, @p4, @p5
This query runs perfectly on other servers.
The 'slow' server instead of having an execution plan of
Clusted Index Insert <- Compute Scalar <- Compute Scalar <- Clustered Index Seek
Where the only 'wide' arrow is between the 2 Compute Scalar items
Has this plan instead
Clustered Index Insert <- Sequence Project <- Segment <- Compute Scalar <- Clustered Index Seek
Where every arrow is wide, so lots of data.
On another 'slow' server we ran an update statistics the ITEM and that seems to have fixed that server as it now gives a fast response and it's execution plan matches the fast servers now.
The problem now is that with this specific server, updating the statistics has had no effect, the same poor execution plan is still used.
Any ideas as to how to proceed, we cannot change the query, and as it works perfectly well on other server there is no need, so its more an issue with this table on this server?
November 7, 2011 at 6:20 am
Think about how this needs to access the data.
The server needs to access all the skus greater than 18...%.
That might mean the whole table.
Then it needs to open all the lob pages of all the images of the said returned rows. Calculate the size of that image and then perform and order by.
I have navision here as well and I have the same query running. The simple difference might be that we don't have many pictures in our system.
Unfortunately the only way I can suggest to solve this is to rewrite the query, which you can't do in Navision.
November 7, 2011 at 6:27 am
Indeed, but as this query against the original copy of this database on server which is a much slower server (on paper) runs fine, then I know its nothing to do with table/record size or indexes as they are all the same so updating statistics should be all I need to do to replicate the same or better performance I would have thought.
Curious though, you have a name of Ninja but a picture of a Jedi 🙂
November 7, 2011 at 6:36 am
Yes but the sql version is different. The query engine might choose a different plan and aside from plan guides there's nothing you can do about that.
Nothing up with the ninja vs jedi picture.
November 8, 2011 at 9:46 am
sotn (11/6/2011)
This is a very wierd, major performance issue so any help/pointers, grately appreciated.We migrating from SQL 2005 to SQL 2008R2 SP1 64-bit, and use Navision, so we don't have full control over the code.
For all but 1 table, when an sp_cursoropen is run it is very quick, but for 1 table, it takes 50 seconds. over 3 million writes and several thousand reads.
On another server with the same SQL version and database it runs this exact same statement (as per SQL profiler) with only 70 reads. The table has over 500k records and is over 1.5GB in total size, so I guess teh sp_cursoropen defines the cursor rather than runs the query, as given the huge reads/writes I'm thinking it is running the query, but on the other servers it is not doing this.
All SQLs are the same with same options, the only difference is the slow server is Windows 2008 R2 Standard SP1 and the fast ones are Windows 2008 R2 Enterprise SP1.
So the main question is, how would I investigate further an sp_cursoropen performing many reads when it doesn't perform many reads on other servers.
The execution plans are identical and look correct.
We have rebuilt the indexes and updated statistics and even read the entire table into cache first, but I can see that only 2.7GB of SQL data cache is used, so its not a RAM issue as such.
Thanks in Adance.
1) you rebuilt indexes so that addresses that requirement, but when you updated stats, did you do so with an EXPLICIT FULL SCAN on EVERY statistic in the database? That is mandatory I think on a version upgrade
2) have you checked CU1, 2 and 3 for 2008 R2 SP1 for any relevant query plan regression fixes that might apply to this scenario?
3) given that different server with exact same sql version runs query properly, have you flushed the procedure cache to check for parameter sniffing issue?
4) you could force the issue by using a plan guide I think, but that could really screw you if the input parameter(s) change
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
November 8, 2011 at 10:01 am
I rebuilt indexes/update stats with fullscan etc.
But, I then to test, copied the content to another table, truncated the original 700k row table, and copied 10k back and ran the application, then loaded the remaining 690k and it works but I cannot get it to use the correct plan without doing this step, so not sure if it would come back. This happens if I restart SQL completely or copy to other servers etc.
This is a backup from sql2005 to sql 2008r2 sp1, and for other databases in different countries doing the same exercise from their sql2005 database works fine, its just 1 country. All SQL2008 are the same version, sp1.
The execution plan differs like this
Quick plan (left-to-right): CPU 15, Reads 40, Writes 4, Duration 31
Clustered Index Seek 7%
Compute Scalar 0%
Compute Scalar 0%
Clustered Index Insert (tempdb) 93%
Between clustered index seek and computer scalar, thin arrow saying 0 rows
Between the 2 compute scalar there is a wide arrow, saying 700,000 rows
Then between the 2nd compute scalar and the Clusted Index Insert there is a think arrow saying 0 rows
Slow plan: CPU 29,999, Reads 3,846,583, Writes 158,571, Duration 30,007
Clustered Index Seek 7%
Compute Scalar 0%
Segment 0%
Sequence Project (Computer Scalar) 0%
Clustered Index Insert (tempdb) 93%
For the Showplan XML
Between clustered index seek and computer scalar, thin arrow saying 0 rows
Between the compute scalar and segment there is a wide arrow, saying 700,000 rows
Between segment and sequence project a thin arrow with 0 rows
Between sequence project and the clustered index insert (tempdb) a thin arrow saying 0 rows
For the Showplan XML Statistics Profile
Between clustered index seek and computer scalar, wide arrow saying 700,000 rows
Between the compute scalar and segment there is a wide arrow, saying 700,000 rows
Between segment and sequence project a wide arrow with 700,000 rows
Between sequence project and the clustered index insert (tempdb) a wide arrow saying 700,000 rows
Both the good/bad plan use the same index so its not completely a statistics problem, the difference is from 2 compute scalar operators to 1 compute scalar, 1 segment and 1 sequence project operator.
So by reloading data in a small way then running the queries this seems to have the effect of resetting the execution plan.
I've even copied the table as it to another database on another server (of same schema) and it caused the same problem, so it must be something to do with data content, but I thought statistics was just to do with which index to use.
November 8, 2011 at 10:10 am
Please save the plans as .sqlplan and upload here. You can't possibly retype everything we need.
November 8, 2011 at 10:38 am
It seems to be a bad estimate forcing the ORDER BY to spill to tempdb. Agree we need the actual plans.
Are you ABSOLUTELY CERTAIN you are passing the EXACT SAME values the FIRST TIME on both servers? If not, then parameter sniffing is likely the cause.
I suppose it could also be a different memory/CPU configuration between the two machines that is changing SORT memory and leading to the spill too. Seems like a stretch for that though.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
November 8, 2011 at 10:45 am
Thanks for all the replies.
This is from Navision, so is out of my control, from a SQL perspective.
All I can say is that on SQL2005 it works fine for all countries, then all countries have their specific DB backed up, copied and then restored on to SQL2008R2 SP1 (DB left in 2005 mode) and all are ok apart from 1 country, so their table/content is different, which gives this new plan and only gives correct plan if I empty the table then repopulate with a small set first then run navision then populate with the rest of the data. I've actually replaced just the table on a working copy to ths copy from this country and it produced the problem, so I would first think indexes as the bad one has a few extra of them, but the plan says the clustered/primary one, and statistics only affect the index choice don't they?
But I have update statistics tableA FULLSCAN anyway and rebuild all the indexes.
I shall try to get and upload the .sqlplan files both for the good plan and for the bad plan.
November 8, 2011 at 12:56 pm
sotn (11/8/2011)
Thanks for all the replies.This is from Navision, so is out of my control, from a SQL perspective.
All I can say is that on SQL2005 it works fine for all countries, then all countries have their specific DB backed up, copied and then restored on to SQL2008R2 SP1 (DB left in 2005 mode) and all are ok apart from 1 country, so their table/content is different, which gives this new plan and only gives correct plan if I empty the table then repopulate with a small set first then run navision then populate with the rest of the data. I've actually replaced just the table on a working copy to ths copy from this country and it produced the problem, so I would first think indexes as the bad one has a few extra of them, but the plan says the clustered/primary one, and statistics only affect the index choice don't they?
But I have update statistics tableA FULLSCAN anyway and rebuild all the indexes.
I shall try to get and upload the .sqlplan files both for the good plan and for the bad plan.
Statistics are used for MUCH more than index choice!! And when variables are in play you can get a plan in the cache that is great for first value but suck @ss for all the rest of the calls. You can also wind up with things such as implicit_conversions causing scans instead of seeks as well as bad estimates leading to all kinds of unfortunate things. IF the bad query is always called with the same (or statistically similar) value you definitely can consider a plan guide that you create from the server that makes an efficient plan for this country database.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
November 9, 2011 at 3:09 am
Thanks for the extra info.
The wierd thing from my viewpoint is that If I just back/restore from sql2005 to sql2008r2, then the problem exists, even if I restart SQL rebuild with drop_existing all indexes and update statistics with fullscan, but then emtpying and repopulateing a small section fixes the problem, then a suebsequent recreate index/stats/restart SQL means it is ok, so it most likely is a statistics issue, just not one I've seen before as copying just the item table to another server reproduces the problem even on a server that was working fine. Regarding the query it is identical every time, and from Navision so that code has not changed in some time.
The actual query is this
declare @p1 int
set @p1=180150113
declare @p3 int
set @p3=16
declare @p4 int
set @p4=1
declare @p5 int
set @p5=6
exec sp_cursoropen @p1 output,N'SELECT *,DATALENGTH("Picture") FROM "DB1"."dbo"."Item" WHERE "No_">=@P1 ORDER BY "No_" ',@p3 output,@p4 output,@p5 output,N'@P1 varchar(20)',''
select @p1, @p3, @p4, @p5
I know this is not a 'good' query, but this is the way Navision works and for all other countries it works fine and we cannot change the query.
I hope this helps.
November 9, 2011 at 6:04 am
sotn (11/8/2011)
so their table/content is different
What about the collation of the character based columns of that table? COLLATION can make a HUGE impact on performance and, IIRC, they made some changes in some of the COLLATIONs between 2005 and 2008.
Also, are you running the DB in the 2005 compatibility mode or have you changed it to 2008?
--Jeff Moden
Change is inevitable... Change for the better is not.
November 9, 2011 at 6:10 am
The db is left in 2005 mode
I have tried 2008 mode but it made no difference.
The collation has not changed, it is Latin1_General_CI_AS
On other servers exactly the same setting, there are no problems.
The only difference to this table are a few more rows, some different records and some index differences, but the primary key is the same, and this is what is chosen on the plan, so it must be statistics related, but I have performed an update statistics with full scan on the table but that did not make any difference.
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply