April 26, 2017 at 1:32 am
We are currently experiencing some strange phenomenons, and I am hoping that someone may be able to shed some light on this.
System: 4CPU x10 processors, 512GB memory, NetApp IO subsystem
SQL2012 SP3 CU7 Std Ed. - 16 core, 128GB
SQL instance used in ETL process.
Change: Add luns to DB Server
Move larger tables to filegroups on new luns.
Shrink DB to leave 20% free in DB files.
Post change observations (the problem)
A SSIS Package went from running in 2 minutes to 10 minutes.
The major step is a SQL query (90% of package duration) involving two tables - these were NOT moved in the change above.
Query;
insert into x1
Select from tabB
where col1 not in (select col1 from tabA)
Table info:
tabA - 500k+ rows, clustered +3 NC indices.
tabB - 300k+ rows heap - no NC indices.
Execution Plan:
scan tabB
index seek on tabA.ind2
index scan on tabA.clustered
index scan on tabA.ind4
we investigated the tables rebuild / reorganized indices and updated stats. - this caused the query duration to prolong to between 13-17 minutes
investigating the heap, i found 100k forwarded_row_count and rebuilt the heap. - query ran in 20 minutes.
Can anyone shed some light onto what may be happening here? - I am trying to understand why "std maintenance" tasks designed to improve performance is resulting in exactly the opposite
Thanks
Edit: from other internet hits on topic mentioning statistics sampling, I checked sampled rows. This turned out to be <10% of the rows in the indices. investigating further....
April 26, 2017 at 5:06 am
AnzioBake - Wednesday, April 26, 2017 1:32 AMWe are currently experiencing some strange phenomenons, and I am hoping that someone may be able to shed some light on this.
System: 4CPU x10 processors, 512GB memory, NetApp IO subsystem
SQL2012 SP3 CU7 Std Ed. - 16 core, 128GB
SQL instance used in ETL process.Change: Add luns to DB Server
Move larger tables to filegroups on new luns.
Shrink DB to leave 20% free in DB files.
Post change observations (the problem)
A SSIS Package went from running in 2 minutes to 10 minutes.
The major step is a SQL query (90% of package duration) involving two tables - these were NOT moved in the change above.
Query;
insert into x1
Select from tabB
where col1 not in (select col1 from tabA)
Table info:
tabA - 500k+ rows, clustered +3 NC indices.
tabB - 300k+ rows heap - no NC indices.
Execution Plan:
scan tabB
index seek on tabA.ind2
index scan on tabA.clustered
index scan on tabA.ind4
we investigated the tables rebuild / reorganized indices and updated stats. - this caused the query duration to prolong to between 13-17 minutes
investigating the heap, i found 100k forwarded_row_count and rebuilt the heap. - query ran in 20 minutes.
Can anyone shed some light onto what may be happening here? - I am trying to understand why "std maintenance" tasks designed to improve performance is resulting in exactly the opposite
ThanksEdit: from other internet hits on topic mentioning statistics sampling, I checked sampled rows. This turned out to be <10% of the rows in the indices. investigating further....
Can you post an execution plan for that query? It seems odd that there are three operators for Table A.
Preferably an actual plan, but an estimated plan would be useful too.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
April 26, 2017 at 5:48 am
ChrisM@Work - Wednesday, April 26, 2017 5:06 AMCan you post an execution plan for that query? It seems odd that there are three operators for Table A.
Preferably an actual plan, but an estimated plan would be useful too.
Actual Query Plan attached
April 26, 2017 at 6:22 am
We have confirmed that updating statistics with fullscan did not improve query performance
April 26, 2017 at 7:14 am
GG-173138 - Wednesday, April 26, 2017 5:48 AMChrisM@Work - Wednesday, April 26, 2017 5:06 AMCan you post an execution plan for that query? It seems odd that there are three operators for Table A.
Preferably an actual plan, but an estimated plan would be useful too.Actual Query Plan attached
Excellent, thanks. That's awesome...
In the meantime, here's a logical equivalent which I'd expect to be faster - it's certainly cleaner:SELECT ...
FROM dbo.TfmInvestorAccountTmp t
WHERE NOT EXISTS (SELECT 1 FROM dbo.TfmInvestor i WHERE i.investorCIFNumber = t.investorCIFNumber)
EDIT: Here's why...
https://sqlperformance.com/2012/12/t-sql-queries/left-anti-semi-join
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
April 26, 2017 at 8:31 am
thanks, but we already have a solution. we are just interested in an actual explainantion for future reference i.e why index maintenance resulted in this sudden change in behaviour
April 26, 2017 at 9:03 am
AnzioBake - Wednesday, April 26, 2017 1:32 AMChange: Add luns to DB Server
Move larger tables to filegroups on new luns.
Shrink DB to leave 20% free in DB files.
Post change observations (the problem)
A SSIS Package went from running in 2 minutes to 10 minutes.
The major step is a SQL query (90% of package duration) involving two tables - these were NOT moved in the change above.
Offhand I'm thinking the Shrink DB is a more likely culprit than index maintenance for the sudden change in behavior, especially since the 2 tables were not moved to the new file group, so they would have been most affected by the shrink when you tried to reclaim free space.
April 26, 2017 at 10:41 am
We came to the similar conclusion, i.e. the shrink affected the table. That is when we started looking at the table and index (statistics) statuses. Rebuild indices and the table heap just worsened performance. That is what we are struggling to understand - it did not just make it slightly worse, it doubled the execution time. I neglected to mention that the data volumes also did not increase significantly in either table.
April 26, 2017 at 11:17 am
it's understandable, to shrink a database file, SQL Server will need to rearrange all the pages that are at or near the end of the file into other space that was already free or freed when you moved the other tables to the new filegroup. When it does this, it doesn't really know the best way to rearrange those pages, it's just trying to do so as quickly as possible. Paul Randal can explain it better than I ever could though:
https://www.sqlskills.com/blogs/paul/why-you-should-not-shrink-your-data-files/
As for your first index and stats maintenance, depending on how badly fragmented your index and heap were, and the fact that starting with SQL Server 2012 it doesn't do a full statistics recalculate with the index build anymore, the statistics could have gotten skewed depending on the sample:
https://docs.microsoft.com/en-us/sql/relational-databases/indexes/reorganize-and-rebuild-indexes
April 27, 2017 at 12:21 am
Thanks Chris for the response.
As we moved 400GB of data from the source files, and it is very unlikely to be require in the near future, the shrink was required.
Subsequent to the index rebuild and test we checked the stats and ran a stats update with fullscan and still did not see any performance improvement.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply