update statistics during production ?

Question

Post reply

update statistics during production ?

Viewing 15 posts - 16 through 30 (of 36 total)

You must be logged in to reply to this topic. Login to reply

tony28 SSCertifiable Points: 7114 More actions · Answer 1

So I needed update statistics again with FULL SCAN.. but only on TABLE1,TABLE2

now result is ok....

So Do I have update statistics with full scan ? now I have simple

anybody knows why it was happen?

every time was used good index, but what I saw, problem was in Execution plan in Seek Predicates :

When I had wrong result it was 1PKCOLUMN+2PKCOLUMN

When I have good result it was only 1PKCOLUMN

so I will try check before/after with DBCC SHOW_STATISTICS and SET STATISTICS PROFILE, hope that will find something.. because now I dont understand this situation

I am sorry for my english, if something is not clear I will try explain again. Thanks

tony28 SSCertifiable Points: 7114 More actions · Answer 2

So I tested again..

I executed query before updated stats :

Select * from TABLE1 A

LEFT JOIN TABLE2 B

ON

A.1PKCOLUMN=B.1PKCOLUMN

AND A.2PKCOLUMN=B.2PKCOLUMN

WHERE A.1PKCOLUMN='DATE' AND A.1PKCOLUMN+A.2PKCOLUMN>'DATE+SEQ'

OPTION(RECOMPILE)

SET STATISTICS PROFILE showed wrong ex.plan with second condition in where

So I used this for update statistics

EXECUTE msdb.dbo.IndexOptimize

@Databases = 'databases',

@FragmentationLow = NULL,

@FragmentationMedium = NULL,

@FragmentationHigh = NULL,

@UpdateStatistics = 'ALL',

@OnlyModifiedStatistics='Y',

@LogToTable='Y'

In log of this command I found

- PK_TABLE1 and another stats was updated for TABLE1

- PK_TABLE2 and another stats was updated for TABLE2

I again executed query:

Select * from TABLE1 A

LEFT JOIN TABLE2 B

ON

A.1PKCOLUMN=B.1PKCOLUMN

AND A.2PKCOLUMN=B.2PKCOLUMN

WHERE A.1PKCOLUMN='DATE' AND A.1PKCOLUMN+A.2PKCOLUMN>'DATE+SEQ'

OPTION(RECOMPILE)

SET STATISTICS PROFILE showed wrong ex.plan with second condition in where

Check with DBCC SHOWSTATISTICS where I found that in one of these table is not today value( today date )

Another tables from database has RANGE_HI_KEY 20160229 and EQ_ROWS 174,107 or something like this, but table in query which I executed before is looked like this statistics are LOCK or something.

I tried again execute command for update statistics what is above, but wasnt helpful...

After UPDATE STATISTICS TABLE1 WITH FULLSCAN this RANGE_HI_KEY 20160229 was appeared and it is OK.

What is symptom of this? It means that If I execute query before update but in condition with new data, which arent statistics will be lock or will not update well?

Can anybode explain me?

Thanks

Grant Fritchey SSC Guru Points: 399379 More actions · Answer 3

Sorry, been on the road.

First, you have OPTION RECOMPILE on your query. Whether your update your statistics or not, you're going to get a recompile every time that query runs. Next, you need to look at your statistics to understand why the optimizer is making the choices it makes. Yes, the exec plan will tell you what those choices are, but you need to take it's row estimates to your statistics to understand where they came from. That will tell you what you need to know. It sounds like a sampled statistics update was not adequate to show the changes in your data. Instead, you needed the FULL SCAN. That's OK as long as the added load and time it takes doesn't negatively impact the system.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Grant Fritchey SSC Guru Points: 399379 More actions · Answer 4

In SQL Server 2012 and earlier, if the value falled outside the range of statistics, the optimizer assumed that the number of values returned was 1 row. This changed in SQL Server 2014 with the updated estimator. It now assumes an average of the rows in the range. Either assumption could cause problems with the row estimates and the execution plan. Updating the statistics with a FULL SCAN resulted in a more accurate row estimate and therefore a better execution plan.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

tony28 SSCertifiable Points: 7114 More actions · Answer 5

Grant Fritchey (2/29/2016)
Sorry, been on the road.
First, you have OPTION RECOMPILE on your query. Whether your update your statistics or not, you're going to get a recompile every time that query runs. Next, you need to look at your statistics to understand why the optimizer is making the choices it makes. Yes, the exec plan will tell you what those choices are, but you need to take it's row estimates to your statistics to understand where they came from. That will tell you what you need to know. It sounds like a sampled statistics update was not adequate to show the changes in your data. Instead, you needed the FULL SCAN. That's OK as long as the added load and time it takes doesn't negatively impact the system.

for RECOMPILE, I know that should be works like you write, but this behaviour I dont understand so much.

I found that this problem with choosing wrong ex.plan is due to missing new row in statistics of one of these tables from tested query. On next 50 or more tables was all without problem. But these another I tested only after, not before.

It means, I have for example 15tables where is for example one column different, I work for auto industry, so for example one table Cocpit, second Engine, third FrontChassis and etc. This tables are Header, so it means this is still same , it means you have Automat 2.0 diesel or something. ... another tables has also different data(specification of these tables) but here is same... In this time when I updated stats, in each of these tables were for example 20rows.... and after update in all statistics were these 20rows except one what I tested.. oh very hard with my english, hope that you understand

But its look like same lock with open session or something like this(not meaning lock wait, just example like lock ) only for one of these tables and cannot update ..

I will test tomorrow again, because it looks like this Sample is ok on all tables in database except one which I tested in query.

So I will not test before, only after and will see what I will get..

Thanks

tony28 SSCertifiable Points: 7114 More actions · Answer 6

Grant Fritchey (2/29/2016)
In SQL Server 2012 and earlier, if the value falled outside the range of statistics, the optimizer assumed that the number of values returned was 1 row. This changed in SQL Server 2014 with the updated estimator. It now assumes an average of the rows in the range. Either assumption could cause problems with the row estimates and the execution plan. Updating the statistics with a FULL SCAN resulted in a more accurate row estimate and therefore a better execution plan.

thanks for good note, I will see tomorrow testing and depends on this will do some action.

I have 2012 now.

This is not some big problem what I write here, because after optimizing design of query, time and io of this procedures were for example from 300ms to 3ms and from xxx xxx logical reads to xxx.. now we have 172GB memory and everytime was full, after few months optimizing is about 100GB, when somebody select big data, but never know in future and why have useless logical reads ...

so I would like to better understand and I am curious from nature :]

Thanks

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 7

tony28 (2/26/2016)
WHERE A.1PKCOLUMN='DATE' AND A.1PKCOLUMN+A.2PKCOLUMN>'DATE+SEQ'

So why not write this as

WHERE A.1PKCOLUMN='DATE'

AND A.2PKCOLUMN > '+SEQ'

That would make it far easier for SQL Server to estimate the cardinality, and it does exactly the same.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

tony28 SSCertifiable Points: 7114 More actions · Answer 8

Hugo Kornelis (2/29/2016)
tony28 (2/26/2016)
WHERE A.1PKCOLUMN='DATE' AND A.1PKCOLUMN+A.2PKCOLUMN>'DATE+SEQ'
So why not write this as
WHERE A.1PKCOLUMN='DATE'
AND A.2PKCOLUMN > '+SEQ'
That would make it far easier for SQL Server to estimate the cardinality, and it does exactly the same.

Thanks for note.

My mistake.. In lot of times I use

WHERE A.1PKCOLUMN >= 'DATE' AND A.1PKCOLUMN+A.2PKCOLUMN>'DATE+SEQ'

this is due to change data after night...it means i have 201602299999 and then 201603010001 .... but still I need in one table.

but it doesnt matter why stats has this behaviour ..

tony28 SSCertifiable Points: 7114 More actions · Answer 9

So today I checked and it looks like Grant Fritchey wrote that Sample statistics dont do on all tables what I want...Interesting is that these tables have same count of rows.

I will thinking if I will do only sample more times and little will be helpful or will try FULLSCAN, not sure if sometimes can have effect for production.. so will test

Thanks

tony28 SSCertifiable Points: 7114 More actions · Answer 10

Test was without problem,,

update statistics for real database took about 1hour and there was some little performance increasing, so I think it can be, but has to be managed that during some lot of works has to be stopped..

Like you can see on picture... normaly we are about 20-35%, during this 1hour is about 50%...

So what is your opinion ? One per day do FULLSCAN ? Or keep this sample, which will help sometimes...

So now I think depends on my decision what will be better,,, sometimes pressure for CPU or sometimes scan all tables .. :]

Thanks all guys for discussion and advices.

Grant Fritchey SSC Guru Points: 399379 More actions · Answer 11

There is no right answer. It's a set of trade-offs. Sacrifice some CPU and I/O to update the statistics and rebuild any execution plans versus have execution plans be out of date or off because of incorrect row estimates. You have to balance and test and adjust to arrive at the right answer.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 12

tony28 (2/29/2016)
Hugo Kornelis (2/29/2016)
tony28 (2/26/2016)
WHERE A.1PKCOLUMN='DATE' AND A.1PKCOLUMN+A.2PKCOLUMN>'DATE+SEQ'
So why not write this as
WHERE A.1PKCOLUMN='DATE'
AND A.2PKCOLUMN > '+SEQ'
That would make it far easier for SQL Server to estimate the cardinality, and it does exactly the same.
Thanks for note.
My mistake.. In lot of times I use
WHERE A.1PKCOLUMN >= 'DATE' AND A.1PKCOLUMN+A.2PKCOLUMN>'DATE+SEQ'
this is due to change data after night...it means i have 201602299999 and then 201603010001 .... but still I need in one table.
but it doesnt matter why stats has this behaviour ..

I'm a bit confused by the sudden change from fixed strings 'DATE' and '+SEQ' to some large numbers. So not sure if the rest of what I'm going to write makes any sense in the context of your actual data model and actual query.

Anyway, here is another alternative that might be easier for SQL Server to estimate. (And please stress the word "might"):

WHERE A.1PKCOLUMN >= 'DATE'

AND NOT (A.1PKCOLUMN = DATE AND A.2PKCOLUMN < '_SEQ'

You should also consider the basic table design. If 1PKCOLUMN and 2PKCOLUMN are often combined in this way, then you have probably split up a single attribute over multiple columns; undo that by combining them as a single column and your query becomes a lot easier: WHERE A.NewColumn >= 'DATE+SEQ'

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

WHug Ten Centuries Points: 1248 More actions · Answer 13

What are the row counts of the affected tables? If the tables are large, and the rows changing are under certain threshold (20%?) then stats may not auto-update. Trace Flag 2371 may help.

Review

https://support.microsoft.com/en-us/kb/2754171

and

https://technet.microsoft.com/en-us/library/jj553546(v=sql.110).aspx

tony28 SSCertifiable Points: 7114 More actions · Answer 14

Hugo Kornelis (3/1/2016)
tony28 (2/29/2016)
Hugo Kornelis (2/29/2016)
tony28 (2/26/2016)
WHERE A.1PKCOLUMN='DATE' AND A.1PKCOLUMN+A.2PKCOLUMN>'DATE+SEQ'
So why not write this as
WHERE A.1PKCOLUMN='DATE'
AND A.2PKCOLUMN > '+SEQ'
That would make it far easier for SQL Server to estimate the cardinality, and it does exactly the same.
Thanks for note.
My mistake.. In lot of times I use
WHERE A.1PKCOLUMN >= 'DATE' AND A.1PKCOLUMN+A.2PKCOLUMN>'DATE+SEQ'
this is due to change data after night...it means i have 201602299999 and then 201603010001 .... but still I need in one table.
but it doesnt matter why stats has this behaviour ..
I'm a bit confused by the sudden change from fixed strings 'DATE' and '+SEQ' to some large numbers. So not sure if the rest of what I'm going to write makes any sense in the context of your actual data model and actual query.
Anyway, here is another alternative that might be easier for SQL Server to estimate. (And please stress the word "might"):
WHERE A.1PKCOLUMN >= 'DATE'
AND NOT (A.1PKCOLUMN = DATE AND A.2PKCOLUMN < '_SEQ'
You should also consider the basic table design. If 1PKCOLUMN and 2PKCOLUMN are often combined in this way, then you have probably split up a single attribute over multiple columns; undo that by combining them as a single column and your query becomes a lot easier: WHERE A.NewColumn >= 'DATE+SEQ'

WHERE A.1PKCOLUMN>='1PKCOLUMN' AND A.1PKCOLUMN+A.2PKCOLUMN > '1PKCOLUMN'+'2PKCOLUMN'

First I think it was my mistake that I wrote "NameOfColumn" instead of right values... so DATE is char(8) like 20160302 and seq is char(4) like from 0001 till 9999...

Like you now, sometimes is very hard to change design of the tables if lot of applications select something, and you cannot modify applications or the design created someone else...

This is reason why I have 1PKCOLUMN in where alone, and then combination..before than I started work here it was only combination and this result was very, but very poor. And almost every time scan all tables, now is only if statistics are not updated...

Now second day I am trying do Update Statistics With FULLSCAN and it looks like little helpful, but it was increased some CXPacket and some wait with CPU...

But I think that with current status of our server, which is overated it shouldnt have some effect for production database...

Thanks

tony28 SSCertifiable Points: 7114 More actions · Answer 15

WHug (3/1/2016)
What are the row counts of the affected tables? If the tables are large, and the rows changing are under certain threshold (20%?) then stats may not auto-update. Trace Flag 2371 may help.
Review
https://support.microsoft.com/en-us/kb/2754171
and
https://technet.microsoft.com/en-us/library/jj553546(v=sql.110).aspx

Hello,

I am not talking about auto-update, here is not possible, because like you wrote .... I am talking about manual statistics update with SAMPLE parameter. And row counts are same for these tables, and some was updated and some not,,,

Thanks