March 16, 2009 at 12:40 pm
I was recently alerted to some timeouts occurring in our main application. I started a trace and discovered a stored proc that was causing the problem. None of the indexes on the related tables had fragmentation above 3%, so I started looking at statistics.
I updated stats on one table at a time as follows:
UPDATE STATISTICS myTableName WITH ALL, FULLSCAN
After each update, I would try to reproduce the timeout error. It was after updating stats on the 4th table that the timeout error went away. I theorized that a large number of records were involved in some DML operations sometime after stats had last been updated for the table. Here is what I found:
Total number of records in table: 1.487 million
DateTime of last stats update before the timeouts: 03/16/2008 6:00am
Number of records inserted/updated since last stats update: 176
Number of records deleted since last stats update: 0
So... DML operations on roughly a hundredth of a percent of the table (176 of 1.487 million rows) caused the statistics to be out of date. Is this a fair assessment? If statistics are out of date and causing poor performance, how would I know? Right now, the only indication is the influx of phone calls from the end users.
Ideally, I'd like to have a SQL job run periodically throughout the day and update stats on an as-needed basis.
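To be concrete, this is roughly the check I'd want that job to make (just a sketch against the sysindexes compatibility view; dbo.myTableName is the same placeholder as above, and rowmodctr is only an approximate counter):

-- Sketch: last statistics update and approximate modifications since, for one table
SELECT name AS [Index/Stats Name],
       STATS_DATE(id, indid) AS [Stats Last Updated],
       rowcnt AS [Total Rows],
       rowmodctr AS [Rows Modified Since Last Update]
FROM sys.sysindexes
WHERE id = OBJECT_ID('dbo.myTableName') -- placeholder table name from above
  AND indid IN (0, 1);                  -- heap or clustered index row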
March 16, 2009 at 1:07 pm
A possible sign that your statistics are out of date is if the actual and estimated row counts in the execution plans are very different.
I expect that you know, but in case you don't, there is an Auto Update statistics option for each of your databases which should normally be set to True.
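If you want to see those counts without opening the graphical plan, one quick way (just a sketch; dbo.MySuspectProc is a stand-in for whatever your trace pointed at) is:

-- Sketch: return per-operator actual vs. estimated row counts as a result set
SET STATISTICS PROFILE ON;
EXEC dbo.MySuspectProc;   -- stand-in name; run the statement you are investigating
SET STATISTICS PROFILE OFF;
-- Compare the Rows (actual) and EstimateRows (estimated) columns in the extra result set.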
David
March 16, 2009 at 1:51 pm
David (3/16/2009)
A possible sign that your statistics are out of date is if the actual and estimated row counts in the execution plans are very different. I expect that you know, but in case you don't, there is an Auto Update statistics option for each of your databases which should normally be set to True.
David
Comparing actual and estimated row counts is fine when I'm sitting at my desk using SSMS. I need a solution that can be automated, though...
Yes, Auto Update Statistics is enabled/true for the database. (It's done asynchronously on the db.)
March 16, 2009 at 3:55 pm
We've found that SQL2k5 seems to be more sensitive to variations in statistics, in that bad (i.e. slow) query plans are more likely to be used than when the statistics are "fresh".
The system in question has very skewed data (first element in clustered index has 2,000 different values but 60% of rows have one value and over 99% have one of 4 values) and fields in all indices, including the clustered index, are updated relatively frequently. In the largest tables (there are around a dozen tables of 20-400 million rows) the number of rows increases by only ~0.5% per month and ~0.2% of rows in the clustered indices are updated, but if the stats are not updated weekly then by the third or fourth week the performance starts to degrade significantly (NB. they're vendor supplied indices that we're contractually not allowed to change).
The update stats job that should already be in your maintenance plan is critical to the performance of your applications, possibly more so than the rebuild indices job, and if you're not running it at least weekly then I'd make that change as soon as practical (more often is probably not necessary, depending on the system).
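If the plan doesn't have such a step yet, the simplest form of it is something like this (a sketch only; MyDatabase is a placeholder, and sp_updatestats uses default sampling rather than a full scan, so you may still want an explicit UPDATE STATISTICS ... WITH FULLSCAN pass for the critical tables):

-- Sketch of a scheduled job step: refresh statistics database-wide
USE MyDatabase;        -- placeholder database name
EXEC sp_updatestats;   -- only touches statistics whose modification counters show changes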
March 16, 2009 at 5:46 pm
Do not turn Auto Update Stats ON; it does affect your database performance. I would prefer running a job to update statistics now and then, after analyzing the data.
March 17, 2009 at 10:18 am
The update stats job that should already be in your maintenance plan is critical to the performance of your applications, possibly more so than the rebuild indices job
This has always been my experience as well!
* Noel
March 17, 2009 at 11:19 am
The update stats job that should already be in your maintenance plan is critical to the performance of your applications, possibly more so than the rebuild indices job, and if you're not running it at least weekly then I'd make that change as soon as practical (more often is probably not necessary, depending on the system).
I already have a SQL job that updates stats and rebuilds indexes once per day. No worries there.
What I really need is to be able to programmatically determine which particular statistics are out of date. Then I can automate the process of updating stats as needed throughout the day.
In my production environment, updating stats is sometimes needed more than once per day. Obviously, I could schedule my SQL job to run multiple times throughout the day. But I don't want to take a performance hit for unnecessary rebuilds.
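Roughly the shape of what I'm after is something like this (only a sketch; the 5% cutoff is an arbitrary number I'd still have to tune, and rowmodctr is only an approximate counter):

-- Sketch: update statistics only on tables whose modification counter exceeds a threshold
DECLARE @sql NVARCHAR(4000);

DECLARE stats_cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT 'UPDATE STATISTICS ' + QUOTENAME(SCHEMA_NAME(t.schema_id))
           + '.' + QUOTENAME(t.name) + ' WITH FULLSCAN'
    FROM sys.tables t
    INNER JOIN sys.sysindexes i ON i.id = t.object_id
    WHERE i.indid IN (0, 1)                        -- heap or clustered index row
      AND i.rowcnt > 0
      AND i.rowmodctr * 100.0 / i.rowcnt > 5.0;    -- arbitrary 5% threshold

OPEN stats_cur;
FETCH NEXT FROM stats_cur INTO @sql;
WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC sp_executesql @sql;
    FETCH NEXT FROM stats_cur INTO @sql;
END
CLOSE stats_cur;
DEALLOCATE stats_cur;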
March 17, 2009 at 12:09 pm
Krishna (3/16/2009)
Do not put Autoupdate stats ON, it does affect your database performance. I would prefer running a job to update statistics now and then after analyzing the data.
AFAIK, disabling AutoUpdate of Statistics is bad advice left over from the SQL 6.5/7.0 days;
up-to-date statistics are critical, and AutoUpdate=true actually has a minimal impact in 2000/2005.
Statistics get auto-updated when roughly 20 percent of the rows in the table have been inserted/updated... 20% of a million-row table takes a long time to reach, and that's why updating statistics on big tables more often than autoupdate would is often beneficial.
Lowell
March 17, 2009 at 3:40 pm
Hi Dave,
Can you tell me how you get the information about data inserted/updated/deleted since the last statistics update? Is there any DMV or table to look at?
March 18, 2009 at 6:43 am
Joseph (3/17/2009)
Hi Dave, can you tell me how you get the information about data inserted/updated/deleted since the last statistics update? Is there any DMV or table to look at?
Hi Joseph,
•Deletes are easy: there are none. (Each table has a [Deleted] BIT column).
•Each table has a LastChangedDate DATETIME column (it defaults to GETDATE()). Every time the table is updated, LastChangedDate is updated to GETDATE(). In addition, each table has an update trigger that inserts the entirety of the logical deleted table into a corresponding history table. For example, table dbo.MyAppTable. It has an update trigger that inserts into dbo.MyAppTableHistory. Collectively, these mechanisms allow me to determine the number of inserts/updates since the last stats update.
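In query form it is roughly this (a sketch; I'm using STATS_DATE against the table's heap/clustered index row as the "last stats update" point of comparison):

-- Sketch: rows inserted/updated in dbo.MyAppTable since its statistics were last updated
DECLARE @LastStatsUpdate DATETIME;

SELECT @LastStatsUpdate = STATS_DATE(id, indid)
FROM sys.sysindexes
WHERE id = OBJECT_ID('dbo.MyAppTable')
  AND indid IN (0, 1);   -- heap or clustered index row

SELECT COUNT(*) AS [Inserted or Updated Since Stats Update]
FROM dbo.MyAppTable
WHERE LastChangedDate > @LastStatsUpdate;   -- stamped with GETDATE() on every insert/update

-- Deletes: nothing to count, since rows are only ever flagged via the [Deleted] bit column.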
March 18, 2009 at 7:25 am
OK, I put together a script that I thought would identify statistics that might need to be updated...
but when I run the results of the script, a few of the tables remain in the next pass of the script, usually because the # rows modified is the same as the # total rows.
Results look something like this:
[font="Courier New"]Db SN TABLE Name Mod Tot %Mod STATISTICS SQL
KHC dbo CENSUSTRACTS 65693 65693 100.00 UPDATE STATISTICS dbo.CENSUSTRACTS --100% No real Stats Rule
KHC dbo CMENTITY 2 1740 0.11
KHC dbo CMENTITY 3 1740 0.17
KHC dbo Foreign_Key_Table 1 1 100.00 UPDATE STATISTICS dbo.Foreign_Key_Table --20% Small Table Rule
KHC dbo GMAACCP_OLD 29 10113 0.29
KHC dbo GMAACCP_OLD 58 10113 0.57
KHC dbo GMACTATTRIBUTES 4427 4292 100.00 UPDATE STATISTICS dbo.GMACTATTRIBUTES --100% No real Stats Rule
KHC dbo GMACTMEASURES_OLD 10636 5318 100.00 UPDATE STATISTICS dbo.GMACTMEASURES_OLD --100% No real Stats Rule[/font]
Here's my script as it stands:
--The Analysis: my crappy assumptions:
--tables under 1000 rows, I'll leave them at 20%
--tables with more rows than that, I'll use an arbitrary sliding scale formula.
--formula to be modified based on analysis
SELECT X.*,
ISNULL(CASE
WHEN X.[Total Rows]<=1000
THEN
CASE
WHEN [Percent Modified] >=20.0
THEN 'UPDATE STATISTICS ' + [Schema Name] + '.' + [Table Name] + ' --20% Small Table Rule'
END
WHEN [Percent Modified] =100.00
THEN 'UPDATE STATISTICS ' + [Schema Name] + '.' + [Table Name] + ' --100% No real Stats Rule'
--WHEN X.[Rows Modified] > 1000
--THEN 'UPDATE STATISTICS ' + [Schema Name] + '.' + [Table Name] + ' --1000 Rows Modified Rule'
ELSE
CASE
WHEN X.[Total Rows] > 1000000000 --billion rows
THEN CASE
WHEN [Percent Modified] > 0.1
THEN 'UPDATE STATISTICS ' + [Schema Name] + '.' + [Table Name] + ' -- 1B Big Table Rule'
END
WHEN X.[Total Rows] > 100000000 --hundred million rows
THEN CASE
WHEN [Percent Modified] > 1.0
THEN 'UPDATE STATISTICS ' + [Schema Name] + '.' + [Table Name] + ' -- 100M Big Table Rule'
END
WHEN X.[Total Rows] > 10000000 --ten million rows
THEN CASE
WHEN [Percent Modified] > 2.0
THEN 'UPDATE STATISTICS ' + [Schema Name] + '.' + [Table Name] + ' -- 10M Big Table Rule'
END
WHEN X.[Total Rows] > 1000000 --million rows
THEN CASE
WHEN [Percent Modified] > 5.0
THEN 'UPDATE STATISTICS ' + [Schema Name] + '.' + [Table Name] + ' -- 1M Big Table Rule'
END
WHEN X.[Total Rows] > 100000 --hundred thousand rows
THEN CASE
WHEN [Percent Modified] > 10.0
THEN 'UPDATE STATISTICS ' + [Schema Name] + '.' + [Table Name] + ' -- 100K Big Table Rule'
END
WHEN X.[Total Rows] > 10000 --ten thousand rows
THEN CASE
WHEN [Percent Modified] > 20.0
THEN 'UPDATE STATISTICS ' + [Schema Name] + '.' + [Table Name] + ' -- 10K Big Table Rule'
END
END
END,'') AS [Statistics SQL]
FROM (
SELECT DISTINCT
DB_NAME() AS [Database],
S.name AS [Schema Name],
T.name AS [Table Name],
I.rowmodctr AS [Rows Modified],
P.rows AS [Total Rows],
CASE
WHEN I.rowmodctr > P.rows
THEN 100
ELSE CONVERT(decimal(8,2),((I.rowmodctr * 1.0) / P.rows * 1.) * 100.0)
END AS [Percent Modified]
FROM
sys.partitions P
INNER JOIN sys.tables T ON P.object_Id = T.object_id
INNER JOIN sys.schemas S ON T.schema_id = S.schema_id
INNER JOIN sysindexes I ON P.object_id = I.id AND P.index_id = I.indid -- match the heap/clustered index row so each table shows up only once
WHERE P.index_id in (0,1)
AND I.rowmodctr > 0
) X
Lowell
March 18, 2009 at 8:38 am
I think we can get the combined number of inserts/deletes/updates from the 'rowmodctr' column of the sys.sysindexes view.