September 5, 2010 at 6:25 am
I have a SQL 2005 database with tables that gets wiped every night (some tables TRUNCATE'd, others DELETE with no WHERE).
Then several of the tables are (for the most part) populated by INSERT INTO table SELECT columns FROM linked_server_table. Others are built using logic in stored procs.
After the data is loaded, the tables are read-only, used by reports, for lookups by apps, etc.
From a performance point of view, am I better off, or is it best practice to:
1) Leave "auto update statistics" on and let SQL update the stats sometime after the data is loaded
2) Turn "auto update statistics" off and run UPDATE STATISTICS ... WITH FULLSCAN on all the tables after the load
3) Both 1 & 2 (leave "auto update stats" on AND do UPDATE STATISTICS on all the tables after the load
September 6, 2010 at 4:19 am
I don't know how big your tables are, but unless you have asynchronous updates of stats turned on then is is possible that you will encounter delays and possibly timeouts when the tables are queried for the first time (and the stats consequently auto-updated) after the data manipulation. I would therefore recommend manual stats updates in this scenario.
Turn the auto stats update off and perform an UPDATE STATISTICS...WITH FULLSCAN after the load, assuming that you have time in your maintenance window to perform a full stats update. Leave the auto creation of stats turned on.
When making this kind of configuration change to a database, you may find it helpful to add an Extended Property to the database documenting the reason for the change. This way anyone querying the decision in your absence (and with permission to view the database's Extended Properties) can see the justification for the decision.
Chris
September 6, 2010 at 4:44 am
Chris Howarth-536003 (9/6/2010)
Turn the auto stats update off and perform an UPDATE STATISTICS...WITH FULLSCAN after the load, assuming that you have time in your maintenance window to perform a full stats update. Leave the auto creation of stats turned on.Chris
agree with chris ,but look for heavy tables ,you can go for partial statistics update for those tables.
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
September 6, 2010 at 8:15 am
For almost all ETL scenarios where you wipe the entire table or affect a reasonable fraction of rows it is best to disable auto update/auto create, do the load, then manually refresh stats. FULLSCAN obviously gets you the best stats, but may not be possible on larger tables within the load window.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
September 6, 2010 at 11:04 am
As these settings are at the database level you should turn them on again after the load if you have other tables outside of the ETL process.
Only true way to tell if it helps is to try it. you could run a server trace alongside it checking for update stats events.
---------------------------------------------------------------------
September 6, 2010 at 6:03 pm
mannaggia (9/5/2010)
I have a SQL 2005 database with tables that gets wiped every night (some tables TRUNCATE'd, others DELETE with no WHERE).Then several of the tables are (for the most part) populated by INSERT INTO table SELECT columns FROM linked_server_table. Others are built using logic in stored procs.
After the data is loaded, the tables are read-only, used by reports, for lookups by apps, etc.
From a performance point of view, am I better off, or is it best practice to:
1) Leave "auto update statistics" on and let SQL update the stats sometime after the data is loaded
2) Turn "auto update statistics" off and run UPDATE STATISTICS ... WITH FULLSCAN on all the tables after the load
3) Both 1 & 2 (leave "auto update stats" on AND do UPDATE STATISTICS on all the tables after the load
Most of the time ETL processes are somehow consistent in terms of data volume and data distribution.
If this is the case I'll probably try option #4 which is ...
a) turn auto update statistics off at table level.
b) do nothing else
The rationale behind this alternative is: Statistics are there to help query optimizer to generate a good execution plan so, assuming current execution plans are good just leave the old stats there.
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply