I have a database with several tables that have very heavy write operations. These tables are very large and some are over a hundred gigabytes. I noticed that performance of this database is getting slower, and after some investigation we suspect that the Auto Update Statistics function is causing the degradation.
Statistics are objects that contain information about the distribution of values in one or more columns of a table or indexed view. The query optimizer uses this statistical data to estimate the number of rows in a query result. If the statistics are invalid or outdated, the query optimizer is likely to choose a poor execution plan and query performance will suffer. Keeping statistics up to date is therefore an essential part of SQL Server maintenance.
If statistics updates are such a vital part of SQL Server, how can they cause performance issues? The benefits described above come at a cost: updating statistics also consumes system resources. This can lead to performance issues when SQL Server is under heavy load, especially when you are doing bulk data inserts or have continuous pressure from insert and update operations. In this case statistics are constantly being updated, keeping your server busy. You can use SQL Profiler to investigate whether the statistics updates are causing a slowdown: capture the "Auto Stats" event to see the overhead on your system.
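As a rough complement to a Profiler trace, you can also query the statistics metadata to see how recently each statistic on a busy table was rebuilt and how many rows have changed since. This is a sketch only: the table name is a placeholder, and sys.dm_db_stats_properties requires SQL Server 2008 R2 SP2 or later.

```sql
-- Show, for each statistic on a (hypothetical) large table, when it was
-- last updated and how many row modifications have accumulated since.
-- Frequent updates with large modification counters suggest auto stats
-- is firing constantly under your write load.
SELECT s.name            AS stats_name,
       sp.last_updated,
       sp.modification_counter,
       sp.rows
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.YourLargeTable');
```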
If you experience problems due to statistics automatically being updated then the solution might be to disable the Auto Update Statistics option. However, we cannot live with outdated statistics on large databases, so if you turn this option off you should create a scheduled job which explicitly updates the statistics during off-peak periods. To do this, you can use the sp_updatestats procedure.
Following are the steps to disable the "Auto Update Statistics" and to put in place a scheduled job to maintain statistics.
Disable Auto Update Statistics
Open Microsoft SQL Server Management Studio and navigate to your database using Object Explorer. Right click on the database and choose Properties. Select "Options" in the new window and set "Auto Update Statistics" to False as shown on the following screen.
Alternatively you can use the following script to disable this option:
ALTER DATABASE YourDBName
SET AUTO_UPDATE_STATISTICS OFF
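If you want to verify that the option is actually off after running the script, you can check the database metadata (1 means on, 0 means off):

```sql
-- Confirm the Auto Update Statistics setting for the database.
SELECT name, is_auto_update_stats_on
FROM sys.databases
WHERE name = 'YourDBName';
```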
Create New SQL Job
Open Microsoft SQL Server Management Studio and navigate to SQL Server Agent -> Jobs using Object Explorer. Right click Jobs and choose New Job.
Enter the job name, owner, category (Database maintenance) and description on the following screen:
Click on Steps in the left side pane. Fill in the step name and database. The command type is T-SQL script. You should enter the following command: EXEC sp_updatestats
On the Advanced tab you can choose success/failure actions. I recommend logging the job output to a file. You can specify a "run as" user, but please note that only the database owner (dbo) and members of the sysadmin role can execute this procedure.
The last task is to create a schedule to run the job during off-peak periods. How often you should update the statistics (daily, twice a day, etc.) depends on your database size, the number of changing rows, and your usage patterns. I recommend experimenting to find the optimal solution for your database. First schedule the job to run once a day at a convenient time and measure how long it runs. Then check the performance gain and consider whether more frequent updates are required.
There is also an option to update the statistics for only one table or a specified index. For this purpose you can use the UPDATE STATISTICS command. Please note that this command sets Auto Update Statistics back to ON for the affected statistics if you do not use the NORECOMPUTE option.
UPDATE STATISTICS YourTableName(YourIndex) WITH NORECOMPUTE
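For very large tables you may also want a more accurate sample than the default. As a hedged variant of the command above (table and index names are placeholders), you can combine a full scan with NORECOMPUTE:

```sql
-- Read every row when building the statistics: more accurate estimates
-- at the cost of a longer-running update, and NORECOMPUTE keeps auto
-- update from being re-enabled for this statistic.
UPDATE STATISTICS YourTableName (YourIndex) WITH FULLSCAN, NORECOMPUTE;
```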
In addition, you can turn off Auto Update Statistics for a specific table using the sp_autostats system stored procedure.
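For example, with placeholder table and index names, the call looks like this:

```sql
-- Disable automatic statistics updates for every statistic on one table...
EXEC sp_autostats 'YourTableName', 'OFF';

-- ...or for a single index only.
EXEC sp_autostats 'YourTableName', 'OFF', 'YourIndex';
```

Running sp_autostats with only the table name reports the current setting without changing it.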
I do not understand why Automatic Statistics Update is turned off.
If the job is executed every day, auto update statistics will never be activated except in special situations. And in those situations it may be better to auto update the statistics than to run queries against old statistics information.
So run the scheduled job to update statistics, but leave Auto Update Statistics set to true.
Monday, February 07, 2011 - 3:09:32 PM - Tibor Nagy
I understand that most DBAs are sure that switching off auto stats update does not make any sense, BUT my experience shows that in the described scenario it works. The heavy load means that certain statistics are getting updated, e.g. 20 minutes after the previous update finished. This means that the statistics are continuously updated, putting additional load on the server. I recommend experimenting with the explained solution in such cases.
I do not say that this tip is always useful, especially when the statistics are getting auto updated only once or twice a day.
I wrote that 'These tables are very large and some are over a hundred gigabytes.' and 'This can lead to performance issues when SQL Server is under heavy load especially when you are doing bulk data inserts or have continuous pressure from insert and update operations.'
In my everyday practice we load files larger than 100 MB, and a few GB of data are inserted during online operations every day.
My experience: almost all the queries were running slow and the overall performance of the server was poor. We saw the "Auto Stats" event all the time.
After switching off Auto Update Statistics and running manual updates we observed a performance improvement of approximately 10-30% for the most complex queries (those running for at least 10 minutes) and an overall speed-up of the server.
Later someone mistakenly enabled Auto Update Statistics and the performance was down again (the scheduled update job was still enabled). We switched it off again and the server performance was back to normal.
I would say that large OLTP environments can potentially be impacted by this issue, while a DWH with less frequent data changes might not benefit from such a change.
May I ask, from my beginner's knowledge: SQL Server creates clustered indexes for all primary keys if you do not change this, and a clustered index keeps the actual data rows in the same order. In very large databases, where inserting unordered rows forces the database engine to rearrange existing data, or where a high level of updates causes the same rearranging work, couldn't this kind of index cause significant delays?
Friday, February 25, 2011 - 9:01:21 AM - Keith Howells
The author is absolutely right. The automatic statistics update uses a sampling rate that is far too small, and you cannot override this sampling rate. On a large table the sampling rate is often well under 1%. This in turn causes the query optimizer to make bad decisions, and you only need a few such bad decisions to cripple database performance. I've spoken to Microsoft about this and they refuse to acknowledge there's a problem, so the only solution is to update statistics manually with FULLSCAN, NORECOMPUTE on a regular basis. We've found every 2 weeks is fine.
This is very late in the game but I can better answer the very first comment. Auto update stats and sp_updatestats interact with each other. If you do an sp_helptext on sp_updatestats, you will see it uses the rowmodctr column in sysindexes to find candidates. When auto update triggers on a statistic, it resets this counter, so the scheduled job then skips that statistic. If he is running both, auto update will at times fire before his scheduled maintenance and he will be left with very ugly stats for a prolonged period.
I like the sp_updatestats procedure except that you cannot control the sample size of the statistics it creates. I run a modified version of it that allows for this.
The problem I had with shutting down update stats for the whole database had to do with processes that truncate tables and reload them mid-day. At that point even small-sample statistics are better than nothing. To solve this I run scripts during maintenance routines that look for statistics with poor sample sizes and update them. So rather than using NORECOMPUTE, I allow auto update to run normally, then I scan for statistics with a poor sample size each night and fix them.
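The nightly check described above could be sketched roughly as follows. This is an assumption about the commenter's approach, not his actual script: the 10% threshold is an arbitrary example, and sys.dm_db_stats_properties requires SQL Server 2008 R2 SP2 or later.

```sql
-- Find statistics whose last build sampled less than 10% of the table's
-- rows; each hit is a candidate for a manual rebuild, e.g.:
--   UPDATE STATISTICS <table_name> (<stats_name>) WITH FULLSCAN;
SELECT OBJECT_NAME(s.object_id) AS table_name,
       s.name                   AS stats_name,
       sp.rows,
       sp.rows_sampled
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE sp.rows > 0
  AND sp.rows_sampled * 10 < sp.rows;  -- sampled under 10 percent
```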