SQL Server Data Mining with Office 2007 to Predict Database Growth Part 1

By:   |   Comments (2)   |   Related: More > Database Administration


Problem

As part of your DBA duties, you should be analyzing the size of your database so you can prepare for growth.  This may be for new purchases of disk sub-systems, consolidation, pre-allocating enough space for your database to minimize fragmentation or maybe to maximize performance.  With all of the necessary tasks that need to be done on a daily basis very few DBAs have the time to be proactive.  In this tip we will look at how you can use some of the SQL Server data mining tools to help you predict database growth.

Solution

Previously, I used the Trend functions in Excel to predict growth patterns and needs. This works OK, but it required extra work in Excel to prepare and show the data.  I found a better and easier solution to predict database growth by using the Data Mining Add-Ins for Office 2007 and the results are great.

Pre-requisite

Before you get started here are a few things that you will need to have in place to make this work in your environment.

(1) Collect and Load Data

Now that you have the components installed we need to prepare some test data. 

You can either use this tip to gather data Collecting SQL Server database usage information for free space and trending and follow the steps below or you can use this sample Excel Workbook that you can download.

Manually Gather Data

After you have collected the data you can run a query in SSMS to pull the data.  For this example I am only using TotalSizeMB and InsertDate.  If you right click in the upper left cell it will select all data.

grid cells

Then right click in this same cell and you will see the option "Copy with Headers" if you are using SQL 2008 SSMS. If you are not use SQL 2008 then you will need to pull in the column headers with the data.

copy with headers

Then open Excel 2007 and paste this data so you have a sheet like the following.

excel total size

(2) Format

Now that you have the data, the first thing that you need to do is to format the data using the "Format as Table" option. Click any data range (I usually just click "A1") and click the "Format as Table" tool button in the "Home" ribbon as shown below.

excel format table

 It should automatically preselect the range of data like below. If not, you will need to do this manually so all data is selected.

format as table

You can choose any format option when you do this .  I just used the first format option.  It doesn't really matter what you use.

(3) Setup Forecast

After the above step is done you will see a new menu item "Analyze" as shown below.  Under this menu you will see  an option "Forecast" which is also highlighted below.

excel forecast

Click on Forecast and follow these steps.

  1. Choose "TotalSizeMB" for the columns you want to forecast
  2. Change the "Number of time units to forecast:" to days that you want to see. I choose 120 and that means it will predict the database size for next 120 days.
  3. Choose "Time stamp" column for your input for time which should be "InsertDate"
  4. Optional - choose "Periodicity of data". I get better results by choosing to detect automatically. If your data is absolutely clean you can choose one of the other options. But for this example I will let Excel choose.
forecast total size

(4) Run Forecast

At this point click "Run" to start the forecast and you will see the prediction for 120 days as shown below.

forecasting report

As you may have noticed there are some missing date labels. So, there is one more step that you can do to clean this up.

(5) Touch Up

In order to fill in the missing labels, click the "Sheet1" sheet (Original Sheet) and go to the very end of the data (cell B689 cell in the Sample Excel worksheet). This is the new data that was predicted.

To fill in the missing dates select from cell B687 to the end of the missing data in column B and click the "fill" button and choose "Series" as shown below.

sample size excel

Click the "Trend" check box and click "OK"

trend click

Excel will then fill in the missing dates based on the other data as shown.

insert date

(6) Final Report

You can change the Title and other cosmetic things if you want, but basically that is it. You can move your mouse on the graph line to see the actual numbers for presentation purposes or go back to the original data sheet to get the numbers. Enjoy!

total size MB
Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Kun Lee Kun Lee is a database administrator and his areas of interest are database administration, architecture, data modeling and development.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Thursday, June 7, 2012 - 5:55:26 PM - james liu Back To Top (17862)

I am using a stand-alone pc with all the data on my pc and no connection to a database, what do I need from the above 2nd pre-requisite: SQL Server Analysis Service 2008 + Data Mining Add-Ins for Office 2007 (SQL 2008)  or  SQL Server Analysis Service 2005 + Data Mining Add-Ins for Office 2007 (SQL 2005)?


 


Thursday, April 15, 2010 - 4:11:39 PM - luis lozano Back To Top (5250)
Good afternoon. I want to do this same process but in vb.net, that statistic could Use?, in advance thank you very much for your cooperation.














get free sql tips
agree to terms