Every system has it, do not feel like you are the only one. Step up to the plate and correct it. That is large amounts of data that were once needed for the business that can now be archived. This data could be a large performance issue, because you are not able to perform maintenance on the table(s) because the time needed exceeds your maintenance window. As a DBA or developer just because you do not think the data is needed may not mean that it can just be deleted. Many industries have legal requirements to maintain the data. On the other side of the coin though, just because a business person says the data is needed does not mean it has to be kept in the primary OLTP database that your entire company uses. Seek out the facts and determine which strategy is right for your organization.
When it comes to archiving data, many options are available, but some preliminary items need to be addressed. These items include:
- Research the legal requirements - Make sure you understand your industry requirements, some have no data retention requirements while others require 7 years. Make sure you understand your requirements first and then design a solution.
- Data analysis - Review the data in your application to determine the key tables and dependencies between the tables when archiving. Be sure to also capture the distribution of data on a monthly, quarterly or yearly basis to understand how much data can be archived.
- Understand the data usage - Meet with the users to understand how they use the potentially archived data and if it is archived how the applications will need to change. If you are not responsible for the application development, ensure you meet with the development team to ensure the changes are feasible.
- Provide a business case - Justification for archiving the data is essential when taking on this project. These items are typically attractive to business and IT management:
- Automate the data archiving process - Setup rules with the users to be able to automate the archive process so it occurs regularly on a weekly, monthly or quarterly basis
Once you understand how the data can be archived, now the technical decisions need to be made. Depending on the answers to the questions above, one or a combination of the following approaches can be taken:
- If the data needs to reside in the same database:
- Move (insert and delete) the archive data to separate table(s) with a prefix to denote the tables are archive related such as 'arc_'
- Create the table(s) in separate filegroups on separate disks to improve IO performance
- Use a view to join the old and new data if the users continue to need to access the data
- Modify the front end code to use the view rather than the base table
- If the data does not need to reside in the same database:
- Move (insert and delete) the archive data to separate table(s) in another database on the server or to a separate database on another server
- Have users request access to the data for specific queries or change the application to use a linked server to access the archived data
- If the data is just not needed:
- Back up the data, verify the backup is restorable, note the date of the last backup with the archived data and just delete the data
- If the data is needed due to some unforeseen reason by the users, restore the noted backups to access the data
- Automation options
- Based on the application find out if the vendor has already implemented an archiving feature that just needed to be enabled.
- If an automated option is not available, build a process with DTS or SSIS to move the data from the source system to the archive system.
- If significant amounts of data are changed, be sure to perform SQL Server maintenance as a portion of the archive process.
- Evaluate your systems to determine candidates for data archiving. If you have some systems that are large in your opinion and have never been archived, consider the benefits of archiving the data.
- Be sure to test the archive process to ensure sufficient time is available to archive your data.
- If you move to an automated process, be sure to check the process after each execution or include notification messages based on execution, failure and/or success.
- If your users are unsure about the archive process, play it safe and just try to move the data to another set of tables first and then perhaps to another system.
- If you are overly cautious and your users just want to delete the data, consider moving the data to another database for 1 to 3 months before permanently deleting the data.
- Check out these related tips on MSSQLTips.com:
Last Update: 11/28/2006
About the author
View all my tips