Script to Delete SQL Server Data in Batches

By: Alejandro Cobar   |   Updated: 2020-03-17   |   Comments (14)   |   Related: More > TSQL


Sometimes in large SQL Server OLTP systems, tables can grow very big over time. When that happens, queries against those tables may begin to suffer performance issues because the tables were never designed to host that volume of data. Even with the correct indexes, performance might not be as good as expected, and you might be forced to give some extra thought to purging old data.


In this tip I’m going to present a few T-SQL scripts that can help you copy information to a historical table (in case historical information is required in the future, but not in the main table) and set into motion a loop that deletes records in batches (of a size you specify) between a given start date and end date.

During the process, you can use another T-SQL script (provided in this solution) to constantly monitor the progress of the deletion. Perhaps you need to know how much time it takes to delete a day of information, or simply how long it takes to delete X amount of records, along with the overall progress.

Code Explained

The process will be handled through a stored procedure, so it can fit any case you might have.

  • I will use tempdb as my chosen database to work with through the entire demonstration.
  • I have created a test table called "big_table" and have populated it with 500,000 rows distributed across 5 days.
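A minimal sketch of that setup follows; only the table and column names are taken from the demo (the population method shown here is my own, using a row-number trick against system views to generate the rows):

```sql
USE tempdb;
GO

-- Demo table; the column names match the stored procedure call below
CREATE TABLE dbo.big_table(
   id           INT IDENTITY(1,1) PRIMARY KEY,
   payload      VARCHAR(50)  NOT NULL,
   created_date DATETIME     NOT NULL
);

-- 500,000 rows spread evenly across 5 days (2015-01-01 through 2015-01-05)
;WITH n AS (
   SELECT TOP (500000)
          ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS rn
   FROM sys.all_objects a CROSS JOIN sys.all_objects b
)
INSERT INTO dbo.big_table(payload, created_date)
SELECT 'row #' + CONVERT(VARCHAR(10), rn),
       DATEADD(DAY, rn % 5, '2015-01-01')
FROM n;
```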

Here’s how you call the stored procedure:

EXECUTE [dbo].[Batch_Delete]    @startDate          = '2015-01-01'
  ,@endDate            = '2015-01-06'
  ,@dbName             = 'tempdb'
  ,@schemaName         = 'dbo'
  ,@tableName          = 'big_table'
  ,@dateFieldName      = 'created_date'
  ,@saveToHistoryTable = 1
  ,@batch              = 1000

The names of the parameters are pretty much self-explanatory, but here’s their purpose:

  • @startDate: The start date from when to start working with the records.
  • @endDate: A non-inclusive date to limit which records will be taken into account. In my example, it means that I’m only going to be working with records where date >= '2015-01-01' and date < '2015-01-06'.
  • @dbName: The name of the database that hosts the table that you want to work with. Take into account that the historic and metrics tables are created in this same database.
    • If you’d like to have these 2 tables in separate databases, a few tweaks would have to be made to the stored procedure, and an additional parameter would have to be passed to specify the name of the other database.
  • @schemaName: The name of the schema of the table.
  • @tableName: The name of the table to be processed.
  • @dateFieldName: The name of the column that contains the date values to work with. Remember that this logic is written for a date-based approach; if you want a different one, you’d have to adjust the stored procedure to your particular case.
  • @saveToHistoryTable: If 1, an identical empty table is created based on @tableName, with "_historic" appended to its name to distinguish it. If 0, no data movement will be performed during the execution of the script (use it very carefully).
  • @batch: The number of records to delete per iteration of the delete loop (1000 by default).

The stored procedure contains a set of initial validations to make sure that you have entered the correct information to proceed.

Here are all the validations considered:

  • If @startDate is greater than or equal to @endDate, then notify the user.
  • If the @dbName isn’t specified or it is left blank then notify the user.
  • If the @schemaName isn’t specified or it is left blank then notify the user.
  • If the @tableName isn’t specified or it is left blank then notify the user.
  • If the @dateFieldName isn’t specified or it is left blank then notify the user.
  • If the table targeted by @dbName.@schemaName.@tableName doesn’t exist, then notify the user.
  • If the table exists but the field @dateFieldName doesn’t, then notify the user.
  • If the parameter @saveToHistoryTable is set to 0 (1 being the default), a warning will be shown to the user.
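The last two existence checks rely on the built-in OBJECT_ID and COL_LENGTH functions, which both return NULL when the object or column can't be found. A quick sketch of the idea, using the demo table names:

```sql
-- NULL means the table doesn't exist as a user table ('U')
SELECT OBJECT_ID('[tempdb].[dbo].[big_table]', 'U') AS table_object_id;

-- NULL means the column doesn't exist in the table
SELECT COL_LENGTH('[tempdb].[dbo].[big_table]', 'created_date') AS column_size;
```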

Below are some screenshots as an example.

In this example, the table "big_table_X" does not exist, so we get an error.


In this example, the column "create_date2" does not exist, so we get an error.


T-SQL Code for Stored Procedure Creation

At a high-level, this is what the stored procedure does:

  1. If the stored procedure already exists, delete the stored procedure and create it from scratch.
  2. All the set validations described above are performed initially and the execution stops if any of them fails.
  3. If all the validations pass, then create the "Delete_Metrics" table to keep track of how much time (in seconds) it takes for each day to be wiped out from the main table.
    • A non-clustered index is placed on the StartDate and EndDate, in case you end up with a lot of records and want to speed up your lookups. If you think there are not that many, then you can consider removing this.
  4. If the @saveToHistoryTable is set to 1, then all the records that will be affected are copied to a newly created table that has the exact same structure as the original table (without constraints and indexes), and will have the name "_historic" added to it.
  5. After all the records have been saved to the history table, a while loop that goes from @startDate to @endDate-1 will kick in, and the processing will take place inside a transaction. Why? Because if anything goes wrong at any given time, there is a chance to roll back.
  6. Per day, the deletes will take place by batches specified by the parameter @batch, so use it and test it carefully.
  7. After all the records for a given day have been deleted, then the respective record is inserted in the Delete_Metrics table, to show how many seconds it took to delete X amount of records.
  8. As a bonus, I included a piece of code that constantly displays (after each batch of records is deleted) the current size of the transaction log (in MB) of the database where the activity is taking place, along with the free space in the file (in MB).

With all this information, not only can you create a baseline of delete operations for daily records, but you also can know the impact this will have on the transaction log file.
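Stripped of the dynamic SQL and metrics logic, the core delete loop in steps 5 through 7 boils down to a pattern like this (a simplified sketch against the demo table for a single day, not the exact code the procedure builds):

```sql
DECLARE @batch INT = 1000;
DECLARE @rows  INT = 1;

-- Keep deleting batches for a single day until nothing is left
WHILE @rows > 0
BEGIN
   BEGIN TRANSACTION;

   DELETE TOP (@batch)
   FROM dbo.big_table
   WHERE created_date >= '2015-01-01'
     AND created_date <  '2015-01-02';

   SET @rows = @@ROWCOUNT;

   COMMIT TRANSACTION;
END
```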

Here is the entire script.

IF EXISTS(
   SELECT type_desc, type
   FROM sys.procedures WITH(NOLOCK)
   WHERE NAME = 'Batch_Delete'
         AND type = 'P'
)
   DROP PROCEDURE [dbo].[Batch_Delete]
GO

CREATE PROCEDURE [dbo].[Batch_Delete]
   @startDate          DATE,
   @endDate            DATE,
   @dbName             VARCHAR(64) = NULL,
   @schemaName         VARCHAR(64) = NULL,
   @tableName          VARCHAR(64) = NULL,
   @dateFieldName      VARCHAR(64) = NULL,
   @saveToHistoryTable BIT = 1,
   @batch              INT = 1000
AS
BEGIN
   SET NOCOUNT ON

   DECLARE @tableExists BIT = 0
   DECLARE @fieldExists BIT = 0
   DECLARE @sqlCommand NVARCHAR(2048)

   -- Initial validations
   IF(@startDate >= @endDate)
   BEGIN
      RAISERROR('startDate can''t be higher or equal than endDate!!!', 16, -1)
      RETURN
   END
   IF(@dbName IS NULL OR TRIM(@dbName) = '')
   BEGIN
      RAISERROR('You must specify the source database where the table is hosted!!!', 16, -1)
      RETURN
   END
   IF(@schemaName IS NULL OR TRIM(@schemaName) = '')
   BEGIN
      RAISERROR('You must specify the schema of the table!!!', 16, -1)
      RETURN
   END
   IF(@tableName IS NULL OR TRIM(@tableName) = '')
   BEGIN
      RAISERROR('You must specify the name of the table!!!', 16, -1)
      RETURN
   END
   IF(@dateFieldName IS NULL OR TRIM(@dateFieldName) = '')
   BEGIN
      RAISERROR('You must specify the name of the column that contains the dates for the lookups!!!', 16, -1)
      RETURN
   END

   -- Confirm that the target table exists
   DECLARE @e AS TABLE([objectID] BIGINT)
   SET @sqlCommand = '
   DECLARE @objectID BIGINT = 0
   SELECT @objectID = OBJECT_ID('+CHAR(39)+'['+@dbName+'].['+@schemaName+'].['+@tableName+']'+CHAR(39)+',''U'')
   SELECT ISNULL(@objectID,-1)'
   INSERT INTO @e
   EXEC sp_executesql @sqlCommand
   SET @tableExists = (SELECT CASE [objectID] WHEN -1 THEN 0 ELSE 1 END FROM @e)
   IF(@tableExists <> 1)
   BEGIN
      RAISERROR('The specified table can''t be located, please check and try again!!!', 16, -1)
      RETURN
   END

   -- Confirm that the date column exists in the target table
   DECLARE @f AS TABLE([size] INT)
   SET @sqlCommand = '
   DECLARE @colSize INT
   SELECT @colSize = COL_LENGTH('+CHAR(39)+'['+@dbName+'].['+@schemaName+'].['+@tableName+']'+CHAR(39)+','+CHAR(39)+@dateFieldName+CHAR(39)+')
   SELECT ISNULL(@colSize,-1)'
   INSERT INTO @f
   EXEC sp_executesql @sqlCommand
   SET @fieldExists = (SELECT CASE [size] WHEN -1 THEN 0 ELSE 1 END FROM @f)
   IF(@fieldExists = 0)
   BEGIN
      RAISERROR('The specified field can''t be located, please check and try again!!!', 16, -1)
      RETURN
   END

   IF(@saveToHistoryTable = 0)
      PRINT 'Be aware that you have invoked the execution of this SP with historical data transfer turned off!!!'

   -- Per Day logic
   DECLARE @currentDate DATE
   DECLARE @startTime   DATETIME
   DECLARE @rows        INT
   DECLARE @totalRows   INT
   DECLARE @deletedRows INT
   SET @currentDate = @startDate

   -- Create the metrics table if it doesn't exist yet
   SET @sqlCommand = '
   USE ['+@dbName+']
   IF OBJECT_ID(''Delete_Metrics'',''U'') IS NULL
   BEGIN
      CREATE TABLE Delete_Metrics(
         StartDate      DATE NOT NULL,
         EndDate        DATE NOT NULL,
         Records        INT NOT NULL,
         CompletionTime INT NOT NULL
      )
      CREATE NONCLUSTERED INDEX IX_StartDate ON Delete_Metrics(StartDate)
      CREATE NONCLUSTERED INDEX IX_EndDate ON Delete_Metrics(EndDate)
   END'
   EXEC sp_executesql @sqlCommand

   IF(@saveToHistoryTable = 1)
   BEGIN
      DECLARE @h AS TABLE([rows] INT)
      SET @sqlCommand = '
      IF OBJECT_ID('+CHAR(39)+'['+@dbName+'].['+@schemaName+'].['+@tableName+'_historic]'+CHAR(39)+',''U'') IS NULL
         SELECT TOP 0 * INTO ['+@dbName+'].['+@schemaName+'].['+@tableName+'_historic] FROM ['+@dbName+'].['+@schemaName+'].['+@tableName+']
      INSERT INTO ['+@dbName+'].['+@schemaName+'].['+@tableName+'_historic]
      SELECT * FROM ['+@dbName+'].['+@schemaName+'].['+@tableName+'] WHERE ['+@dateFieldName+'] >= '+CHAR(39)+CONVERT(VARCHAR(20),@startDate)+CHAR(39)+' AND ['+@dateFieldName+'] < '+CHAR(39)+CONVERT(VARCHAR(20),@endDate)+CHAR(39)+'
      SELECT @@ROWCOUNT'
      INSERT INTO @h
      EXEC sp_executesql @sqlCommand
      SET @totalRows = (SELECT [rows] FROM @h)
      IF(@totalRows > 0)
         RAISERROR ('#Finished transferring records to historic table#', 0, 1) WITH NOWAIT
   END

   WHILE(@currentDate < @endDate)
   BEGIN
      BEGIN TRY
         BEGIN TRANSACTION

         -- Count the records to be deleted for the current day
         DECLARE @t AS TABLE([rows] INT)
         SET @sqlCommand = '
         DECLARE @tempTotalRows INT = 0
         SELECT @tempTotalRows = COUNT(*) FROM ['+@dbName+'].['+@schemaName+'].['+@tableName+'] WHERE ['+@dateFieldName+'] >= '+CHAR(39)+CONVERT(VARCHAR(20),@currentDate)+CHAR(39)+' AND ['+@dateFieldName+'] < DATEADD(DAY,1,'+CHAR(39)+CONVERT(VARCHAR(20),@currentDate)+CHAR(39)+')
         SELECT @tempTotalRows'
         INSERT INTO @t
         EXEC sp_executesql @sqlCommand
         SET @totalRows = (SELECT [rows] FROM @t)
         DELETE FROM @t

         SET @deletedRows = 0
         SET @startTime = GETDATE()
         DECLARE @d AS TABLE([rows] INT)

         WHILE @deletedRows < @totalRows
         BEGIN
            SET @sqlCommand = '
            DELETE TOP ('+CONVERT(VARCHAR(16),@batch)+')
            FROM ['+@dbName+'].['+@schemaName+'].['+@tableName+'] WHERE ['+@dateFieldName+'] >= '+CHAR(39)+CONVERT(VARCHAR(20),@currentDate)+CHAR(39)+' AND ['+@dateFieldName+'] < DATEADD(DAY,1,'+CHAR(39)+CONVERT(VARCHAR(20),@currentDate)+CHAR(39)+')
            SELECT @@ROWCOUNT'
            INSERT INTO @d
            EXEC sp_executesql @sqlCommand
            SET @deletedRows += (SELECT [rows] FROM @d)
            DELETE FROM @d

            -- Show the current size and free space of the transaction log (in MB)
            SELECT l.total_size AS TotalSize, f.free_space AS FreeSpace
            FROM (
               SELECT CONVERT(DECIMAL(10,2),(total_log_size_in_bytes - used_log_space_in_bytes)/1024.0/1024.0) AS [free_space]
               FROM sys.dm_db_log_space_usage
            )AS f,
            (
               SELECT CONVERT(DECIMAL(10,2),size*8.0/1024.0) AS [total_size]
               FROM sys.database_files
               WHERE type_desc = 'LOG'
            )AS l
         END

         IF(@deletedRows > 0)
         BEGIN
            DECLARE @stringDate VARCHAR(10) = CONVERT(VARCHAR(10),@currentDate)
            RAISERROR('Finished deleting records for date: %s',0,1,@stringDate) WITH NOWAIT
            INSERT INTO Delete_Metrics VALUES(@currentDate, DATEADD(DAY,1,@currentDate), @deletedRows, DATEDIFF(SECOND,@startTime,GETDATE()))
         END

         SET @currentDate = DATEADD(DAY,1,@currentDate)
         COMMIT TRANSACTION
      END TRY
      BEGIN CATCH
         IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION
         SELECT ERROR_MESSAGE() AS ErrorMessage;
      END CATCH
   END
END

Running the Delete Stored Procedure

Here are some screenshots of what you can expect upon a successful execution:


Check Progress of Delete Processing

  • This T-SQL code can be used to constantly check the progress of the rows deleted.
  • You can see how much time each individual day took to be deleted and the total time the entire process has consumed so far. The very last column, named "Progress", shows the percentage of records deleted out of those copied to the historic table (make sure to put the correct table name for your case).
  • Keep in mind that this query will work in instances using SQL Server 2012 and above, due to the usage of the SUM(X) OVER(Y) function.

Here is the query you can use to get information about the delete process:

SELECT
   StartDate,
   EndDate,
   Records,
   CompletionTime,
   [Total Records] = SUM(Records) OVER (ORDER BY StartDate),
   [Total Time] = SUM(CompletionTime) OVER (ORDER BY StartDate),
   CONVERT(DECIMAL(10,2),(SUM(Records) OVER (ORDER BY StartDate) / (SELECT CONVERT(DECIMAL(10,2),COUNT(*)) FROM big_table_historic)) * 100) AS 'Progress'
FROM Delete_Metrics

Here’s a screenshot of what the result set looks like:

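For instances older than SQL Server 2012, the running totals can be approximated with correlated subqueries instead of windowed aggregates. Here is a sketch against the same Delete_Metrics table (the "Progress" column is omitted for brevity):

```sql
SELECT m.StartDate,
       m.EndDate,
       m.Records,
       m.CompletionTime,
       (SELECT SUM(d.Records) FROM Delete_Metrics d
        WHERE d.StartDate <= m.StartDate) AS [Total Records],
       (SELECT SUM(d.CompletionTime) FROM Delete_Metrics d
        WHERE d.StartDate <= m.StartDate) AS [Total Time]
FROM Delete_Metrics m
ORDER BY m.StartDate;
```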
Next Steps
  • You should consider that if you delete a large amount of records, the respective indexes and statistics may need to be rebuilt.
  • Depending on the volume of information, you might have to schedule an eventual shrink of the files to reclaim disk space.
  • If you need to address a problem using this approach, make sure to run tests in development and test environments to get an idea of how long the process takes and the possible impact.
  • Feel free to customize the stored procedure in any way that you like, as long as it is useful for your particular use case.
  • Download the T-SQL code for this tip.
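If you do end up scheduling a shrink of the log file after a large purge, the logical file name has to be looked up first. A sketch against the tempdb demo database (the 512 MB target is just an illustrative value):

```sql
USE tempdb;

-- Find the logical name and current size of the log file
SELECT name, type_desc, size*8/1024 AS size_mb
FROM sys.database_files
WHERE type_desc = 'LOG';

-- Shrink the log file to roughly 512 MB ('templog' is tempdb's default log name)
DBCC SHRINKFILE (templog, 512);
```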




About the author
MSSQLTips author Alejandro Cobar Alejandro Cobar is an MCP, SQL Server DBA and Developer with 10+ years of experience working in a variety of environments.



Comments For This Article

Thursday, February 17, 2022 - 4:03:51 AM - Mikymike Back To Top (89805)
Thank You Greg ! I simply disabled identity insert on my table !!

Wednesday, February 16, 2022 - 1:52:33 PM - Greg Robidoux Back To Top (89803)
Hi Mikymike,

Another thing you could do is to not use an identity column for the historic table column and then you can insert the same value from your regular table.


Wednesday, February 16, 2022 - 1:50:15 PM - Greg Robidoux Back To Top (89802)
Hi Mikymike,

Take a look at this article

This shows that you need to specify every column for the insert to work.


Wednesday, February 16, 2022 - 1:29:20 PM - Mikymike Back To Top (89801)

I have an error when I run the stored procedure script, the result returns me "Msg 8101, Level 16, State 1, Line 9
An explicit value for the identity column in table 'xxx.dbo.xxx_historic' can only be specified when a column list is used and IDENTITY_INSERT is ON."
I tried to put SET IDENTITY_INSERT xxxx.dbo.xxx_historic ON at the beginning and SET IDENTITY_INSERT xxxx.dbo.xxx_historic OFF at the end, but nothing to do, I have no data in the dbo.xxx_historic table.
Can you help me to resolve this error.

Best regards

Wednesday, September 9, 2020 - 8:32:06 PM - Alejandro Cobar Back To Top (86448)
Hi Abdul,

You are correct, thanks for pointing that out. The immediate fix would be for you to replace all the occurrences of TRIM(X) with LTRIM(RTRIM(X)).

I'm going to submit the respective update so that this is reflected in the code within the article.

Wednesday, September 9, 2020 - 1:55:26 AM - Abdul Majeed Back To Top (86436)
Hi .

Looks like TRIM was introduced in SQL Server 2017 and I am using SQL 2016. Can you suggest the alternative for TRIM in SQL 2016.

Msg 195, Level 15, State 10, Procedure Batch_Delete, Line 24 [Batch Start Line 8]
'TRIM' is not a recognized built-in function name.

Friday, August 21, 2020 - 1:01:11 PM - Alejandro Cobar Back To Top (86339)
Yes, it is totally feasible.

For that case, the @endDate parameter simply would have to be +6 months from the @startDate that you want to tackle (if I got your question right). Since the deletion of the records revolves around a per-day logic to do it by chunks.

Friday, August 21, 2020 - 8:39:07 AM - hello Back To Top (86338)
Hi, good script. Instead of explicitly providing start date or end date, it is possible to provide a period something like (<6 months)

please advise?

Wednesday, April 15, 2020 - 1:31:11 AM - World Teaches Back To Top (85372)

Nice post for SQL server

Wednesday, April 1, 2020 - 3:09:07 AM - Alexander Huberti Back To Top (85234)

I had some performance issues when I used DELETE TOP() statements in batches on huge tables. The main problem was increasing of logical reads over time. I prefer a solution like this:

Tuesday, March 31, 2020 - 2:08:03 PM - Alejandro Cobar Back To Top (85228)

@Anne, you're most welcome!

@Broniz, thanks a lot for that insight. Yes, a v2 of what I did here can potentially address such cases where large tables are linked together.

@Jamie Ingram, yes it would also work well. For me it is very cool to have multiple ways to accomplish the same thing.

Take care and stay safe!!!

Tuesday, March 31, 2020 - 1:23:52 PM - Anne Back To Top (85227)

Thanks for the stored procedure.

It is very helpful

Tuesday, March 31, 2020 - 4:09:31 AM - Broniz Back To Top (85224)

Thank you !

This is very useful for isolated cases, or temporary tables when you want to keep minimal data retention. I especially like the ability to see the progress. However this technique can't work for critical systems where you get several large tables linked together. You need to segment the deletes to stay within a reasonable journal size (log file will explode otherwise), and make sure all records for the linked tables are processed at the same time. You may also need to drop the indexes that are not useful for better performances. We're talking here for tables with dozens of millions of records of course.

Wednesday, March 18, 2020 - 1:52:47 AM - Jamie Ingram Back To Top (85135)
This also works well:
delete z from TableA z
inner join (Select Top 500 * from TableA)d
	on z.ID = d.ID
go 100 

