SQL Server Transaction Log Grows And Fills Up Drive

By: Ranga Narasimhan

Problem

I had a production problem recently: the transaction log grew by 1 GB every minute and filled the entire drive. The error message I got was: "The transaction log for database 'MyDB' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases." My SQL Server is clustered, and the issue started after a node failover. In this tip I cover the steps I took to resolve the problem.

Solution

The message I got was that the transaction log drive was full and users could not access the application. DBAs never want to hear this! The usual DBA question is "what was changed?"

My first priority was to fix the problem immediately so users could access the application, and then do root cause analysis. For the database in question, I changed the recovery model to SIMPLE and then shrank the transaction log file to a reasonable size (note: a log file can only be shrunk to a virtual log file boundary, so it may end up larger than the target you ask for). After these changes the log growth stopped, but this still did not tell me what the issue was. Keep in mind that you cannot switch to SIMPLE while Database Mirroring or Log Shipping is set up unless you remove them first. They can be reconfigured later; the important thing is to stop the bleeding ASAP.
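Before switching the recovery model, it can help to confirm whether the database takes part in mirroring or log shipping. The queries below are a minimal sketch using the standard catalog view and msdb table; the database name 'MyDB' is the placeholder used throughout this tip.

```sql
-- Is the database a mirroring principal or mirror?
-- mirroring_guid is NULL when the database is not mirrored.
SELECT d.name, m.mirroring_role_desc, m.mirroring_state_desc
FROM sys.database_mirroring AS m
JOIN sys.databases AS d
    ON d.database_id = m.database_id
WHERE d.name = 'MyDB'
  AND m.mirroring_guid IS NOT NULL;

-- Is the database configured as a log shipping primary?
SELECT primary_database, backup_directory
FROM msdb.dbo.log_shipping_primary_databases
WHERE primary_database = 'MyDB';
```

If both queries return no rows, neither feature should stand in the way of the recovery model change.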

Other options would have been to create another transaction log file on a different drive that had plenty of space or to clean up files on the drive that was running out of space. But in my case the file was growing so rapidly that I needed to change the recovery model so the growth would stop until this issue was resolved.
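For reference, adding a second log file on another drive could look roughly like the sketch below. The logical name, path, and sizes are hypothetical and were not part of my fix; adjust them to a drive that actually has free space.

```sql
-- Add a second transaction log file on a drive with free space.
ALTER DATABASE MyDB
ADD LOG FILE
(
    NAME = MyDB_log2,                       -- hypothetical logical file name
    FILENAME = 'E:\SQLLogs\MyDB_log2.ldf',  -- hypothetical path
    SIZE = 4096MB,                          -- hypothetical initial size
    FILEGROWTH = 512MB                      -- hypothetical growth increment
);
```

This only buys time. If something is generating 1 GB of log per minute, the extra file fills up as well unless the root cause is fixed.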

Below are the steps that I took to check the size, change the recovery model and shrink the log file.

SELECT recovery_model_desc FROM sys.databases WHERE name = 'MyDB' --Before

ALTER DATABASE MyDB SET recovery simple

SELECT recovery_model_desc FROM sys.databases WHERE name = 'MyDB' --After

EXEC xp_fixeddrives --Check free drive space 

EXEC sp_helpdb MyDB -- Note the size of the log before shrink

DBCC SHRINKFILE(MyDB_log, 1024) -- shrink the log (logical file name MyDB_log) to 1 GB; the target size is in MB

EXEC sp_helpdb MyDB -- Note the size of the log after shrink

EXEC xp_fixeddrives -- Check free drive space

First aid is complete: the database is up and users are able to use the application again.

Now I needed to find out what was causing the log to grow so fast, fix the issue, put the database back in the FULL recovery model, do a full backup and make sure my log backups were working correctly.

The drive now had enough space. I put the database back in the FULL recovery model to see if I could catch the offending transaction, and I monitored drive space and transaction log size. I immediately noticed the transaction log growing about 1 GB per minute. I then executed the block of code below to find out what might be causing the log growth.
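One way to watch log size and to follow the hint in the error message is sketched below. Neither statement is from my original session; both are standard SQL Server commands, and 'MyDB' is the placeholder database name.

```sql
-- Log size (MB) and percent used for every database on the instance
DBCC SQLPERF(LOGSPACE);

-- Why log space cannot currently be reused, as the error message suggests
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = 'MyDB';
```

A log_reuse_wait_desc of ACTIVE_TRANSACTION points to a long-running transaction, while LOG_BACKUP means the log is waiting for a log backup.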

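DBCC OPENTRAN -- reports the oldest active transaction; feed its SPID into the next query

DECLARE @handle BINARY(20)
DECLARE @SPID INT
SET @SPID = 100 -- 100 was the SPID reported by DBCC OPENTRAN

-- Look up the handle of the statement the session is running
SELECT @handle = sql_handle
FROM master..sysprocesses -- lowercase so the name also resolves on case-sensitive instances
WHERE spid = @SPID
-- Return the statement text for that handle
SELECT [text] FROM ::fn_get_sql(@handle)
GO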

The culprit was a stored procedure that showed up every time I executed the above block of code, although the SPID changed each time. The procedure is not very complicated, but it does a bunch of deletes and inserts based on some condition, and an automated process calls it every minute. What happens when a cluster node failover occurs? The cache is flushed and all the query plans are gone. Since this procedure had never caused a problem before, my first option was to recompile it. Voilà, it worked! The transaction log stopped growing so fast and the procedure no longer showed up in the open transactions from the above code. So a simple stored procedure that had worked fine suddenly caused a major production issue that shut the application down, and the fix was as simple as a recompile. Quite interesting.
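A recompile of this kind can be requested with sp_recompile, as in the sketch below. The procedure name is hypothetical, since the real one is not shown in this tip.

```sql
USE MyDB;
GO
-- Mark the procedure so a fresh plan is compiled the next time it runs.
-- dbo.usp_ProcessNewData is a hypothetical name used for illustration only.
EXEC sp_recompile N'dbo.usp_ProcessNewData';
GO
```

sp_recompile does not run the procedure; it only invalidates the cached plan, so the effect shows up on the next scheduled call.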

My next step was to start a full database backup ASAP. While the full backup was running, I kicked off transaction log backups every 5 minutes and they seemed to work fine, so I changed the transaction log backup back to its normal schedule. After the full backup completed, I reviewed the SQL Server error log, drive space, transaction log size and open transactions, and everything seemed fine. On a cluster you could also try failing back to the other node to see if that fixes the problem, which in my case was the need for a recompile. Failing over does not help with drive space, though, because the storage is shared by the nodes.
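The sequence of returning to the FULL recovery model and restarting the backup chain can be sketched as follows. The backup paths and file names are hypothetical; in practice the scheduled backup jobs do this work.

```sql
-- Make sure the database is back in the FULL recovery model
ALTER DATABASE MyDB SET RECOVERY FULL;

-- A full backup restarts the log backup chain that the switch to SIMPLE broke
BACKUP DATABASE MyDB
TO DISK = N'F:\Backups\MyDB_full.bak'      -- hypothetical path
WITH INIT, CHECKSUM;

-- Log backups then keep the transaction log from growing unchecked
BACKUP LOG MyDB
TO DISK = N'F:\Backups\MyDB_log_001.trn'   -- hypothetical path
WITH CHECKSUM;
```

Until a full (or differential) backup exists after the switch back to FULL, the log keeps truncating on checkpoint as it does under SIMPLE, so point-in-time recovery is not available for that window.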

After this, I researched what I could have done better in this situation. It felt good to have resolved the issue on another eventful day in a DBA's life, but I knew the scripts I used dated from SQL Server 2000 and that the SQL Server 2005/2008 DMVs would have helped me find the offending transaction faster. So I armed myself with the latest tools to handle a similar situation. Adam Machanic has a script to find active transactions here, and another good script to find transaction locks is from Mladen Prajdic here.
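As a rough idea of what a DMV-based check might look like, the sketch below lists transactions in the database ordered by how much log they have used. It is my own minimal example, not one of the scripts mentioned above, and it needs VIEW SERVER STATE permission.

```sql
-- Sessions with open transactions in MyDB, heaviest log users first
SELECT
    st.session_id,
    dt.database_transaction_begin_time,
    dt.database_transaction_log_bytes_used,
    dt.database_transaction_log_bytes_reserved,
    txt.text AS most_recent_sql_text
FROM sys.dm_tran_database_transactions AS dt
JOIN sys.dm_tran_session_transactions AS st
    ON st.transaction_id = dt.transaction_id
JOIN sys.dm_exec_connections AS c
    ON c.session_id = st.session_id
CROSS APPLY sys.dm_exec_sql_text(c.most_recent_sql_handle) AS txt
WHERE dt.database_id = DB_ID('MyDB')
ORDER BY dt.database_transaction_log_bytes_used DESC;
```

Unlike DBCC OPENTRAN, which reports only the oldest active transaction, this returns every session with an open transaction in the database, which helps when the SPID keeps changing.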

Next Steps

Have scripts/utilities organized so you can get to the issue ASAP when needed.
Read more about DBCC SHRINKFILE
Read more about DBCC OPENTRAN

About the author

Ranga Narasimhan has been in IT for over 10 years working on Siebel, Visual Basic, Crystal Reports and SQL Server.

This author pledges the content of this article is based on professional experience and not AI generated.

Comments For This Article

Monday, September 20, 2010 - 10:59:33 AM - Ranga
Interesting observations and good points. The procedure does the inserts/deletes based on the new data that gets added. Maybe new data made the existing plan go out of whack! Why did I recompile first? Because that has helped in the past. This same proc is out there at several clients and there are no complaints of performance issues. But that is the only fix I did to get things in order.

Sunday, September 19, 2010 - 10:44:08 AM - Steve
Maybe it's just me, but I fail to see the connection between recompiling a stored procedure and the growth in the transaction log.

Friday, September 17, 2010 - 5:35:16 PM - JohnF
I felt that was probably the case, Ranga. While many DBAs know better than to just change to SIMPLE without a good reason, there are also many who do not. I just wanted to mention it in case someone got the wild idea that it was OK to do this without thinking about it first.

Friday, September 17, 2010 - 3:08:07 PM - Ranga
JohnF, I agree, but in this situation the transaction log grew so large that the backup didn't complete and the app was down! We were basically under water!

Friday, September 17, 2010 - 3:05:01 PM - Ranga
Praneeth, I understand your situation: the transaction is so large that even the rollback takes a long time. Maybe you need a larger transaction log drive. In your case, how big are the database and the transaction log file?
