By: Mohammed Moinudheen | Comments (6) | Related: > Log Shipping
Problem
I was woken up at 3:00 AM due to a pager alert on my phone regarding a SQL Server log shipping failure of a very large critical production database. On investigation, I noticed that the log shipping chain was broken and the restore job was failing as it was unable to find a missing transaction log backup. Is there any quick way by which I can re-sync log shipping again without having to reinitialize log shipping using a full SQL Server database backup?
Solution
This tip assumes that you already have a log shipping configuration in your environment. If you are new to log shipping, you may refer this tip which has the detailed steps to set up log shipping.
In this tip we will see how to fix a log shipping failure by performing a restore using a differential backup instead of performing a full database backup and then restoring it on the secondary. This method saves us considerable time especially if the database is very big. We will also see in this demo where this method does not work, that is, scenarios where you may not be able to just restore using a differential backup.
We will be using two SQL Server instances for this demo. The database is called VLDB with log shipping set up successfully as shown.
- Primary SQL instance: BOULCOTT\INST1
- Secondary SQL instance: BOULCOTT\INST2
These are three jobs used for log shipping:
- LSBackup_VLDB - creates the log backups on the primary
- LSCopy_BOULCOTT\INST1_VLDB - copies the log backups to the secondary
- LSRestore_BOULCOTT\INST1_VLDB - restores the log backups on the secondary
From the screenshot above, we can see that log shipping is working fine for the ‘VLDB’ database and all log shipping jobs – Log Backup, Copy and Restore are running as expected.
SQL Server Database Backup Schedule
When dealing with very large databases, full database backups are usually performed on the weekends to avoid peak database usage during the week and differential backups are done every other day of the week.
Let's assume that in our scenario, we have the backup schedule as below for our VLDB database with log shipping.
Backup Job | Schedule |
---|---|
Full database backup | Sunday 11:00 pm |
Differential backup | Monday-Saturday 11:00 pm |
Transaction log backups | Using Log Shipping job every 15 minutes |
Break SQL Server Log Shipping for Testing
Refer to these steps on how you can break log shipping. We will use these steps to break log shipping, so that you can simulate this tip using certain assumptions based on the time the log shipping breaks.
Perform a transaction log backup of the primary database as shown and then delete it from the backup folder.
BACKUP LOG VLDB TO DISK = 'C:\Temp\LogShip\VLDB_1.trn' WITH INIT, STATS, COMPRESSION
Once done, run the following jobs in sequence.
Job Name | Server |
---|---|
LSBackup_VLDB | Primary |
LSCopy_BOULCOTT\INST1_VLDB | Secondary |
LSRestore_BOULCOTT\INST1_VLDB | Secondary |
When you run the restore job you will get this error message, because of the deleted log backup file.
When you right click on the SQL instance and view the log shipping status report, you will see that the log shipping report status is still looking good. This is because the restore threshold is set to 45 minutes as shown below.
You can modify this threshold value to a lower or higher value. In this demo, we will reduce it to 5 minutes, so the log shipping failures alerts occur quickly. You can use this stored procedure. It needs to be run on the master database on the secondary server.
EXEC master.dbo.sp_change_log_shipping_secondary_database @secondary_database = 'VLDB', @restore_threshold = 5
Once this is run, the restore threshold gets reduced to 5 minutes as shown below.
Using these steps we can break a log shipping configuration, so we can work on fixing it.
Scenario 1: Log shipping breaks Tuesday at 10PM
You are alerted that there is a log shipping failure on a production database. From our earlier assumption, the following jobs should have completed successfully on the primary server.
Backup Job | Schedule | Status |
---|---|---|
Full database backup | Sunday 11:00 pm | Completed Successfully |
Differential backup | Monday 11:00 pm | Completed Successfully |
Differential backup | Tuesday 11:00 pm | Completed Successfully |
On investigation, you find that the job shipping failure appears to have occurred on Tuesday around 10 PM.
Let's simulate this scenario
First, perform a full backup of the VLDB database on the primary server using this script.
BACKUP DATABASE VLDB TO DISK ='C:\Temp\LogShip\VLDB_1.BAK' WITH INIT, STATS, COMPRESSION
Run the following jobs in sequence.
Job Name | Server |
---|---|
LSBackup_VLDB | Primary |
LSCopy_BOULCOTT\INST1_VLDB | Secondary |
LSRestore_BOULCOTT\INST1_VLDB | Secondary |
Next, perform a differential backup of the VLDB database on the primary server using this script.
BACKUP DATABASE VLDB TO DISK ='C:\Temp\LogShip\VLDB_1.DIFF' WITH DIFFERENTIAL
Run the following jobs in sequence.
Job Name | Server |
---|---|
LSBackup_VLDB | Primary |
LSCopy_BOULCOTT\INST1_VLDB | Secondary |
LSRestore_BOULCOTT\INST1_VLDB | Secondary |
Ensure that the log shipping jobs between the primary and secondary server are working as normal by referring to the log shipping status report as shown below.
Break the log shipping configuration by referring to the steps described earlier in the tip. Check the log shipping status report again to ensure that the log shipping configuration has alerted a failure. The alerting will display based on the restore threshold that is set. In our case, the restore threshold is set to 5 minutes.
Finally, perform a differential backup of the VLDB database on the primary server using this script.
BACKUP DATABASE VLDB TO DISK ='C:\Temp\LogShip\VLDB_2.DIFF' WITH DIFFERENTIAL
Steps to fix this log shipping failure
From the above scenario, you can see that a differential backup was performed after the log shipping chain was broken. You can use this differential backup to resync the log shipping configuration again. Copy the second differential backup to the secondary instance and restore the backup using the below command.
RESTORE DATABASE VLDB VLDBB FROM DISK='C:\Temp\LogShip\VLDB_2.DIFF' WITH NORECOVERY, STATS
Once done, run the following log shipping jobs in sequence to resync the secondary database with the primary.
Job Name | Server |
---|---|
LSBackup_VLDB | Primary |
LSCopy_BOULCOTT\INST1_VLDB | Secondary |
LSRestore_BOULCOTT\INST1_VLDB | Secondary |
The restore job will complete as shown below.
By using this method, you were able to successfully resync the log shipping secondary database quickly using a differential backup instead of using a full database backup. If you have a very large database, you would have saved a considerable amount of time using this method versus copying a large backup file over the network which may have caused performance issues and also you may not have met client SLAs.
Note: The size of the differential backup depends on when it was taken. If the differential backup had occurred on a Friday evening, it may be quite large compared to one taken Monday evening as it would have all the changes that occurred since the last full backup. Refer to this tip for more understanding on differential backups.
How was a differential backup able to resync log shipping?
You can collect some useful information from the system tables in the msdb database that stores information about the backup files. You can first run this query on the msdb database on the primary instance.
SELECT ELECT database_name ,type = ( CASE type when 'D' then 'FULL DB BACKUP' when 'I' then 'DIFFERENTIAL Backup' when 'L' then 'Log Backup' END) ,checkpoint_lsn ,database_backup_lsn ,differential_base_lsn ,backup_start_date FROM msdb.dbo.backupset WHERE backup_start_date>'2018-09-02 11:12:00.000' -- Optional, you can give a suitable time
The output on the primary server is as shown below. You can see that the checkpoint_lsn of the full database backup matches the database_backup_lsn of the differential backup.
Run the same query on the secondary server msdb database. The output below is from the secondary server.
You can see that the database_backup_lsn remains the same on both the primary and secondary servers as it is based on the log sequence number of the latest full database backup. Using this script, you can identify the last differential backup that is available on the primary server and then copy it across to the secondary and restore it on secondary. That is what we did to resync the log shipping configuration based on our scenario.
Bear in mind that the last full database backup had occurred much before the log shipping failure and additional transaction log backups had been applied to the secondary server which helps in maintaining the database_backup_lsn on the secondary server too. Once the latest differential backup was applied to the secondary server the following jobs were run in sequence to ensure that log shipping configuration is running as expected.
Job Name | Server |
---|---|
LSBackup_VLDB | Primary |
LSCopy_BOULCOTT\INST1_VLDB | Secondary |
LSRestore_BOULCOTT\INST1_VLDB | Secondary |
Scenario 2: Log shipping breaks Sunday at 10PM
You are alerted that log shipping failed. On investigation, you find that the log shipping failed at around 10 PM on Sunday. As per our assumptions, the following steps occurred.
Name of Task | Schedule | Status |
---|---|---|
Log Shipping failed | Sunday 10:00 pm | LS-Restore job is failing |
Full database backup | Sunday 11:00 pm | Completed Successfully |
Log backups are occurring | Every 15 minutes | Not getting applied to secondary |
Let's simulate this scenario
Run the log shipping jobs in sequence to confirm that the log shipping configuration is running as expected.
Job Name | Server |
---|---|
LSBackup_VLDB | Primary |
LSCopy_BOULCOTT\INST1_VLDB | Secondary |
LSRestore_BOULCOTT\INST1_VLDB | Secondary |
Break the log shipping configuration by referring to the steps described earlier in the tip.
Check the log shipping status report again to ensure that the log shipping configuration has alerted of a failure. You can see that the log shipping configuration has failed with an error.
Create a full database backup using this script.
BACKUP DATABASE VLDB TO DISK ='C:\Temp\LogShip\VLDB_1.BAK' WITH INIT, STATS, COMPRESSION
The log backups on the primary are continuing to run as expected every 15 minutes. With these steps you have simulated the scenario 2 situation.
Steps to fix this log shipping failure
Based on your experience with scenario 1, you attempt similar steps to resync log shipping that you performed in scenario 1. You notice that a differential backup of the primary database was not performed after log shipping failed. So, you perform a differential backup on the primary using the below script.
BACKUP DATABASE VLDB TO DISK ='C:\Temp\LogShip\VLDB_1.DIFF' WITH DIFFERENTIAL
Once done, copy this differential backup to the secondary server and restore it on the secondary using the below script.
RESTORE DATABASE VLDBSE VLDB FROM DISK='C:\Temp\LogShip\VLDB_1.DIFF' WITH NORECOVERY, STATS
You get this error message when you attempt to restore the differential backup on the secondary.
With this step, you can confirm that you are unable to resync log shipping by just restoring a differential backup from the primary. In this case, you will have to reestablish log shipping by performing a full backup on the primary, copy to the secondary and then restoring it on the secondary.
Why was I unable to use a differential backup to resync log shipping?
You can refer to the earlier script that was provided to collect information from the msdb system tables and run it on both the primary and secondary servers.
SELECT database_name ,type = ( CASE type when 'D' then 'FULL DB BACKUP' when 'I' then 'DIFFERENTIAL Backup' when 'L' then 'Log Backup' END) ,checkpoint_lsn ,database_backup_lsn ,differential_base_lsn ,backup_start_date FROM msdb.dbo.backupset WHERE backup_start_date>'2018-09-02 11:12:00.000' -- Optional, you can give a suitable time
Below is the screenshot from the primary server. Notice the change in the database_backup_lsn after the full database backup (Rows 40-43).
Below is the screenshot from the secondary server. You can see that the database_backup_lsn is the same as the one on the primary server before the full database backup was performed. As log shipping is broken, the new database_backup_lsn value of “659000002719400037” is not getting applied to the secondary server. This is preventing you from applying the differential backup from the primary server on to the secondary.
Steps to fix this log shipping failure
You will need to initialize log shipping by restoring the full database backup from the primary to the secondary.
Once you re-initialize log shipping by performing a full backup of the primary database, copying the backup to the secondary and then restoring the secondary database, log shipping will get re-synced again. You can query the backup set table on the msdb database again by using the script provided and you can review the database_backup_lsn again.
On the primary server, the screenshot is as shown.
On the secondary server, after the database is restored, the screenshot is below. You can see that the database_backup_lsn values get synced as shown.
Summary
In this tip, you were able to see the possible ways by which you could resync log shipping using a differential backup and also by using a full database backup.
Next Steps
- In this tip, in the first scenario, you saw in detail how you could resync log shipping by using a differential backup
- In the second scenario, you saw the issues you may face when you attempt to resync log shipping using a differential backup
- In this tip, you also learned what happens internally when you attempt both scenarios
- Try this tip on your own test environments
- Reuse the scripts from this tip on your own test servers
- Refer to numerous log shipping articles
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips