Using DELETE CASCADE Option for Foreign Keys

By:   |   Comments (11)   |   Related: 1 | 2 | 3 | 4 | 5 | 6 | > Database Design


Problem

Referential integrity is a very important thing to consider when designing a database. In my years as a DBA I've seen database designs that sit on both ends of the spectrum, none at all and cases where every table is linked to multiple tables. While the later certainly can be a little more difficult to work with it ensures the integrity of your data stays intact. The other end provides much more flexibility when it comes to updating and deleting data from your database, whether it's being done through the application or directly on the backend, but has the issue of possible orphan records if things are not done properly. This tip will look at the DELETE CASCADE option when creating foreign key constraints and how it helps keep the referential integrity of your database intact.

Solution

Before we get into the details of the DELETE CASCADE option I wanted to mention that there is another option you can use when creating foreign keys on tables, UPDATE CASCADE. I left out any details regarding this option from this tip as this is only necessary when the columns that are part of the foreign key are updated. In my past experience I've never had to update one of these columns since foreign keys are almost always created on key columns. I am sure there are cases where this is needed, but for this tip we are just going to focus on the DELETE CASCADE option.

Sample SQL Server Table Setup

For this example we will create two sample tables that have a foreign key relationship between them. The parent table has 99,999 records in it and the corresponding child table has 19 records for each record in the parent. Here is the DDL code:

-- Table creation logic
--parent table
CREATE TABLE [dbo].[Order](
 [OrderID] [bigint] NOT NULL,
 [OrderData] [varchar](10) NOT NULL,
 CONSTRAINT [PK_Order_1] PRIMARY KEY CLUSTERED 
    ([OrderID] ASC)
)
GO
-- child table
CREATE TABLE [dbo].[OrderDetail](
 [OrderDetailID] [bigint] NOT NULL,
 [OrderID] [bigint] NULL,
 [OrderData] [varchar](10) NULL,
 CONSTRAINT [PK_OrderDetail] PRIMARY KEY CLUSTERED 
    ([OrderDetailID] ASC)
)
GO
-- foreign key constraint
ALTER TABLE [dbo].[OrderDetail]  WITH CHECK 
ADD CONSTRAINT [FK_OrderDetail_Order] FOREIGN KEY([OrderID])
REFERENCES [dbo].[Order] ([OrderID])
ON DELETE CASCADE
GO
-- data load
DECLARE @val BIGINT
DECLARE @val2 BIGINT
SELECT @val=1
WHILE @val < 100000
BEGIN  
   INSERT INTO dbo.[Order] VALUES(@val,'TEST' + CAST(@val AS VARCHAR))
   
   SELECT @val2=1
   WHILE @val2 < 20
   BEGIN  
      INSERT INTO dbo.[OrderDetail] VALUES ((@val*100000)+@val2,@val,'TEST' + CAST(@val AS VARCHAR))
      SELECT @val2=@val2+1
   END
   SELECT @val=@val+1
     
END
GO 

First Example

Now that we have some data, let's remove a record from the [Order] table. Here is the code. Note: I've added a "DBCC DROPCLEANBUFFERS" to each DML query to ensure there is no data in the cache before we run the statement.

DBCC DROPCLEANBUFFERS
GO
DELETE FROM [Order] WHERE OrderID=24433
GO

After running this statement we can query the [OrderDetail] table and verify that the records were removed. To get an idea of what needs to be done to if we did not have the DELETE CASCADE option set let's remove it and see what happens. Here is the code:

ALTER TABLE [dbo].[OrderDetail] DROP CONSTRAINT [FK_OrderDetail_Order]
GO
ALTER TABLE [dbo].[OrderDetail]  WITH CHECK 
ADD CONSTRAINT [FK_OrderDetail_Order] FOREIGN KEY([OrderID])
REFERENCES [dbo].[Order] ([OrderID])
GO

Now let's run the following code, remembering that we have to delete the records from [OrderDetail] first as DELETE CASCADE option has been removed from the foreign key. You can see how the previous example is much simpler as it only requires one DELETE statement. Imagine if we had 5 or 6 tables with foreign keys into this parent table. To remove the parent record we would have to have a separate delete statement for each of these child tables.

DBCC DROPCLEANBUFFERS
GO
DELETE FROM [OrderDetail] WHERE OrderID=24032
DELETE FROM [Order] WHERE OrderID=24032
GO

Let's also take a look a the performance differences of these two approaches. Below is a SQL Profiler output from both DELETE scenarios. You can see from this trace using the DELETE CASCADE option also uses less resources than doing the DELETE with a separate statement.

DELETE CASCADE CPU (ms) Reads Writes Duration
Yes 281 12323 2 950
No 374 24909 3 1162

Second Example

One of the SQL Server best practices I follow is to always index any foreign key columns as they are always very heavily used both in WHERE clauses and to join tables. Let's add an index to the [OrderDetail] table and run through the same scenario as in the example above. Here is the DDL code for the index:

CREATE NONCLUSTERED INDEX IX_OrderDetail_OrderID ON dbo.[OrderDetail] (OrderID)
GO

And here is the new code for the DELETE statements along with the statement to remove the DELETE CASCADE OPTION. Note: All that is changed here is the OrderIDs

DBCC DROPCLEANBUFFERS
GO
DELETE FROM [OrderDetail] WHERE OrderID=90032
DELETE FROM [Order] WHERE OrderID=90032
GO
ALTER TABLE [dbo].[OrderDetail]  WITH CHECK 
ADD CONSTRAINT [FK_OrderDetail_Order] FOREIGN KEY([OrderID])
REFERENCES [dbo].[Order] ([OrderID])
ON DELETE CASCADE
GO
DBCC DROPCLEANBUFFERS
GO
DELETE FROM [Order] WHERE OrderID=90433
GO

Looking at the results below we can see that with proper indexing we see almost no performance difference between the two methods. As I mentioned above though, using the DELETE CASCADE option does have the benefit of keeping your SQL code much simpler in that you only have to delete from the top level parent and all the child data is cleaned up automatically.

DELETE CASCADE CPU (ms) Reads Writes Duration
Yes 0 300 7 79
No 0 312 6 64
Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Ben Snaidero Ben Snaidero has been a SQL Server and Oracle DBA for over 10 years and focuses on performance tuning.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Thursday, April 14, 2016 - 12:42:48 PM - Peter Back To Top (41226)

While performance might be acceptable for smaller sets of data, I think that you can easily scale this up and look at scenarios where ON DELETE CASCADE is a horrible idea. I recently had to purge some data from the SSISDB database. MS decided to use ON DELETE CASCADE to handle purging data. When you only want to remove a day or so of data, it's acceptable but not good. If you want to change your parameters to affect 60 days of data in an active SSIS scenario, you can easily see the job run for hours and start filling the transaction log. I recently used a manual work around instead to delete the child rows in sets of 100k rows at a time and then deleted the parent rows. Working this in sets allowed us to control the batch size for the child rows, not fill the transaction log, and removed the rows in ~ 20 minutes with me babysitting the process a little bit. I could easily have just written the loop to do that, but chose not to in this case because it was a one-off script and I wanted to more closely monitor the results.

If MS had chosen to write a stored proc to delete the child rows in batches, then remove the parent rows, we wouldn't have seen this performance issue. Instead, deleting a set of about 1000 parent rows resulted in several child tables with 12M rows to delete. Using ON DELETE CASCADE masks what's actually going to happen and should be used carefully when dealing with potentially larger datasets. It may be easier to handle because you just issue one delete and the rest takes care of itself, but I find knowing what's going to be affected and writing stored procs to more gracefully handle those scenarios offers better visibility and even a little control over the process.


Monday, March 14, 2016 - 8:58:27 AM - Raghav Garg Back To Top (40931)

 

Sir,
You have written a great article "using DELETE CSCADE Option for Foreign Keys", i have read about it in many other places but understood the concept of foreign key only here.
Thanks for sharing your knowledge. 
Regards
Raghav Garg


Monday, March 17, 2014 - 4:42:22 PM - benjimari Back To Top (29786)

I didn't get what happened,. I hope you take this comment in a good way but the way you explain is the least effective I found, it didint even clearly tell what exactly is the thing this topic solve. Here's an example from codeproject with a good explanation of the use of the delete cascade topic "Use the ON DELETE CASCADE option if you want rows deleted in the child table when corresponding rows are deleted in the parent table. If you do not specify cascading deletes, the default behavior of the database server prevents you from deleting data in a table if other tables reference it."

 
I will try to visit this site again I hope when that happen it will not be as confusing as today. Thanks you

Thursday, August 16, 2012 - 6:52:23 AM - Rakish Magehm Back To Top (19063)

thanks for this tutorials was very helpful thank you bless you thank you so much

 

how to find the sql server management studio in the windows system files thank you so much for help on the matter

 

i have to education for my new job myself in united states

 

sincerely,

Rakash Maghaderjshiaskm Lamakashapharbaim.

 

your kindness is thank you


Tuesday, August 14, 2012 - 11:16:48 PM - Ben Snaidero Back To Top (19043)

Hi Vikas,

Sorry I missed one thing in your original question, when you say the trigger performs better do you mean you use a trigger on the master table to handle the deletes on the other tables through this?

Given the way you’ve explained your schema layout, specifically the clustered indexes on the 9 tables, I think this does produce the least amount of contention. Only suggestion I would have if you’re looking to maximize performance would be to try handling the deletes manually and not use a trigger/cascade at all. In any case though I don’t think deleting 10,000 records (whichever method you are using) would/should be very slow. Out of curiosity how long does it take? If you try doing the delete manually and it’s a lot quicker could your tempdb be on a different type of disk than the data and log files (something slower perhaps) and the eager spool is causing your slowness as the result of the spool is stored there.

And to answer your question regarding why the eager spool. I believe it’s required during the DELETE CASCADE to keep a consistent view of the table during the DELETE operation and will block until it’s complete but since it’s doing an index seek it shouldn’t have any effect on accessing other records in the table that are not being affected by the delete.

Ben


Tuesday, August 14, 2012 - 12:50:43 PM - Vikas Back To Top (19038)

Hello Ben,

There are some deletes that happen throught the day but I do have a nightly purge of 50K+ but in batches of 10K (yes minimum deleted in single Purge is 10K - not individual 10K deletes, have to delete in batches of 10K, as I don't want to generate too much volume for the database is replicated, so run in a loop and delete 10 K as a single maximum per loop).

Yes the remaining 9 tables are clustered and part of the key is originating from the Master as a result I am making use of the Cascade Delete on the Master.

Thanks again for your time.

Vikas


Tuesday, August 14, 2012 - 9:28:07 AM - Ben Snaidero Back To Top (19037)

Hi Vikas,

A couple quick questions.  When you do the delete on the master table is it a single statement that deletes the 50K+ records or 50K+ individual statements?  Also, are the indexes on the 9 tables clustered?

Ben


Tuesday, August 14, 2012 - 5:58:39 AM - Srikrishna Back To Top (19032)

Good article


Monday, August 13, 2012 - 8:09:39 PM - Vikas Back To Top (19027)

Hello Ben,

Let me expalin my case with a scenario, say we have 10+ tables and one of these is the Master table (that is it has the PK, while the remaining have FK's to this Table). I do have CASCADE delete turned on and also have a Index in place. Also lets assume the tables have a million+ rows and each of them receive new 500K rows a day.

Now in such a scenario, when I delete 50K+ rows in a day, I see an eager spool on the 9 tables along with Index Seek (on the FK's Index). As a result am not getting any gain in performance rather am seeing scenarios where the trigger performs better than the Cascade for higher volumes. Am not sure how/ the reason behind an Eager spool in the plan.

Is there anything else we can do to perform Purge operation on tables in an efficient manner? To minimize down time/ improve availability of tables (avoid blocking etc.). I am exploring the possibility of using Partitioning to do so... really not sure if it is the best way though.

Any thoughts or suggestions are welcome.

Thanks for your time,

Vikas


Wednesday, August 1, 2012 - 12:00:10 PM - Ben Snaidero Back To Top (18876)

Hi Allen,

It's not a typo.  When I mention there is almost no performance difference I mean between using delete cascade and deleting the records from each table with individual statements.  You are correct in that there is a big performance benefit with the index vs without.

Thanks for reading

Ben.


Wednesday, August 1, 2012 - 9:51:44 AM - Allen Wang Back To Top (18872)

Is it a typo? In your article said after added index on foreign key "Looking at the results below we can see that with proper indexing we see almost no performance difference between the two methods. ". But, it is a big difference between two reading, CPU, Reads, Writes and Duration all dropped down after index was being added.















get free sql tips
agree to terms