By: Alejandro Cobar | Updated: 2023-05-26 | Comments | Related: > Referential Integrity
Problem
If your database application(s) uses multiple databases within the same SQL Server instance, there is a high chance that there's data in one database related to data in another. If you've ever been in a scenario like this, you probably know by now that you cannot create a foreign key in one database table referencing a table in a different database. Because of this, there's a potential risk that you'll end up with orphan records. Even if developers assure you that's not the case because the backend ensures that no orphan records will be left in the database, I'd say, "Trust in what you can control." We must develop a solution that helps us in such situations.
Solution
This article will explore a solution to help maintain referential integrity between different databases. This is aimed toward new data coming into the database or updating existing data.
Initial Considerations
- This article covers databases within a SQL Managed Instance, a SQL Server instance in a VM, or an on-prem deployment. Therefore, Azure SQL databases do not apply to this solution.
- For demonstration purposes and simplicity, I will use a pair of databases called "parent" and "children," a table on each side, and some sample records.
- I will try to keep the solution very simple so that you can get the overall idea and expand/scale upon it.
Setting Up the Sample Scenario
First, let's create two databases called "parent" and "children".
USE master GO CREATE DATABASE [parent] CREATE DATABASE [children] GO
Then, let's create a table in both databases and add dummy data.
USE [parent] GO CREATE TABLE [dbo].[parent_table]( [id] [int] NOT NULL, [description] [varchar](16) NOT NULL, PRIMARY KEY CLUSTERED ([id] ASC) GO INSERT INTO [dbo].[parent_table] VALUES (1,'configuration 1'),(2,'configuration 2'),(3,'configuration 3') GO USE [children] GO CREATE TABLE [dbo].[child_table]( [id] [int] NOT NULL, [configuration_id] [int] NOT NULL, PRIMARY KEY CLUSTERED ([id] ASC) GO INSERT INTO [dbo].[child_table] VALUES (1,1),(2,2),(3,3) GO
For the sake of science, let's attempt to create a Foreign Key from child_table to parent_table.
The image above shows an error message that cross-database foreign key references are not supported, so we are on the right track.
Forcing Referential Integrity for Incoming Data
To achieve this, let's use an INSTEAD OF trigger structure in the child_table. An INSTEAD OF trigger is a trigger that allows us to interrupt/bypass a regular DML operation with some logic defined by us.
Here's the code for the trigger (you can download the code at end of this article):
CREATE OR ALTER TRIGGER trg_fk_children ON child_table INSTEAD OF INSERT,UPDATE AS BEGIN SET NOCOUNT ON; CREATE TABLE #tmp_records( [id] INT NOT NULL, [configuration_id] INT NOT NULL, [parent_exist] BIT NOT NULL ) INSERT INTO #tmp_records SELECT inserted.[id], inserted.[configuration_id], CASE WHEN EXISTS (SELECT 1 FROM parent.dbo.parent_table WHERE id = inserted.configuration_id) THEN 1 ELSE 0 END FROM INSERTED inserted --If an UPDATE statement was issued IF ((SELECT COUNT(*) FROM DELETED) > 0 ) BEGIN UPDATE child_table SET child_table.[id] = inserted.[id], child_table.[configuration_id] = inserted.[configuration_id] FROM child_table child_table JOIN #tmp_records inserted ON child_table.[id] = inserted.[id] WHERE inserted.[parent_exist] = 1 END ELSE --This means that the operation is an INSERT BEGIN INSERT INTO child_table([id],[configuration_id]) SELECT [id], [configuration_id] FROM #tmp_records WHERE [parent_exist] = 1 END --Return the invalid records, if there's at least 1 IF((SELECT COUNT(*) FROM #tmp_records WHERE [parent_exist] = 0) > 0) BEGIN DECLARE @errorMessage VARCHAR(512) = (SELECT CONCAT('The following elements have unresolved references to parent.dbo.parent_table:',CHAR(13),CHAR(10))) DECLARE @children NVARCHAR(100) = (SELECT STRING_AGG(configuration_id,',') FROM #tmp_records WHERE [parent_exist] = 0) IF LEN(@children) > 0 SET @errorMessage += CONCAT('Parent ID:',@children,CHAR(13),CHAR(10)) RAISERROR(@errorMessage,16,1) END RETURN END
Let me explain why the trigger is crafted like this:
- I'm creating a temp table (#tmp_records) because our solution must be able to work with batches of DML operations at once. If you code your solution thinking that you will always deal only with single records, you will potentially end up with weird/inconsistent things in your data (precisely what we are trying to address).
- In the temp table, I'm adding an extra field called "parent_exist," which we will use to validate whether that ID exists in the parent table. If your table has multiple fields that need to be checked, add one "_exist" field per each.
- Populate the temp table with whatever data the DML operation contains and update the parent_exist field of each record accordingly.
- The core logic:
If the intended operation is an UPDATE, then update all the records with parent
IDs in the parent table. Otherwise, insert the new records with parent IDs in
the parent table.
- Quick pause here:
I'm using the same trigger for an INSERT or UPDATE operation in this
example. Therefore, to distinguish if the intent of the operation is one
or the other, we do this:
- Usually, triggers use the internal INSERTED and DELETED tables to store information related to the table you are interacting with.
- If the DELETED table is not empty, then SQL Server will store the current data of the targeted record for modification and will store in the INSERTED table the new data.
- Quick pause here:
I'm using the same trigger for an INSERT or UPDATE operation in this
example. Therefore, to distinguish if the intent of the operation is one
or the other, we do this:
- Lastly, if there are records in the temp table with a parent_exist value of 0, then prepare an error message that will be returned to the client. This article written by Jared Westover, SQL RAISERROR to Better Handle Errors, covers the RAISERROR function in more detail.
Let's look at a couple of examples of DML operations against our sample tables to see the trigger in action.
First, here's the data currently in the parent and child tables:
Now let's try to insert a couple of records into the children.dbo.child_table:
The trigger worked as expected because the configuration_ids 4 and 5 do not exist in the parent table. Not only that, but the trigger also gave us the list of parent Ids that have unresolved references to the parent table, separated by a comma.
Now let's try an UPDATE operation to attempt to assign a new invalid configuration_id to an existing record:
The UPDATE record was perfectly intercepted as well.
To wrap up this demonstration, let's insert a couple of valid records to see if the trigger allows them to pass through:
Success, both DML operations worked as expected!
Insights of this Approach
- In large batches of DML operations, the performance might suffer a penalty due to the logic implemented in the trigger, so carefully study and benchmark your use cases.
- You must be very careful of the applied logic in the trigger to check if the parent record exists, or you can end up with inaccuracies in your data.
- If the structure of your table is constantly changing (adding/removing fields), you might have a hard time maintaining this solution because for each new or removed field, you need to update the trigger accordingly, or it will fail. For situations like this, you probably want to eliminate the burden and instead implement a job or other automation solution to notify you when orphan data is detected within your tables. I will be covering this specific solution in part 2 of this series.
Next Steps
- In part 2, I will show the trigger for the parent side of the equation because we need to react somehow if a parent record suffers a modification or a deletion.
- In addition to the trigger at the parent side, I will show an alternate solution in case you still need some cross-database referential integrity but do not want to live with the cons of this technique. This solution involves developing a control point to notify you when orphan records are detected in the specified tables. Stay tuned!
- Download the code for this article.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2023-05-26