By: Eli Leiba | Comments | Related: > TSQL
Problem
You need a way to store records that are connected together. Let's say we have five records A, B, C, D, E. We want to store this data in a SQL Server database to show that A, B and E are connected and C and D are connected. In this tip we will go over the code to accomplish this task.
Solution
For this solution, we will be storing integer pairs that are connected to each other. The connectivity relation needs to be reflexive, symmetric and transitive meaning:
- Reflexive - For any object p it is connected to itself.
- Symmetric - If object p is connected to object q then q is connected to p.
- Transitive - If object p is connected to object q and object q is connected to object r then object p is connected to object r.
The algorithm has three major purposes:
- To initialize the network structure.
- To build the network of connected devices using a UNION operation.
- To check if a record is connected to another record using the isConnected query operation.
- To find all connected nodes based on a node.
- To disconnect a node from the network of nodes.
The solution I found for this problem is to create a table (QFT) with an ID and Value columns that represent a general array of integers. For the initialization operation, we use a stored procedure (QFInit) that fills the table with N rows from row (1,1) through (N,N) this represents a network of N unconnected devices, so all records will be connected to themselves.
For the isConnected query, we use a function (dbo.QFIsConnected) that returns a bit value 0/1. If 0, the records are not connected to the other record and if 1 the records are connected. It will check the values for both p and q and if it is the same value then the devices are connected and if not they are not connected.
For the Union operation we use another stored procedure (QFConnect). The procedure finds the values corresponding to cells p and q and updates all the values in the table where the value equals the cell p value to the value of cell q.
Here is the T-SQL script of the stored procedures and function that I have used to solve this problem.
-- First create an empty table that will serve as the storage of the algorithm: CREATE TABLE QFT (id int , v int) go -- This procedure just creates a table of rows for the test CREATE PROCEDURE QFInit (@N int) as begin declare @i int set @i = 1 while @i <= @N BEGIN insert into QFT (id,v) values (@i,@i); set @i=@i+1; END end go -- This check to see if two nodes are connected: CREATE FUNCTION dbo.QFIsConnected ( @p INT ,@q INT ) RETURNS BIT AS BEGIN DECLARE @a INT DECLARE @b INT DECLARE @r BIT SELECT @a = v FROM QFT WHERE id = @p; SELECT @b = v FROM QFT WHERE id = @q; IF (@a = @b) SET @r = 1 ELSE SET @r = 0 RETURN @r END GO -- This procedure connects two nodes together CREATE PROCEDURE QFConnect ( @p INT ,@q INT ) AS BEGIN DECLARE @a INT DECLARE @b INT SELECT @a = v FROM QFT WHERE id = @p; SELECT @b = v FROM QFT WHERE id = @q; UPDATE QFT SET v = @b WHERE v = @a END GO -- This procedure disconnects a node from all other nodes CREATE PROC [dbo].[QFDisconnect] (@node INT) AS BEGIN DECLARE @maxnode INT DECLARE @secondmax INT SELECT @maxnode = max(v) FROM QFT WHERE id = @node SELECT @secondmax = max(id) FROM QFT WHERE id < @maxnode AND v = @maxnode IF (@node = @maxnode) BEGIN UPDATE QFT SET v = @secondmax WHERE v = @node; END UPDATE QFT SET v = @node WHERE id = @node END GO -- this function returns a list of all nodes that are connected to a specific node CREATE FUNCTION dbo.QFListNodesPath (@node INT) RETURNS TABLE AS RETURN ( SELECT t.id FROM QFT t WHERE t.v = ( SELECT v FROM QFT s WHERE s.id = @node ) ) GO
Example Usage
Using the above code, we will implement a network of ten components with connectivity between nodes as shown below:
1---4---5---8 2---3---6 7---9---10
From the above, we can see that:
- 1, 4, 5 and 8 are connected
- 2, 3 and 6 are connected
- and 7, 9 and 10 are connected
Below we will create a starting point of 10 records and each record will point to itself at first.
-- Step 1 : Initialize the structure with a table with 10 rows exec QFInit 10; go
Next we will connect the rows together as follows.
-- Step 2 : build the structure using a series of calls to the QFConnect procedure -- connect 1, 4, 5 and 8 exec QFConnect 1,4 exec QFConnect 4,5 exec QFConnect 5,8 -- connect 2,3 and 6 exec QFConnect 2,3 exec QFConnect 3,6 -- connect 7, 9 and 10 exec QFConnect 7,9 exec QFConnect 9,10
Now we can see which records are connected to each other after running the above.
-- Step 3 : check table details select * from QFT;
This is what is returned:
id value === ===== 1 8 2 6 3 6 4 8 5 8 6 6 7 10 8 8 9 10 10 10
We can also use the following to see if specific nodes are connected together. We just need to pass in two node IDs.
-- Step 4 : check connections using the scalar dbo.QFIsConnected function for several values: select dbo.QFIsConnected (1,4) as conn_1_4, dbo.QFIsConnected (1,8) as conn_1_8, dbo.QFIsConnected (1,9) as conn_1_9, dbo.QFIsConnected (4,2) as conn_4_2, dbo.QFIsConnected (3,6) as conn_3_6, dbo.QFIsConnected (3,7) as conn_3_7, dbo.QFIsConnected (7,10) as conn_7_10
The result is as follows:
conn_1_4 conn_1_8 conn_1-9 conn_4_2 conn_3_6 conn_3_7 conn_7_10 -------- -------- -------- -------- -------- -------- --------- 1 1 0 0 1 0 1
If the value is 1 then the nodes are connected, if 0 they are not connected.
- 1 is connected to 4
- 1 is connected to 8
- 1 is not connected to 9
- 4 is not connected to 2
- 3 is connected to 6
- 3 is not connected to 7
- 7 is connected to 10
We can also use this function to find all nodes that are connected. So if we want to find out which nodes are connected to 1 we can run the following:
-- this will list nodes connected to 1 SELECT * FROM dbo.QFListNodesPath(1)
This returns the following:
id -- 1 4 5 8
If we want to disconnect a node we can run the following. This will disconnect the node from all other nodes.
-- this will disconnect node 1 from all other nodes exec QFDisconnect 1
After running this we can check the nodes for node 1 again.
-- this will list nodes connected to 1 SELECT * FROM dbo.QFListNodesPath(1)
This returns this result:
id -- 1
If we check for node 4 connections:
-- this will list nodes connected to 4 SELECT * FROM dbo.QFListNodesPath(4)
The above returns these results:
id -- 4 5 8
Next Steps
- Compile and execute the procedures and functions on your SQL Server database environment if your needs require to store data using this approach.
- The time complexity of the QF algorithm for N operation is considered to be bad O(N*N) for n union operations. In order to improve it, you can index the id column of the table and use the index in your operations. That will make the time complexity of the operation for N operations O(N*LOG(N)) when you use this code.
- The code was tested on Microsoft SQL Server 2014 and 2016.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips