Problem
Ordering points to identify the clustering structure (OPTICS) is an algorithm for finding density-based clusters very similar do DBSCAN. However, OPTICS handles it more effectively in the case of a cluster with varying densities gaining deeper insights exploring the hierarchical structure of your data. This algorithm is generally more computationally intensive.
Is it possible to have the OPTICS clustering algorithm implemented in SQL Server without using an external solution?
Solution
Optics Clustering Algorithm
The OPTICS clustering algorithm is useful in geospatial data and maps, market and customer segmentation, anomaly detection, image and pattern analysis, biology, medicine, general exploratory analysis, and more. It adapts to different cluster shapes and densities, which happens a lot in real-world data.
In a dataset of points, the OPTICS algorithm walks through these points, one by one, trying to keep track of how close they are to each other, saving this number called reachability distance. This represents how easily you can reach that point from another cluster, creating an ordering of points. Those clusters will appear as a valley where low reachability means inside a cluster and high reachability means gaps between clusters. Later you can decide where to cut the valleys to define the clusters. From the ordering points you can extract clusters of different densities, without fixing the cluster size in advance.
It is necessary to supply two parameters. The first one is the epsilon which is the maximum distance to consider and the minimum points required to form a cluster inside this radius.
Create SQL Server Objects
First we will create some tables, views, functions, and stored procedures in SQL Server for our solution.
Table
We need just one table to hold our data and calculations.
-- MSSQLTips (TSQL)
CREATE TABLE [dbo].[OpticsData](
[PointId] [int] IDENTITY(1,1) NOT NULL,
[X] [float] NOT NULL,
[Y] [float] NOT NULL,
[ClusterId] [int] NULL,
[Ordering] [int] NULL,
[CoreDist] [float] NULL,
[ReachDist] [float] NULL,
[IsSeed] [bit] DEFAULT 0,
[IsProcessed] [bit] DEFAULT 0,
[SpatialPoint] AS ([geometry]::Point([x],[y],(0))),
CONSTRAINT [PK_OpticsData] PRIMARY KEY CLUSTERED
(
[X] ASC,
[Y] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GOView
To show only the main columns.
-- MSSQLTips (TSQL)
CREATE OR ALTER VIEW [dbo].[vOpticsData]
AS
SELECT PointId, Ordering, ClusterId, CoreDist, ReachDist, X, Y
FROM dbo.OpticsData
GOUser defined functions
This function will calculate the distances using the Euclidean method.
-- =============================================
-- Author: SCP - MSSQLTips
-- Create date: 20280828
-- Description: Optics Euclidean distance
-- =============================================
CREATE OR ALTER FUNCTION [dbo].[ufnOpticsEuclideanDist]
(@PointId1 int
,@PointId2 int)
RETURNS float
AS
BEGIN
DECLARE @Dist float
,@x1 float
,@y1 float;
SELECT @x1 = [X]
,@y1 = [Y]
FROM [dbo].[OpticsData]
WHERE [PointId] IN (@PointId1);
SELECT @Dist = SQRT(SQUARE(@x1 - [X]) + SQUARE(@y1 - [Y]))
FROM [dbo].[OpticsData]
WHERE [PointId] IN (@PointId2);
RETURN @Dist;
END
GOThis function will calculate the distances using the Manhattan method.
-- =============================================
-- Author: SCP - MSSQLTips
-- Create date: 20280828
-- Description: Optics Euclidean distance
-- =============================================
CREATE OR ALTER FUNCTION [dbo].[ufnOpticsManhattanDist]
(@PointId1 int
,@PointId2 int)
RETURNS float
AS
BEGIN
DECLARE @Dist float
,@x1 float
,@y1 float;
SELECT @x1 = [X]
,@y1 = [Y]
FROM [dbo].[OpticsData]
WHERE [PointId] IN (@PointId1);
SELECT @Dist = (ABS(@x1 - [X]) + ABS(@y1 - [Y]))
FROM [dbo].[OpticsData]
WHERE [PointId] IN (@PointId2);
RETURN @Dist;
END
GOStored procedures
The first store procedure is for the OPTICS algorithm. I used the value of 1E50 as infinitum.
-- =============================================
-- Author: SCP - MSSQLTips
-- Create date: 20250609
-- Description: OPTICS clustering
-- =============================================
CREATE OR ALTER PROCEDURE [dbo].[uspOpticsClustering]
(@Epsilon decimal(18,6)
,@MinPts int
,@Threshold float
,@E_Or_M_Calc char(1) = 'E')
WITH EXECUTE AS CALLER
AS
BEGIN
SET NOCOUNT ON;
IF (SELECT COUNT(*) FROM [dbo].[OpticsData]) < 1
RETURN;
BEGIN TRY
-- E_Or_M_Calc: (E) for Euclidian and (M) for Manhattan distances
DECLARE @i int
,@j int
,@cd float
,@rd float
,@ordering int = 0
,@clusterId int = 0;
UPDATE [dbo].[OpticsData]
SET [ClusterId] = NULL
,[CoreDist] = NULL
,[ReachDist] = 1E50
,[IsProcessed] = 0
,[IsSeed] = 0;
-- Core distance ==========================================================
DECLARE @Neighbors
TABLE (p1 int NOT NULL
,p2 int NOT NULL
,dist float
,pos int
,PRIMARY KEY (p1, p2));
IF @E_Or_M_Calc = 'E'
INSERT INTO @Neighbors
SELECT O1.PointId
,O2.PointId
,[dbo].[ufnOpticsEuclideanDist] (O1.PointId, O2.PointId)
,ROW_NUMBER() OVER(PARTITION BY O1.PointId ORDER BY [dbo].[ufnOpticsManhattanDist] (O1.PointId, O2.PointId), O2.PointId)
FROM [dbo].[OpticsData] O1 CROSS JOIN
[dbo].[OpticsData] O2
WHERE O1.PointId <> O2.PointId AND
[dbo].[ufnOpticsEuclideanDist] (O1.PointId, O2.PointId) <= @epsilon;
ELSE
INSERT INTO @Neighbors
SELECT O1.PointId
,O2.PointId
,[dbo].[ufnOpticsManhattanDist] (O1.PointId, O2.PointId)
,ROW_NUMBER() OVER(PARTITION BY O1.PointId ORDER BY [dbo].[ufnOpticsManhattanDist] (O1.PointId, O2.PointId), O2.PointId)
FROM [dbo].[OpticsData] O1 CROSS JOIN
[dbo].[OpticsData] O2
WHERE O1.PointId <> O2.PointId AND
[dbo].[ufnOpticsManhattanDist] (O1.PointId, O2.PointId) <= @epsilon;
UPDATE [dbo].[OpticsData]
SET [CoreDist] = dist
FROM @Neighbors N
WHERE [PointId] = N.p1 AND
N.pos = 1;
-- Reachability distance ==================================================
WHILE EXISTS
(SELECT 1
FROM [dbo].[OpticsData]
WHERE [IsProcessed] = 0) BEGIN
UPDATE [dbo].[OpticsData]
SET [IsSeed] = 0;
SELECT TOP 1 @i = [PointId]
,@cd = [CoreDist]
FROM [dbo].[OpticsData]
WHERE [IsProcessed] = 0
ORDER BY [PointId];
UPDATE [dbo].[OpticsData]
SET [IsProcessed] = 1
,[Ordering] = @ordering
WHERE [PointId] = @i;
SET @ordering += 1;
WITH cteReach AS
(SELECT p2
,CASE WHEN @cd > dist
THEN @cd
ELSE dist
END AS NewReach
FROM [dbo].[OpticsData] LEFT OUTER JOIN
@Neighbors ON
[PointId] = p1
WHERE p1 = @i)
UPDATE [dbo].[OpticsData]
SET [ReachDist] = NewReach
,[IsSeed] = 1
FROM cteReach
WHERE [ReachDist] > NewReach AND
[PointId] = p2 AND
[IsProcessed] = 0;
WHILE EXISTS
(SELECT 1
FROM [dbo].[OpticsData]
WHERE [IsSeed] = 1) BEGIN
SELECT TOP 1 @j = [PointId]
,@cd = [CoreDist]
FROM [dbo].[OpticsData]
WHERE [IsProcessed] = 0
ORDER BY [ReachDist];
WITH cteReach AS
(SELECT p2
,CASE WHEN @cd > dist
THEN @cd
ELSE dist
END AS NewReach
FROM [dbo].[OpticsData] LEFT OUTER JOIN
@Neighbors ON
[PointId] = p1
WHERE p1 = @j)
UPDATE [dbo].[OpticsData]
SET [ReachDist] = NewReach
,[IsSeed] = 1
FROM cteReach
WHERE [ReachDist] > NewReach AND
[PointId] = p2 AND
[IsProcessed] = 0;
UPDATE [dbo].[OpticsData]
SET [IsProcessed] = 1
,[IsSeed] = 0
,[Ordering] = @ordering
WHERE [PointId] = @j;
SET @ordering += 1;
END
END;
-- Clustering Id ==========================================================
DECLARE crsCluster CURSOR FOR
SELECT [CoreDist]
,[ReachDist]
FROM [dbo].[OpticsData]
ORDER BY [Ordering]
FOR UPDATE OF [ClusterId];
OPEN crsCluster
FETCH NEXT FROM crsCluster INTO @cd,@rd
WHILE @@FETCH_STATUS = 0 BEGIN
IF @rd = 1E50
SET @clusterId += 1;
IF @cd <= @threshold BEGIN
UPDATE [dbo].[OpticsData]
SET [ClusterId] = @clusterId
WHERE CURRENT OF crsCluster;
END ELSE
SET @clusterId += 1;
FETCH NEXT FROM crsCluster INTO @cd,@rd
END
CLOSE crsCluster
DEALLOCATE crsCluster
-- Results ================================================================
SELECT [PointId]
,[Ordering]
,[ClusterId]
,[CoreDist]
,[ReachDist]
,[X]
,[Y]
FROM [dbo].[vOpticsData]
ORDER BY [Ordering];
END TRY
BEGIN CATCH
IF @@TRANCOUNT > 0
BEGIN
ROLLBACK TRANSACTION;
END
-- Print error information.
PRINT 'Error: ' + CONVERT(varchar(50), ERROR_NUMBER()) +
', Severity: ' + CONVERT(varchar(5), ERROR_SEVERITY()) +
', State: ' + CONVERT(varchar(5), ERROR_STATE()) +
', Procedure: ' + ISNULL(ERROR_PROCEDURE(), '-') +
', Line: ' + CONVERT(varchar(5), ERROR_LINE());
PRINT ERROR_MESSAGE();
END CATCH;
END
GOThe second stored procedure is to visualize the plotted clusters.
-- =============================================
-- Author: SCP - MSSQLTips
-- Create date: 20250904
-- Description: Optics clustering show
-- =============================================
CREATE OR ALTER PROCEDURE [dbo].[uspOpticsDataShow]
(@Scale float = 0.01)
WITH EXECUTE AS CALLER
AS
BEGIN;
BEGIN TRY
DECLARE @OverallExtent geometry
,@ProportionalBuffer float;
SELECT @OverallExtent = geometry::EnvelopeAggregate([SpatialPoint])
FROM [dbo].[OpticsData];
DECLARE @MinX float = @OverallExtent.STPointN(1).STX;
DECLARE @MinY float = @OverallExtent.STPointN(1).STY;
DECLARE @MaxX float = @OverallExtent.STPointN(3).STX;
DECLARE @MaxY float = @OverallExtent.STPointN(3).STY;
DECLARE @ExtentWidth float = @MaxX - @MinX;
DECLARE @ExtentHeight float = @MaxY - @MinY;
DECLARE @MinDimension float = IIF(@ExtentWidth < @ExtentHeight, @ExtentWidth, @ExtentHeight);
DECLARE @Ratio float = @MinDimension * @scale;
-- Returning the clusters
SELECT [ClusterId]
,COUNT(*) AS N
,geometry::CollectionAggregate(SpatialPoint) AS geom
FROM (SELECT ClusterId
,CASE
WHEN ClusterId IS NULL THEN SpatialPoint.STBuffer(@Ratio / 2)
ELSE SpatialPoint.STBuffer(@Ratio * 0.75 )
END AS SpatialPoint
FROM [dbo].[OpticsData]) AS X
GROUP BY [ClusterId];
END TRY
BEGIN CATCH
PRINT 'Error: ' + CONVERT(varchar(50), ERROR_NUMBER()) +
', Severity: ' + CONVERT(varchar(5), ERROR_SEVERITY()) +
', State: ' + CONVERT(varchar(5), ERROR_STATE()) +
', Procedure: ' + ISNULL(ERROR_PROCEDURE(), '-') +
', Line: ' + CONVERT(varchar(5), ERROR_LINE());
PRINT ERROR_MESSAGE();
END CATCH;
END
GOExamples of using the algorithm
Below are three different examples of using the algorithm using the SQL code we created.
Example 1
-- MSSQLTips (TSQL)
TRUNCATE TABLE [dbo].[OpticsData];
INSERT INTO [dbo].[OpticsData]
([X]
,[Y])
VALUES (1.0, 1.0),(1.2, 1.1),(0.8, 0.9),(5.0, 5.0),
(5.2, 5.1),(4.9, 5.2),(10.0, 10.0),(10.1, 9.9),
(9.8, 10.2),(13.0, 13.0);
EXECUTE [dbo].[uspOpticsClustering] 2,2,0.5;
EXECUTE [dbo].[uspOpticsDataShow] 0.04;
GOThe first stored procedure results:

Visualize the plotted clusters:

Example 2
-- MSSQLTips (TSQL)
TRUNCATE TABLE [dbo].[OpticsData];
INSERT INTO [dbo].[OpticsData]
([X]
,[Y])
VALUES ( 1, 1),( 2, 1),( 1, 2),( 2, 2),( 3, 5)
,( 3, 9),( 3,10),( 4,10),( 4,11),( 5,10)
,( 7,10),(10, 9),(10, 6),( 9, 5),(10, 5)
,(11, 5),( 9, 4),(10, 4),(11, 4),(10, 3);
EXECUTE [dbo].[uspOpticsClustering] 1.75,4,1;
EXECUTE [dbo].[uspOpticsDataShow] 0.03;
GOThe first stored procedure results:

The ClusterId does not follow a numeric sequence order, but defines the cluster.
Visualize the plotted clusters:

Example 3 – Happy face
Load to the Optics data table the happy face data using my article Generating shaped-bounded random points in SQL Server, and run
-- MSSQLTips (TSQL)
EXECUTE [dbo].[uspOpticsClustering] 0.2,10,0.1;
EXECUTE [dbo].[uspOpticsDataShow] 0.02;
GOVisualize the plotted clusters:

Next Steps
As you can see Optics does not produce clusters itself and will depends on how you cut the reachability plot.
One improvement that can be done in the code above is to implement the Xi method to steep downward changes in reachability and starts or ends clusters based on those compared slopes of adjacent values.

Sebastião Pereira has over 40 years of experience in database development including T-SQL, algorithm design, machine learning and bringing innovative mathematical formulas to SQL Server. He started his career at a transnational fast-moving consumer goods (FMCG) company as an employee then later transitioning into a consultant role. He eventually founded his own company to develop software solutions for the healthcare industry. Sebastião is a respected award-winning author on MSSQLTips.com extending SQL Server capabilities beyond traditional workloads.
- MSSQLTips Awards
- Author of the Year – 2025
- Trendsetter (25+ tips) – 2025
- Rookie of the Year – 2024


