Learn more about SQL Server tools

mssqltips logo
 

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Whitepapers          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories

 

SQL Server Min Max Column Normalization for Data Mining


By:   |   Read Comments   |   Related Tips: More > T-SQL

Attend our free MSSQLTips Webcast - How to Simplify Routine SQL Server Administration Tasks


Problem

In data mining, we often need to perform min-max normalization on numeric data type columns to prevent one column from skewing or dominating the models produced by the machine learning algorithms.

Solution

In this tip, we will demonstrate how to use T-SQL to perform a min-max normalization from 0.0 to 1.0 inclusive, which can be represented as [0.0, 1.0]. The normalized value equals (the original column value minus the minimum column value) divided by the (maximum column value minus the minimum column value).

We will create a table that contains two data columns, AttributeX and AttributeY.

IF EXISTS (SELECT name FROM sys.tables WHERE name = N'tblExample')
  DROP TABLE dbo.tblExample
go
create table dbo.tblExample
(
  pKey integer identity(1,1) PRIMARY KEY,
  AttributeX decimal(7,3),
  AttributeY decimal(7,3)
)

Next, we will populate this table with 27 rows of data. AttributeY will be 10 times the value of AttributeX.

insert into dbo.tblExample values (24, 240)
insert into dbo.tblExample values (26, 260)
insert into dbo.tblExample values (27, 270)
insert into dbo.tblExample values (27, 270)
insert into dbo.tblExample values (30, 300)
insert into dbo.tblExample values (31, 310)
insert into dbo.tblExample values (31, 310)
insert into dbo.tblExample values (32, 320)
insert into dbo.tblExample values (33, 330)
insert into dbo.tblExample values (33, 330)
insert into dbo.tblExample values (36, 360)
insert into dbo.tblExample values (36, 360)
insert into dbo.tblExample values (36, 360)
insert into dbo.tblExample values (36, 360)
insert into dbo.tblExample values (41, 410)
insert into dbo.tblExample values (44, 440)
insert into dbo.tblExample values (44, 440)
insert into dbo.tblExample values (46, 460)
insert into dbo.tblExample values (46, 460)
insert into dbo.tblExample values (46, 460)
insert into dbo.tblExample values (46, 460)
insert into dbo.tblExample values (47, 470)
insert into dbo.tblExample values (51, 510)
insert into dbo.tblExample values (56, 560)
insert into dbo.tblExample values (57, 570)
insert into dbo.tblExample values (63, 630)
insert into dbo.tblExample values (81, 810)

After inserting the rows, we will select the row count and the minimum and maximum values from our data columns.

select 
count(*) as RowQuantity,
min(AttributeX) as MinimumX,
max(AttributeX) as MaximumX,
min(AttributeY) as MinimumY,
max(AttributeY) as MaximumY
from dbo.tblExample


Data statistics

T-SQL Code for Min Max Normalization in SQL Server

Now we will use the T-SQL code below to perform min-max normalizations on AttributeX and AttributeY.

--Declare the variables
declare @minX as decimal(7,3)
declare @maxX as decimal(7,3)
declare @minY as decimal(7,3)
declare @maxY as decimal(7,3)

--Set the variables
set @minX=(select min(AttributeX) from dbo.tblExample)
set @maxX=(select max(AttributeX) from dbo.tblExample)
set @minY=(select min(AttributeY) from dbo.tblExample)
set @maxY=(select max(AttributeY) from dbo.tblExample)

--Perform the normalization
select 
AttributeX, 
(AttributeX - @minX)/(@maxX - @minX) as NormalizedX,
AttributeY, 
(AttributeY - @minY)/(@maxY - @minY) as NormalizedY 
from dbo.tblExample

The results from the previous query are shown below. Notice how the normalized values for AttributeX and AttributeY are equal on each row, while the original values differ.

Top 10 rows from device B
Next Steps

Normalized values can be inserted into a table and then read into data mining algorithms. You can read more about SQL Server data mining in the following tips from MSSQLTips.com.



Last Update:


signup button

next tip button



About the author
MSSQLTips author Dallas Snider Dr. Dallas Snider is an Assistant Professor in the Computer Science Department at the University of West Florida and has 18+ years of SQL experience.

View all my tips





Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Notify for updates 


SQL tips:

*Enter Code refresh code     



Learn more about SQL Server tools