Learn more about SQL Server tools

mssqltips logo
giveaway
 

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Whitepapers          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories

 

Min-Max Normalization in T-SQL when the Boundaries are not [0,1]


By:   |   Read Comments   |   Related Tips: 1 | 2 | More > T-SQL

Problem

I'm tracking weekly quantities of product sales and I would like to create a normalized score of product sales each week from 1 to 100. How can I do this in T-SQL?

Solution

This tip is in response to a reader of MSSQLTips.com. I published a tip on min-max normalization when the boundaries are from zero to one inclusive, which is also written as [0,1]. To perform a min-max normalization when the boundaries are not [0,1], the formula has more components, but is not much more complicated as shown below.

normalized_score = ((OriginalDataValue-MinValue)/(MaxValue-MinValue))*(NormalizedMaxValue-NormalizedMinValue)+1

We will apply the formula to every row in a column. Also, further complicating this problem is the need to create a new set of normalized scores each week, which we will address below.

Let's begin by creating a table and inserting some example rows. (I know this is not third normal form and I should use a product ID instead of product name, but it makes the tip easier to read.)

create table productSales
(
  week_num  int,
  product_name  varchar(5),
  sales float,  
  normalized_score float
)
insert into productSales values (1, 'pears',  5.1, 0)
insert into productSales values (1, 'beans', 10.8, 0)
insert into productSales values (1, 'flour', 15.4, 0)
insert into productSales values (1, 'plums', 25.3, 0)
insert into productSales values (2, 'pears',  8.9, 0)
insert into productSales values (2, 'beans', 12.1, 0)
insert into productSales values (2, 'flour', 21.0, 0)
insert into productSales values (2, 'plums', 16.4, 0)
insert into productSales values (3, 'pears', 10.5, 0)
insert into productSales values (3, 'beans', 23.9, 0)
insert into productSales values (3, 'flour', 48.2, 0)
insert into productSales values (3, 'plums',  9.1, 0)

As you can see above, the sales of each product increases each week except for the plums, which decrease each week.

We will need to calculate the min and max for each week, which is accomplished by the following code and is shown in the screen snip below. Similar to other programming languages, it is always a good idea to test smaller segments of T-SQL to make testing and debugging easier.

select week_num, min(sales) as MinSales, max(sales) as MaxSales 
from productSales
group by week_num 
order by week_num


Test of min and max calculations in T-SQL

With the above code working correctly, we are now ready to perform our normalized score calculations. We will use an inner join to the min and max values per week in a nested subquery.

select productSales.week_num, 
productSales.sales, 
productSales.product_name,
salesStats.MinSales, 
salesStats.MaxSales,
((productSales.sales-MinSales)/(MaxSales-MinSales))*(100-1)+1 as normalized_score
from productSales
inner join 
(
   select week_num, min(sales) as MinSales,
   max(sales) as MaxSales 
   from productSales
   group by week_num
) as salesStats on productSales.week_num=salesStats.week_num
order by productSales.week_num asc, normalized_score desc

The results from the SELECT statement are in the next image. Notice how each week has normalized values of 1 and 100 that represent the weekly min and max sales, respectively.

The results from the SELECT statement

In the T-SQL code below, we will execute an UPDATE statement to update the value in the normalized_score column.

update productSales
set normalized_score = ((productSales.sales-MinSales)/(MaxSales-MinSales))*(100-1)+1 
from productSales
inner join 
(
   select week_num, min(sales) as MinSales,
   max(sales) as MaxSales 
   from productSales
   group by week_num
) as salesStats on productSales.week_num=salesStats.week_num

The screen snip below shows the results from the SELECT statement on our updated table.

Selecting from the updated table
Next Steps

Always make sure to test your code so you can be sure that your calculations are correct. Finally, please check out these other tips and tutorials on T-SQL on MSSQLTips.com.



Last Update:






About the author
MSSQLTips author Dallas Snider Dr. Dallas Snider is an Assistant Professor in the Computer Science Department at the University of West Florida and has 18+ years of SQL experience.

View all my tips





Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Notify for updates 


SQL tips:

*Enter Code refresh code     



Learn more about SQL Server tools