R Functions and Operators in SQL Server


By:   |   Updated: 2020-09-23   |   Comments (3)   |   Related: More > R Language


Problem

I have a basic knowledge of R, but what are the main operators and functions that I need to be familiar with to start R development?

Solution

R exposes a range of powerful functions. In this article, we will examine a few of the main and most used functions.

Help for R Functions

The Help function is the most common and important function in R. For example, if we are unsure what R function str does we can execute the following code to display the online manual page of str.

# Help
? str
R help screen

Basic R Operators and Assignments

R has several assignments and operators like: %in%, match, =, <-, <<-, $, [, [[, head, tail, subset, with, assign and get.

In this section; we will focus on two (%in% and match) that I think are less intuitive.

The %in% operator in R, is used to identify if an element belongs to a vector. In our example, element a1 belongs to vector t, %in% operator will return TRUE, and it will return FALSE on a2 because a2 does not belong to t.

a1 <- 13
a2 <- 76
t <- c(1,2,3,4,5,6,7,8,13,20,77)
print(a1 %in% t)
print(a2 %in% t)
query results

Example using R in SQL Server Management Studio:

DECLARE @rscript NVARCHAR(MAX);
  SET @rscript = N'
    # R %in% operator
   a1 <- 13
   a2 <- 76
   t <- c(1,2,3,4,5,6,7,8,13,20,77)
   print(a1 %in% t)
   print(a2 %in% t)
';
  EXEC sp_execute_external_script
    @language = N'R',
    @script = @rscript;  
  GO
query results

The match() function returns a vector of the position of first occurrence of the vector1 in vector2. If the element of the vector1 does not exist in vector2, NA is returned. Let's see it with an example:

# Return position of 55
print(match(55, c(51,52,59,55,53,56,57,54,55)))
 
# Return the position of each element ib v1 that are in v2
 v1 <- c("M1","S2","Q1","L2")
 v2 <- c("g1","L2","d2","e2","f1","M1","c2","S2","a2")
 x <- match(v1,v2)
 print(x)
query results

Example using R in SQL Server Management Studio:

DECLARE @rscript NVARCHAR(MAX);
  SET @rscript = N'
   # Return position of element 55
   print(match(55, c(51,52,59,55,53,56,57,54,55)))
 
   # Return the position of each element in v1 that are in v2
    v1 <- c("M1","S2","Q1","L2")
    v2 <- c("g1","L2","d2","e2","f1","M1","c2","S2","a2")
    x <- match(v1,v2)
    print(x)
';
  EXEC sp_execute_external_script
    @language = N'R',
    @script = @rscript;  
  GO
query results

R Comparison functions

R has several comparison operators like: all.equal, identical, !=, ==, >, >=, <, <=, is.na, complete.cases and is.finite.

In this section we will examine the following: all.equal, identical, is.na, complete.cases and is.finite.

 # Identical and all.equal
 t1 <- c(1,2,3,4,5,6,7,8,13,20,77)
 t2 <- c(1,2,3,4,5,6,7,8,13,20,77)
 print(all.equal(t1, t2))
 print(identical(t1,t2))
 
 t1 <- c(1,2,3,4,5,6,7,8,13,20,77)
 t2 <- c(1,2,4,3,5,6,7,8,13,20,77)
 print(all.equal(t1, t2))
 print(identical(t1,t2))
query results

Example using R in SQL Server Management Studio.

DECLARE @rscript NVARCHAR(MAX);
  SET @rscript = N'
     # Identical and all.equal
    t1 <- c(1,2,3,4,5,6,7,8,13,20,77)
    t2 <- c(1,2,3,4,5,6,7,8,13,20,77)
    print(all.equal(t1, t2))
    print(identical(t1,t2))
 
    t1 <- c(1,2,3,4,5,6,7,8,13,20,77)
    t2 <- c(1,2,4,3,5,6,7,8,13,20,77)
    print(all.equal(t1, t2))
    print(identical(t1,t2))
';
  EXEC sp_execute_external_script
    @language = N'R',
    @script = @rscript;  
  GO
query results

The complete.cases function is often used to identify complete rows of a data frame. In the following example we will show how to use complete.cases and is.na functions.

# Create a data frame with incomplete data
 data <- data.frame(x1 = c(7, 2, 1, NA, 9), 
                    x2 = c(1, 3, 1, 9, NA),
                    x3 = c(NA, 8, 8, NA, 5))
 # This is how our example data looks like
 print(data) 
 # This is the return values of complete.cases function
 print(complete.cases(data))
 # substitute NA values with -1 value
 data[is.na(data)] <- -1
 # This is how our example data looks like
 print(data)
 # This is the return values of complete.cases function
 print(complete.cases(data))
query results

Example using R in SQL Server Management Studio.

DECLARE @rscript NVARCHAR(MAX);
  SET @rscript = N'
    # Create a data frame with incomplete data
    data <- data.frame(x1 = c(7, 2, 1, NA, 9), 
                  x2 = c(1, 3, 1, 9, NA),
                  x3 = c(NA, 8, 8, NA, 5))
    # This is how our example data looks like
    print(data) 
    # This is the return values of complete.cases function
    print(complete.cases(data))
    # substitue NA values with -1 value
    data[is.na(data)] <- -1
    # This is how our example data looks like
    print(data)
    # This is the return values of complete.cases function
    print(complete.cases(data))
';
  EXEC sp_execute_external_script
    @language = N'R',
    @script = @rscript;  
  GO
query results

is.finite and is.infinite return a vector of the same length indicating which elements are finite and which one are not.

t1 <- c(1,2,3)
 is.finite(t1)
 
t2 <- c(1,NA,3)
 is.finite(t2)
query results

Example using R in SQL Server Management Studio:

DECLARE @rscript NVARCHAR(MAX);
  SET @rscript = N'
     t1 <- c(1,2,3)
     print(t1)
     print(is.finite(t1))
 
     t2 <- c(1,NA,3)
     print(t2)
     print(is.finite(t2))
';
  EXEC sp_execute_external_script
    @language = N'R',
    @script = @rscript;  
  GO
query results

Basic R math operators

R has many math operators, for example the most common are: *, +, -, /, ^, %%, %/%, abs, sign, cacos, asin, atan, atan2, csin, cos, tan, ceiling, floor, round, trunc, signif, exp, log, log10, log2, sqrt, max, min, prod, sum, cummax, cummin, cumprod, cumsum, diff, pmax, pmin, crange, cmean, median, cor, sd, var rle.

The goal of this article is not to teach all of the operators, but to show how we can understand and work with them. For example, suppose that we are not familiar with rle operator. First of all, we can execute the help on such operator:

? rle
r help rle

Let's see how it works with an example:

  x <- c(1,2,2,2,5,6,2,8,8,10,10,10)
  print(x)
  print(rle(x))
query results

Example using R in SQL Server Management Studio:

DECLARE @rscript NVARCHAR(MAX);
  SET @rscript = N'
     x <- c(1,2,2,2,5,6,2,8,8,10,10,10)
     print(x)
     print(rle(x))
';
  EXEC sp_execute_external_script
    @language = N'R',
    @script = @rscript;  
  GO
query results

R Logical and Sets operators

R has several comparison operators like: &, |, !, xor, all, any, intersect, union, setdiff, setequal and which.

In this section we will examine the following: setdiff, setequal and which.

The setdiff function operates row-wise on lists and dataframes. The elements of setdiff(x,y) returns elements in X but not in Y. Let's see it with an example:

# Create two sample lists
x <- c(1,2,2,2,5,6,2,8,8,10)
y <- c(1,2,2,3,5,6,2,8,8,0,9)
# Compare X to Y and print any differences
print(setdiff(x,y))
# Compare Y to X and print any differences
print(setdiff(y,x))
query results

Example using R in SQL Server Management Studio:

DECLARE @rscript NVARCHAR(MAX);
  SET @rscript = N'
   # Create two sample lists
   x <- c(1,2,2,2,5,6,2,8,8,10)
   y <- c(1,2,2,3,5,6,2,8,8,0,9)
   # Compare X to Y and print any differences
   print(setdiff(x,y))
   # Compare Y to X and print any differences
   print(setdiff(y,x))
';
  EXEC sp_execute_external_script
    @language = N'R',
    @script = @rscript;  
  GO
query results

Setequal(x,y) compares if the elements in X are the same of the element of y. Element order does not matter.

# Create two sample lists
x <- c(1,2,2,2,5,6,2,8,8,10)
y <- c(1,2,2,3,5,6,2,8,8,0,9)
# Check if they are equal
print(setequal(x,y))
x1 <- c(1,9,2,2,2,5,6,2,8,8,10)
y1 <- c(10,2,2,2,5,6,2,8,9,8,1)
# Check if they are equal
print(setequal(x1,y1))
query results

Example using R in SQL Server Management Studio:

DECLARE @rscript NVARCHAR(MAX);
  SET @rscript = N'
   # Create two sample lists
   x <- c(1,2,2,2,5,6,2,8,8,10)
   y <- c(1,2,2,3,5,6,2,8,8,0,9)
   # Check if they are equal
   print(setequal(x,y))
   x1 <- c(1,9,2,2,2,5,6,2,8,8,10)
   y1 <- c(10,2,2,2,5,6,2,8,9,8,1)
   # Check if they are equal
   print(setequal(x1,y1))
';
  EXEC sp_execute_external_script
    @language = N'R',
    @script = @rscript;  
  GO
query results

Which indicates the elements in a list, matrix or data frame are true. Let's see it with an example.

# Create a numeric matrix
( m <- matrix(1:12, 3, 4) )
# Select all teh value that are divisible by 5
div.5 <- m %% 5 == 0
which(div.5)
# Display the matrix coordinates of the values divisible by 5
which(div.5, arr.ind = TRUE)
query results

Example using R in SQL Server Management Studio:

DECLARE @rscript NVARCHAR(MAX);
  SET @rscript = N'
   # Create a numeric matrix
   ( m <- matrix(1:12, 3, 4) )
   print(m)
   # Select all teh value that are divisible by 5
   div.5 <- m %% 5 == 0
   print("Elements that are divisible by 5")
   print(which(div.5))
   # Display the matrix coordinates of the values divisible by 5
   print("Coordinate of Elements that are divisible by 5")
   print(which(div.5, arr.ind = TRUE))
';
  EXEC sp_execute_external_script
    @language = N'R',
    @script = @rscript;  
  GO 
query results

Conclusion

In this tip we have learned a few of the many R operators and functions. In the next tip we will see how to work with functions and how to create new functions.

Next Steps


Last Updated: 2020-09-23


get scripts

next tip button



About the author
MSSQLTips author Matteo Lorini Matteo Lorini is a DBA and has been working in IT since 1993. He specializes in SQL Server and also has knowledge of MySQL.

View all my tips



Comments For This Article




Thursday, September 24, 2020 - 10:43:04 AM - Cody Custis Back To Top (86529)
One cool point of order. I installed RStudio for some Power BI work. Without doing anything to configure in SQL Server Management Studio, it automatically recognized the R installation; I was able to run the example code with direct cut and paste.

Wednesday, September 23, 2020 - 9:04:11 AM - Matteo Lorini Back To Top (86523)
R language is more for Data Science, DBA task can be better accomplished using Powershell. R integration with SQL can help to create functionalities that are not present in SQL.

Thanks

Wednesday, September 23, 2020 - 4:34:42 AM - Eric Back To Top (86522)
Hi,
Nice article.
As a DBA how can the R language help us?


download





Recommended Reading

Import Data from PDF to SQL Server

Using Simple Linear Regression to Make Predictions

Data Structures in R including Vector, Matrix, Array, List, and Data Frame

sqldf in R Example for SQL Server

Exploring Four Simple Time Series Forecasting Methods with R Examples














get free sql tips
agree to terms