Stored Procedure in SQL Server with R Code


By:   |   Updated: 2020-11-24   |   Comments   |   Related: More > R Language


Problem

I have a basic knowledge of R, is it possible to create custom SQL Server stored procedures using the R language?

Solution

In this article we will explore how to create custom SQL Server stored procedures using R. An R stored procedure is basically an R function that can be used inside SQL Server. In this tip we will use RStudio to create and test our R code and deploy it to SQL Server as a stored procedure.

Stored Procedure in SQL Server with R Code

In order to create a stored procedure inside RStudio, we need to install and use the sqlutils R package. This package contains all the necessary commands and functions to deploy a stored procedure to SQL Server. As usual, the R help function can help display sqlrutils package documentation.

# Install Package
install.packages("sqlrutils")
# Show available Documentation
help(package="sqlrutils")
wraps r code into executable sql server stored procedures

In this tip I will show how to create an R stored procedure using an example. The idea is to create a procedure that takes as input the name of an Excel file and return its contents and the number of rows loaded.

As an example, I have created an Excel spreadsheet ('D:\Test\SalesOrderID.xlsx') that contains the first 1000 rows of table SalesOrderDetail in the AdventurWorks2014 database.

/****** Script for SelectTopNRows command from SSMS  ******/
SELECT TOP(1000) [SalesOrderID]
      ,[SalesOrderDetailID]
      ,[CarrierTrackingNumber]
      ,[OrderQty]
      ,[ProductID]
      ,[SpecialOfferID]
      ,[UnitPrice]
      ,[UnitPriceDiscount]
      ,[LineTotal]
      ,[rowguid]
      ,[ModifiedDate]
  FROM [AdventureWorks2014].[Sales].[SalesOrderDetail] 

The query output has been saved into an Excel spreadsheet named SalesOrderID.xlsx.

first 1000 rows of table SalesOrderID

Step 1: Create the R function to be used in our stored procedure to read Excel files. The myReadXcelFunc function will do:

  1. Takes the file name as input parameter
  2. Read and transform the entire file into a data frame
  3. Counts the number of records loaded
  4. Generate and output list that contains the dataset and the number of records
myReadXcelFunc <- function (srcFileName) 
 {
   library(readxl)
   # Read file into data frame
   myFile <- read_excel(srcFileName)
   # Count the number of record loaded
   numRows <- nrow(myFile)
   myFileDF <- data.frame(myFile)
   retList <- list(xlsxFile = myFileDF, xlsxRows = numRows)
   return(retList)
 }
 
# Test our function 
srcFileName = 'D:/Test/SalesOrderID.xlsx'
 x <- myReadXcelFunc(srcFileName)
 View(x)  

The R View() function shows that our dataset has 1000 rows and 11 columns and that the number of rows imported is equal to 1000. This step was necessary to verify that our function works.

contents of file

Step 2: Define the input/output parameters and create the stored procedure definition. At this point we need to mention that we can have multiple input and output parameters, but we can only have one output data set.

In order to work with our function, we need to pass the file name as an input parameter, the row count as an output parameter and the data set that contains the values read for the input Excel files.

Once the parameters have been created, we just need to link our new stored procedure R_myReadXcelProc with the R Function myReadXcelFunc to its parameters.

 inParam <- InputParameter("srcFileName", "character")
outParam <- OutputParameter("xlsxRows", "numeric")
outputData <- OutputData("xlsxFile")
 
myReadXcelProc <- StoredProcedure(myReadXcelFunc,
                             "R_myReadXcelProc",
                             inParam, 
                             outputData,
                             outParam,
                             filePath = "C:\\Temp"
 )
r commands

Step 3: Deploy our function into a stored procedure. This step is straight forward, we need to define our SQL Server connection string and use R command registerStoredProcedure to deploy our procedure to SQL Server.

conStr <- paste0("Driver={ODBC Driver 13 for SQL Server};Server=xxxxxxxxx;",
                  "Database=AdventureWorks2014;uid=xxxxxxxxx;pwd=xxxxxxxxx;")
 
registerStoredProcedure(myReadXcelProc, conStr)
r commands

Step 4: Below is the full R Script.

myReadXcelFunc <- function (srcFileName) 
 {
   library(readxl)
   
   myFile <- read_excel(srcFileName)
   numRows <- nrow(myFile)
   myFileDF <- data.frame(myFile)
   retList <- list(xlsxFile = myFileDF, xlsxRows = numRows)
   return(retList)
 }
 
 inParam <- InputParameter("srcFileName", "character")
 outParam <- OutputParameter("xlsxRows", "numeric")
 outputData <- OutputData("xlsxFile")
 
 myReadXcelProc <- StoredProcedure(myReadXcelFunc,
                             "R_myReadXcelProc",
                             inParam, 
                             outputData,
                             outParam,
                             filePath = "C:\\Temp"
 )
  
 conStr <- paste0("Driver={ODBC Driver 13 for SQL Server};Server=xxxxxxxxxx;",
                  "Database=AdventureWorks2014;uid=xxxxxxxx;pwd=xxxxxxxxxx;")
 
 registerStoredProcedure(myReadXcelProc, conStr)
r commands

Step 5: Let's check if our new procedure has been deployed to our SQL Server.

ssms object explorer

Step 6: Review the stored procedure code. We can right click on it in the list above and select the appropriate option to display the code as shown below.

USE [AdventureWorks2014]
GO
 
/****** Object: StoredProcedure [dbo].[R_myReadXcelProc]   ******/
SET ANSI_NULLS ON
GO
 
SET QUOTED_IDENTIFIER ON
GO
 
 
CREATE PROCEDURE [dbo].[R_myReadXcelProc]
  @parallel_outer bit = 0,
  @srcFileName_outer varchar(max),
  @xlsxRows_outer float output
AS
  BEGIN TRY
    exec sp_execute_external_script
     @language = N'R',
      @script = N'
       myReadXcelFunc <- function (srcFileName)
       {
           library(readxl)
           myFile <- read_xlsx(srcFileName)
           numRows <- nrow(myFile)
           myFileDF <- data.frame(myFile)
           retList <- list(xlsxFile = myFileDF, xlsxRows = numRows)
           return(retList)
       }
       result <- myReadXcelFunc(srcFileName = srcFileName)
       if (is.list(result)) {
         OutputDataSet <- result$xlsxFile
         xlsxRows <- result$xlsxRows
       } else stop("the R function must return a list")
     ',
      @parallel = @parallel_outer,
      @params = N'@srcFileName varchar(max), @xlsxRows float output',
      @srcFileName = @srcFileName_outer,
      @xlsxRows = @xlsxRows_outer output
  END TRY
  BEGIN CATCH
    THROW;
  END CATCH;
GO

Step 7: Test the new stored procedure.

DECLARE @out float;
EXEC R_myReadXcelProc @parallel_outer = 0,
                 @srcFileName_outer = 'd:/test/SalesOrderID.xlsx',
                 @xlsxRows_outer = @out OUT;
SELECT @out;
query output

Step 8: We have verified that the stored procedure works and returns the contents of the Excel file. Now look at the parameters that were deployed by the sqlutils package.

r commands

@parallel_outer indicates if we want to execute the R script in parallel or not. The default value is 0 for no parallel execution. The other possible value is 1 that means parallel execution.

_outer - Not sure if you noticed, the script that deployed our store procedure attached the word "_outer" to our input parameter. This looks like a naming convention to separate the parameter used by sp_execute_external_script from the parameters for the outer procedure.

Conclusion

In this tip we have learned how to create and use an R stored procedure in SQL Server. We have learned how to create the stored procedure in RStudio and how to deploy it to SQL Server. In my opinion, we created a very powerful procedure capable of reading Excel files without leaving the SQL Server Management Studio (SSMS) environment.

Next Steps


Last Updated: 2020-11-24


get scripts

next tip button



About the author
MSSQLTips author Matteo Lorini Matteo Lorini is a DBA and has been working in IT since 1993. He specializes in SQL Server and also has knowledge of MySQL.

View all my tips



Comments For This Article





download





Recommended Reading

Import Data from PDF to SQL Server

Using Simple Linear Regression to Make Predictions

sqldf in R Example for SQL Server

Index Numbers Explained with Examples in R

Exploring Four Simple Time Series Forecasting Methods with R Examples














get free sql tips
agree to terms