# Data Structures in R including Vector, Matrix, Array, List, and Data Frame


##### Problem

What types of data structures are available in R, and how do you use them in RStudio and in Microsoft SQL Server?

##### Solution

In this article, we will examine the main R data structures and provide examples of how to use them in both RStudio and SQL Server. The primary types of R data structures are Atomic Vector, Matrix, Array, List, and Data Frame.

## Vectors

The R language provides two types of Vectors: Atomic Vectors and Lists. The main characteristic of an Atomic Vector is that all of its elements must be of the same type, while a List can have elements of different types.
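As a quick illustration of the same-type rule, the sketch below combines values of different types with c() and with list(). The variable names (mixed_vct, num_vct, mixed_lst) are made up for this example; the coercion to a common type is standard R behavior.

```r
# Mixing types in c() coerces every element to one common type
mixed_vct <- c(1L, 2.5, "three")
typeof(mixed_vct)            # "character"

# Logical and integer values are coerced to integer
num_vct <- c(TRUE, 2L)
typeof(num_vct)              # "integer"

# A list keeps each element's own type
mixed_lst <- list(1L, 2.5, "three")
sapply(mixed_lst, typeof)    # "integer" "double" "character"
```

Because of this silent coercion, it can be worth checking typeof() after building a vector from mixed input.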

#### Atomic Vector

The primary types of Atomic vectors are logical, integer, double, and character. Let us see how to define and use them.

Vectors are created using the R command c(), which stands for combine.

The sample code below shows how to create Atomic Vectors. Once a Vector is created, we can directly access a single element of it by its index. For example, the second element of Vector chr_vct is the string "MSSQLTips". We can access it by typing chr_vct[2] or print(chr_vct[2]) in RStudio, and print(chr_vct[2]) if we are using SSMS.

To test whether a Vector is Atomic we can use is.atomic(). We can also use typeof() to identify the type of the Vector, length() to find the number of elements, and attributes() to show any additional metadata.

```
## Example of Atomic Vectors

# Integer
int_vct <- c(1L, 6L, 10L)

# Double
dbl_vct <- c(1, 2.5, 4.5)

# Logical
log_vct <- c(TRUE, FALSE, T, F)

# Character
chr_vct <- c("Hallo", "MSSQLTips", ".com")

# List all Vector elements
chr_vct

# List only 2nd element
chr_vct[2]

# Test if Vector is atomic
is.atomic(chr_vct)

# Test Vector type
typeof(chr_vct)

# Show how many elements are in the vector
length(chr_vct)

# Display vector attributes
attributes(chr_vct)
```

Please note that when we copy the code from RStudio to SSMS, we have to wrap each expression in print() to have the results displayed in the SSMS Messages window.

```
DECLARE @rscript NVARCHAR(MAX);
SET @rscript = N'
## Example of Atomic Vectors
# Integer
int_vct <- c(1L, 6L, 10L)
# Double
dbl_vct <- c(1, 2.5, 4.5)
# Logical
log_vct <- c(TRUE, FALSE, T, F)
# Character
chr_vct <- c("Hallo", "MSSQLTips", ".com")
# List all Vector elements
print(chr_vct)
# List only 2nd element
print(chr_vct[2])
# Test if Vector is atomic
print(is.atomic(chr_vct))
# Test Vector type
print(typeof(chr_vct))
# Show how many elements are in the vector
print(length(chr_vct))
# Display vector attributes
print(attributes(chr_vct))
';
EXEC sp_execute_external_script
@language = N'R',
@script = @rscript;
GO
```

#### List

Lists are different from atomic vectors because their elements can be of any type, including lists. You create a list by using the list() command.
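To make the "including lists" point concrete, here is a minimal sketch of a list that contains another list. The name nested_lst is hypothetical and used only for this illustration.

```r
# A list whose third element is itself a list
nested_lst <- list(1:3, "MSSQLTips", list(TRUE, 2.5))

is.list(nested_lst)          # TRUE
is.atomic(nested_lst)        # FALSE
length(nested_lst)           # 3 (the inner list counts as one element)
typeof(nested_lst[[3]])      # "list"
```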

The example below creates a list with elements of different types. We use the R str() command to see the structure of any R object, lists included.

```
myList <- list(1:5, "MSSQLTips", c(TRUE, FALSE, TRUE), c(3.3, 9.9, 12.2))
str(myList)
```

```
DECLARE @rscript NVARCHAR(MAX);
SET @rscript = N'
myList <- list(1:5, "MSSQLTips", c(TRUE, FALSE, TRUE), c(3.3, 9.9, 12.2))
# str() prints its own output, so print() is not needed
str(myList)
';
EXEC sp_execute_external_script
@language = N'R',
@script = @rscript;
GO
```

The output of the str() command shows that our List is composed of an Atomic Integer Vector, an Atomic Character Vector with one element, an Atomic Logical Vector, and an Atomic Numeric (double) Vector.

We can use the is.list() command to check whether an R object is a list, and we can use the typeof(), length(), and attributes() commands as well.

```
is.list(myList)
typeof(myList)
length(myList)
attributes(myList)
```

```
DECLARE @rscript NVARCHAR(MAX);
SET @rscript = N'
myList <- list(1:5, "MSSQLTips", c(TRUE, FALSE, TRUE), c(3.3, 9.9, 12.2))
print(is.list(myList))
print(typeof(myList))
print(length(myList))
print(attributes(myList))
';
EXEC sp_execute_external_script
@language = N'R',
@script = @rscript;
GO
```

To access the elements of a list we can use the following syntax:

• myList[[2]] - returns the second element of the list, "MSSQLTips"
• myList[[3]] - returns the third element, the Atomic Logical Vector TRUE FALSE TRUE
• myList[[3]][2] - returns the second element of that Logical Vector, the value FALSE
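A minimal sketch of list indexing, assuming the same myList object used throughout this section: single brackets return a sub-list, while double brackets extract the element itself.

```r
myList <- list(1:5, "MSSQLTips", c(TRUE, FALSE, TRUE), c(3.3, 9.9, 12.2))

# Single brackets keep the list wrapper
typeof(myList[2])        # "list"

# Double brackets extract the element itself
typeof(myList[[2]])      # "character"
myList[[2]]              # "MSSQLTips"

# Chain a second index to reach inside the extracted vector
myList[[3]][2]           # FALSE
```

The distinction matters when the result is passed to a function that expects an atomic vector rather than a list.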

```
myList[[2]]
myList[[3]]
myList[[3]][2]
```

```
DECLARE @rscript NVARCHAR(MAX);
SET @rscript = N'
myList <- list(1:5, "MSSQLTips", c(TRUE, FALSE, TRUE), c(3.3, 9.9, 12.2))
print(myList[[2]])
print(myList[[3]])
print(myList[[3]][2])
';
EXEC sp_execute_external_script
@language = N'R',
@script = @rscript;
GO
```

## Attributes

All objects can have additional attributes, used to store metadata about the object. Attributes can be thought of as a named list (with unique names). Attributes can be accessed individually with attr() or all at once (as a list) with attributes().
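The sketch below illustrates attr() and attributes() on a plain vector. The vector heights and the attribute name "unit" are invented for this example; the fact that element names are stored as the "names" attribute is standard R behavior.

```r
# Start from a plain numeric vector: it has no attributes yet
heights <- c(170, 182, 165)
is.null(attributes(heights))         # TRUE

# Attach a custom attribute (the name "unit" is made up for this example)
attr(heights, "unit") <- "cm"
attr(heights, "unit")                # "cm"

# Element names are stored as the built-in "names" attribute
names(heights) <- c("Ann", "Bob", "Cy")
sort(names(attributes(heights)))     # "names" "unit"
```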

Let us see an example of how to assign an attribute value to a List:

```
# Assign Attribute
attr(myList,"My Attribute") <- "My First Attribute"
# Display Attribute
attr(myList,"My Attribute")
attributes(myList)
str(attributes(myList))
```

```
DECLARE @rscript NVARCHAR(MAX);
SET @rscript = N'
myList <- list(1:5, "MSSQLTips", c(TRUE, FALSE, TRUE), c(3.3, 9.9, 12.2))
# Assign Attribute
attr(myList,"My Attribute") <- "My First Attribute"
# Display Attribute
print(attr(myList,"My Attribute"))
print(attributes(myList))
# str() prints its own output, so print() is not needed
str(attributes(myList))
';
EXEC sp_execute_external_script
@language = N'R',
@script = @rscript;
GO
```

## Matrix and Array

A Matrix is a two-dimensional Atomic array, commonly used as part of the mathematical machinery of statistics. In R, a Matrix is created with the matrix() command and an Array with the array() command.
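As a small sketch of what "two-dimensional Atomic array" means in practice, the block below shows that a matrix carries a "dim" attribute and how the fill order can be changed with the byrow argument of matrix(). The names m_col and m_row are hypothetical.

```r
# A matrix is an atomic vector plus a "dim" attribute
m_col <- matrix(1:6, nrow = 2, ncol = 3)
attributes(m_col)$dim        # 2 3

# By default values fill column by column
m_col[1, ]                   # 1 3 5

# byrow = TRUE fills row by row instead
m_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
m_row[1, ]                   # 1 2 3

# All elements share one type, as in any atomic vector
typeof(m_col)                # "integer"
```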

In the following example, we create a matrix and execute a few basic operations on it.

```
# Create a new Matrix
myMatrix <- matrix(1:6, ncol = 3, nrow = 2)
# Display Matrix Values
myMatrix
# Display Matrix Structure
str(myMatrix)
# Display Matrix length and number of columns and rows
length(myMatrix)
nrow(myMatrix)
ncol(myMatrix)
# Assign a value to a Matrix Cell
myMatrix[2,2] <- 99
# Display Matrix Values
myMatrix
```

It is also possible to define a name for each row and column in a Matrix using the rownames() and colnames() functions.

```
rownames(myMatrix) <- c("X", "Y")
colnames(myMatrix) <- c("A", "B", "C")
myMatrix
```

Mathematical operations are fairly easy to execute on a Matrix. For example, let's compute the sum of two Matrices.

```
# Define a New Matrix
myMatrix1 <- matrix(1:6, ncol = 3, nrow = 2)
myMatrix1
# Matrix Sum
m <- myMatrix1 + myMatrix
m
```

We can execute the same type of code using SSMS.

An array is a vector with one or more dimensions. So, an array with one dimension is (almost) the same as a vector. An array with two dimensions is (almost) the same as a matrix. An array with three or more dimensions is an n-dimensional array.

Let us see it with an example.
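As a complementary sketch, the block below uses non-zero values so that each position is distinguishable. It shows how dim() and per-dimension subscripts behave on a three-dimensional array. The name arr3d is made up for this illustration.

```r
# Fill a 2x3x4 array with the values 1 to 24
arr3d <- array(1:24, dim = c(2, 3, 4))
dim(arr3d)               # 2 3 4
length(arr3d)            # 24

# Index with one subscript per dimension: row, column, layer
arr3d[2, 3, 4]           # 24
arr3d[1, 1, 2]           # 7

# A two-dimensional array is also a matrix
is.matrix(array(0, dim = c(2, 3)))   # TRUE
```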

```
# Example array code:
myarr <- array(0.0, 3)  # [0.0 0.0 0.0] Vector
print(myarr)
```

```
# Add a dimension and we get a Matrix
myarr <- array(0.0, c(2,3))  # 2x3 matrix
print(myarr)
```

```
# Add another Dimension
myarr <- array(0.0, c(2,5,4)) # 2x5x4 n-array
print(myarr)  # 40 values displayed
```

## Data Frames

A Data Frame is the most common way of storing and working with data in R. Data Frames are nothing more than a list of equal-length vectors, making them a 2-dimensional structure. Data Frames share the properties of both the matrix and list.

A Data Frame has names(), colnames(), and rownames(), although names() and colnames() return the same thing. The length() of a Data Frame is the length of the underlying list, and so is the same as ncol(); nrow() gives the number of rows.
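These relationships can be checked directly. The sketch below uses a hypothetical Data Frame named people, invented only for this illustration, and compares the list-style and matrix-style views of the same object.

```r
# Hypothetical sample data for this sketch
people <- data.frame(id = 1:3, name = c("Ann", "Bob", "Cy"))

# Underneath, a Data Frame is a list of columns
typeof(people)                                  # "list"
is.list(people)                                 # TRUE

# names() and colnames() return the same column names
identical(names(people), colnames(people))      # TRUE

# length() counts columns, so it matches ncol()
length(people) == ncol(people)                  # TRUE
nrow(people)                                    # 3

# List-style and matrix-style access reach the same column
identical(people[["id"]], people[, "id"])       # TRUE
```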

Let us see how to create and work with a Data Frame. The example below creates a Data Frame with five observations (rows) and two variables, x and y.

```
# Create a Data Frame
df <- data.frame(x = 1:5, y = c("a", "b", "c", "d","e"))
str(df)
```

```
# Test the following functions on the Data Frame
names(df)
colnames(df)
rownames(df)
length(df)
ncol(df)
nrow(df)
```

Let us add a new column and a new row of data to the existing Data Frame using the cbind() and rbind() R functions.

```
# Add a new column (cbind() returns a new Data Frame; df itself is unchanged)
cbind(df, data.frame(z = 9:5))
```

```
# Add a new row (rbind() also returns a new Data Frame)
rbind(df, data.frame(x=9, y="Z"))
```

Data Frames are so flexible that a Data Frame column can even be a list of vectors.

```
# Create a Data Frame
df <- data.frame(x = c("A","B","C"))
# Display Data Frame
print(df)
# Display only column x of the Data Frame
print(df$x)
# Assign a list of Vectors
df$y <- list(1:2, 3:5, 6:9)
print(df)
print(df$y)
```

We can also see how the above code works in SSMS.

## Conclusion

In this tip we have learned the main R data structures and how to create and use them. In the next tip we will talk about subsetting these data structures and how to add, remove, and order data in a Data Frame.

##### Next Steps
• The reader will need to install RStudio in order to test this tip.

Matteo Lorini is a DBA and has been working in IT since 1993. He specializes in SQL Server and also has knowledge of MySQL.
