# Subset in R including List, Data Frame, Matrix and Vector

By:   |   Comments   |   Related: More > R Language

##### Problem

I have loaded data into an R Data Frame or any other type of data structure; what are my options to extract, manipulate and work with my data?

##### Solution

R exposes a range of powerful and fast subsetting operations. Subsetting operations can be hard to learn, and they can be non-intuitive; however, learning how to subset R data is crucial to manipulate data.

In this article we will examine subsetting operators, types of subsetting, differences in behavior for different R objects like vectors, lists, and data frames.

## Atomic Vectors

Let's start with the easiest subsetting type of data structure in R that are Atomic Vectors. We will examine it by using a simple example of numeric vector.

```# Subsetting
x <- c(1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.8, 9.9, 10.1)
```

Elements of the vector are in order position, for example, value 5.5 is at position five in the vector. We can access a single element by using [], let see how it works with an example.

```# Get element at position 5
x
``` Using SSMS

```DECLARE @rscript NVARCHAR(MAX);
SET @rscript = N'
# Subsetting
x <- c(1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.8, 9.9, 10.1)
# Get element at position 5
print(x)
';
EXEC sp_execute_external_script
@language = N'R',
@script = @rscript;
GO
``` #### Get Specific Elements

If we want to access elements at position 1, 3 and 8 of our vector x, we use the following command.

```# Get elements at positions 1,3 and 8
x[c(1,3,8)]
```

Please note that the subsetting operation returns a Vector data type therefore; I had to use the c() command to combine the output results. ```DECLARE @rscript NVARCHAR(MAX);
SET @rscript = N'
# Subsetting
x <- c(1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.8, 9.9, 10.1)
# Get elements at positions 1,3 and 8
print(x[c(1,3,8)])
';
EXEC sp_execute_external_script
@language = N'R',
@script = @rscript;
GO
``` #### Omit Values from a Vector

We can use – (negative sign) to omit the value from a vector. The following command will return all the values of vector x except the one at position 3 and 1.

```# Omit elements at position 3,1
x[-c(3,1)]
``` ```DECLARE @rscript NVARCHAR(MAX);
SET @rscript = N'
# Subsetting
x <- c(1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.8, 9.9, 10.1)
# Omit elements at position 3,1
print(x[-c(3,1)])
';
EXEC sp_execute_external_script
@language = N'R',
@script = @rscript;
GO
``` #### Order Vector Elements

Order() function is used to order the Vector elements.

```# Order a vector
y <- c(10,1,7,-3,8)
y
y[order(y)]
``` ```DECLARE @rscript NVARCHAR(MAX);
SET @rscript = N'
# Subsetting
# Order a vector
y <- c(10,1,7,-3,8)
print(y)
# Order Vector y
print(y[order(y)])
';
EXEC sp_execute_external_script
@language = N'R',
@script = @rscript;
GO
``` #### Logical Vector

Subsetting can also be done using a logical vector for example, if we want to list element 1, 2 and 5, we can write the following subsetting logical vector.

```#Subsetting using logical vector
y
#  Return elements which position correspond to TRUE
y[c(TRUE, TRUE, FALSE, FALSE, TRUE)]
``` ```DECLARE @rscript NVARCHAR(MAX);
SET @rscript = N'
y <- c(10,1,7,-3,8)
#  Return elements which position correspond to TRUE
print(y[c(TRUE, TRUE, FALSE, FALSE, TRUE)])
';
EXEC sp_execute_external_script
@language = N'R',
@script = @rscript;
GO	``` #### Filter Elements of a Vector

We can use logical operators like >, < and == to filter elements of a vector.

```#List elements greater than 4
y[y>4]
#List elements less than 4
y[y<4]
#List elements equal to 4
y[y==7]
``` ```DECLARE @rscript NVARCHAR(MAX);
SET @rscript = N'
y <- c(10,1,7,-3,8)
#List elements greater than 4
print(y[y>4])
#List elements less than 4
print(y[y<4])
#List elements equal to 4
print(y[y==7])
';
EXEC sp_execute_external_script
@language = N'R',
@script = @rscript;
GO
``` #### Assign Names to Elements in a Vector

Another interesting part is to assign names to elements in a Vector and filter the output based on names.

```# Assign names to an element vector
w <- setNames(x, letters[1:10])
#Display all Elements
w
# Select elements corresponding to letter b and h
w[c("b","h")]
``` ```DECLARE @rscript NVARCHAR(MAX);
SET @rscript = N'
x <- c(1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.8, 9.9, 10.1)
# Assign names to an element vector
w <- setNames(x, letters[1:10])
#Display all Elements
print(w)
# Select elements corresponding to letter b and h
print(w[c("b","h")])
';
EXEC sp_execute_external_script
@language = N'R',
@script = @rscript;
GO
``` ## List

Subsetting List operations are like the ones used for an Atomic Vector with the difference that [] always returns a list while [[]] operators returns a component of the List. Let's see it with examples.

```# Create a new List
v1 = c(20, 30, 50)
v2 = c("www", "mssql", "tips", ".", "com")
v3 = c(TRUE, FALSE, TRUE, FALSE, FALSE)
myList = list(v1, v2, v3, 99)
# display List
myList
``` ```  DECLARE @rscript NVARCHAR(MAX);
SET @rscript = N'
# Create a new List
v1 = c(20, 30, 50)
v2 = c("www", "mssql", "tips", ".", "com")
v3 = c(TRUE, FALSE, TRUE, FALSE, FALSE)
myList = list(v1, v2, v3, 99)
# display List
print(myList)
';
EXEC sp_execute_external_script
@language = N'R',
@script = @rscript;
GO
``` Let's use [] to retrieve the second element of the list and [[]] to get the element "tips".

```# Get 2nd element of the List
myList
# Get the 3rd Value of the
myList[]
``` ```DECLARE @rscript NVARCHAR(MAX);
SET @rscript = N'
# Create a new List
v1 = c(20, 30, 50)
v2 = c("www", "mssql", "tips", ".", "com")
v3 = c(TRUE, FALSE, TRUE, FALSE, FALSE)
myList = list(v1, v2, v3, 99)
# Get 2nd element of the List
print(myList)
# Get the 3rd Value of the
print(myList[])
';
EXEC sp_execute_external_script
@language = N'R',
@script = @rscript;
GO
``` ## Matrix

The simple way to subset a Matrix is to use an index for each dimension. Let's see some examples of how to work with a Matrix.

```# Create a 3X3 Matrix
myMatrix <- matrix(1:9, nrow = 3)
# Assign names to the columns
colnames(myMatrix) <- c("A", "B", "C")
#Assign names to the rows
rownames(myMatrix) <- c("X", "Y", "Z")
# Display the Matrix
myMatrix
``` ```  DECLARE @rscript NVARCHAR(MAX);
SET @rscript = N'
# Create a 3X3 Matrix
myMatrix <- matrix(1:9, nrow = 3)
# Assign names to the columns
colnames(myMatrix) <- c("A", "B", "C")
#Assign names to the rows
rownames(myMatrix) <- c("X", "Y", "Z")
# Display the Matrix
print(myMatrix)
';
EXEC sp_execute_external_script
@language = N'R',
@script = @rscript;
GO
``` Now that we have created our Matrix, let's see how we can access its elements.

```# Access Element 5 at coordinate 2,2
myMatrix[2,2]
myMatrix["Y","B"]
# Access Element 8 at coordinate 2,3
myMatrix[2,3]
myMatrix["Y","C"]
``` ```DECLARE @rscript NVARCHAR(MAX);
SET @rscript = N'
# Create a 3X3 Matrix
myMatrix <- matrix(1:9, nrow = 3)
# Assign names to the columns
colnames(myMatrix) <- c("A", "B", "C")
#Assign names to the rows
rownames(myMatrix) <- c("X", "Y", "Z")
# Display the Matrix
print(myMatrix)
print("Access Element 5 at coordinate 2,2")
print(myMatrix[2,2])
print(myMatrix["Y","B"])
print("Access Element 8 at coordinate 2,3")
print(myMatrix[2,3])
print(myMatrix["Y","C"])
';
EXEC sp_execute_external_script  @language = N'R',
@script = @rscript;
GO
``` A Matrix can only have 2 dimensions, a three or more dimension data structure is an Array. Let's define a 2X5X4 Array and see how to assign and retrieve a value from it.

```# Create a multi dimensioanl arraymyarr = array(0.0, c(2,5,4)) # 2x5x4 n-array
print(myarr)  # 40 values displayed
#Assign a Value to location 2,3,4
myarr[2,3,4] <- 1
#D#Display Value at location 2,3,4
print(myarr)print(myarr[2,3,4])
```  ```  DECLARE @rscript NVARCHAR(MAX);
SET @rscript = N'
# Create a multi dimensioanl array
myarr = array(0.0, c(2,5,4)) # 2x5x4 n-array
print(myarr)  # 40 values displayed
#Assign a Value to location 2,3,4
myarr[2,3,4] <- 1
print(myarr)
#Display Value at location 2,3,4
print(myarr[2,3,4])
';
EXEC sp_execute_external_script
@language = N'R',
@script = @rscript;
GO
``` ## Data Frames

Data Frames play an important role in Data Science, let's see with an example how we can create and subsetting a simple Data Frame.

```# Define a Data Frame of mix Numbers and Letters
df <- data.frame(x = 1:5, z = letters[1:5],w = letters[6:10], y = 5:1 )
# Display the data frame contents
print(df)
``` ```print(df)
print("Display a row that have c as element in column z ")
df[df\$z == "c", ]
``` ```print(df)
print("Display a Single Value ")
df[df\$z == "c", "y"]
``` ```prprint("Display 1st and 5th rows")
df[c(1, 5), ]prprint("Display 2nd and 3th columns")
df[,c(2, 3)]``` ## Conclusion

In the tip we have learned subsetting of the main R data types. In the next tip we will see subsetting and assignment, lookup table, matching, merging and other R commands.

##### Next Steps
• The reader will need to install RStudio in order to test this tip.
• Check out these tips Matteo Lorini is a DBA and has been working in IT since 1993. He specializes in SQL Server and also has knowledge of MySQL.

View all my tips