# Mastering Data Structures in the R Programming Language

Read our comprehensive guide on how to work with data structures in R programming: vectors, lists, arrays, matrices, factors, and data frames.
May 2024  · 6 min read

R, a popular statistical programming language tailored for data wrangling, analysis, and visualization, is equipped with a range of data structures that are optimized for handling various types of data tasks effectively. Mastering R's diverse array of data structures is a gateway to unlocking its full potential and transforming data into compelling insights.

As we start, consider taking our Introduction to R Programming course, which lets you practice these concepts with real datasets. Let's dive in.

## What are Data Structures in the R Language?

Data structures in R help organize data for analysis. They can be simple, holding only one type of data, or complex, supporting diverse data types. These data structures are tailored to the specific needs that arise during data-driven projects. The primary data structures in R are vectors, lists, matrices, arrays, factors, and data frames.

Data structures in R

## The Different Kinds of Data Structures in R

Let's take some time to get familiar with each of these data structures. Along the way, we'll also meet some commonly used R functions.

### Vectors in R

Vectors are the simplest form of data structure in R. They are a collection of elements of the same type, such as numeric, character, or logical.

R is built around vectors and supports what are called vectorized operations. This means you can apply a function or operator to a whole vector without looping through its elements explicitly. For example, adding two vectors of the same length adds their corresponding elements, which is both faster and more concise than writing an element-by-element loop.
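As a quick illustration of vectorization, the sketch below adds two vectors in a single expression and compares the result with an explicit loop. The vectors `a` and `b` are hypothetical values chosen only for this example.

```r
# Two hypothetical numeric vectors of equal length
a <- c(1, 2, 3)
b <- c(10, 20, 30)

# Vectorized addition: one expression, no explicit loop
vectorized_sum <- a + b

# Equivalent explicit loop, shown only for comparison
loop_sum <- numeric(length(a))
for (i in seq_along(a)) {
  loop_sum[i] <- a[i] + b[i]
}

print(vectorized_sum)
# Check that both approaches give the same result
print(identical(vectorized_sum, loop_sum))
```

Both approaches produce the same values; the vectorized form is simply shorter and avoids the loop bookkeeping.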

Here we create three vectors for numerical, character, and logical types using the function `c()`.

```r
numeric_vector <- c(10, 20, 30)
character_vector <- c("apple", "banana", "cherry")
logical_vector <- c(TRUE, FALSE, TRUE)

print(numeric_vector)
print(character_vector)
print(logical_vector)
```

Example vectors in R
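Because a vector holds elements of a single type, R converts mixed inputs to one common type when you combine them with `c()`. The short sketch below shows this coercion; the names `mixed` and `num_logical` are hypothetical.

```r
# Mixing numbers, strings, and logicals: everything becomes character
mixed <- c(1, "two", TRUE)
print(class(mixed))
print(mixed)

# Mixing numbers and logicals: TRUE becomes 1 and FALSE becomes 0
num_logical <- c(5, TRUE, FALSE)
print(class(num_logical))
print(num_logical)
```

If you need to keep elements of different types side by side, a list is usually the better fit.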

You can access the elements of the vector using square brackets.

```r
# Access the first element
print(numeric_vector[1])

# Access multiple elements
print(character_vector[c(1, 3)])
```

Accessing vector elements in R
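Square brackets also accept negative indices, logical conditions, and names. The following is a minimal sketch of those three forms, using a hypothetical named vector called `scores`.

```r
# A hypothetical named vector
scores <- c(math = 90, art = 75, music = 60)

# A negative index drops the element at that position
print(scores[-1])

# A logical condition keeps only the elements where it is TRUE
print(scores[scores > 70])

# A name selects the matching element
print(scores["art"])
```

Note that R indexing starts at 1, so `scores[-1]` removes the first element rather than selecting the last one.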

You can also perform mathematical and logical operations with vectors.

```r
# Adding a scalar value to the vector
print(numeric_vector + 2)

# Multiplying elements by a scalar
print(numeric_vector * 10)

# Perform logical operations - Check which elements are greater than 15
print(numeric_vector > 15)
```

Vector operations in R

Other important operations include summing, finding the mean, and finding the minimum and maximum values.

```r
# Summation
print(sum(numeric_vector))

# Mean
print(mean(numeric_vector))

# Max and min
print(max(numeric_vector))
print(min(numeric_vector))
```

Output from basic R functions
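One detail worth knowing about these summary functions is how they treat missing values. As a small sketch, assuming a hypothetical vector `readings` that contains one `NA`, the `na.rm` argument tells the function to skip missing values.

```r
# Hypothetical measurements with one missing value
readings <- c(10, NA, 30)

# By default, a missing value makes the result missing too
print(mean(readings))

# na.rm = TRUE drops missing values before computing
print(mean(readings, na.rm = TRUE))
print(sum(readings, na.rm = TRUE))
print(range(readings, na.rm = TRUE))
```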

### Matrices in R

Matrices are two-dimensional arrays that store data of a single type. They are particularly useful for mathematical computations. In R, you can create a matrix using the `matrix()` function.

```r
my_matrix <- matrix(1:9, nrow = 3, ncol = 3)
print(my_matrix)
```

Creating a matrix in R
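By default, `matrix()` fills values column by column. The sketch below contrasts that default with the `byrow = TRUE` argument; the names `by_column` and `by_row` are hypothetical.

```r
# Default behavior: values fill the first column, then the second, and so on
by_column <- matrix(1:6, nrow = 2)
print(by_column)

# byrow = TRUE fills the first row, then the second
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)
print(by_row)

# dim() reports the number of rows and columns
print(dim(by_row))
```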

You can access the elements, rows, columns, or subsets of the matrix using indices.

```r
# Access the element in the first row and second column
print(my_matrix[1, 2])

# Access the second row
print(my_matrix[2, ])

# Access the third column
print(my_matrix[, 3])
```

Accessing the elements of a matrix in R

You can also perform mathematical operations on matrices.

```r
another_matrix <- matrix(9:1, nrow = 3, ncol = 3)

# Element-wise addition
print(my_matrix + another_matrix)

# Element-wise subtraction
print(my_matrix - another_matrix)

# Element-wise multiplication
print(my_matrix * another_matrix)

# Element-wise division
print(my_matrix / another_matrix)
```

Mathematical operations on matrices in R
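The `*` operator above multiplies matching elements. For the matrix product from linear algebra, R provides the `%*%` operator, and `t()` returns the transpose. Here is a minimal sketch using a hypothetical 2 x 2 matrix `m` and the identity matrix from `diag(2)`.

```r
# A hypothetical 2 x 2 matrix, filled column by column
m <- matrix(1:4, nrow = 2)
identity_2 <- diag(2)

# Element-wise product: off-diagonal entries are multiplied by 0
elementwise <- m * identity_2
print(elementwise)

# Matrix product: multiplying by the identity returns the original values
matrix_product <- m %*% identity_2
print(matrix_product)

# Transpose: rows become columns
transposed <- t(m)
print(transposed)
```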

### Arrays in R

Arrays extend matrices to more than two dimensions, providing a way to store multidimensional data efficiently. You can create an array in R using the `array()` function.

In the example below, the `dim = c(2, 2, 2)` argument sets the dimensions of the array: three dimensions (a 3D array), structured as 2 rows, 2 columns, and 2 layers (or depth). You can explore more on arrays in our Arrays in R tutorial.

```r
my_array <- array(1:8, dim = c(2, 2, 2))
print(my_array)
```

Creating an array in R
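Array elements are accessed with one index per dimension. The sketch below builds the same kind of 2 x 2 x 2 array under the hypothetical name `example_array` and pulls out a single element and a whole layer.

```r
example_array <- array(1:8, dim = c(2, 2, 2))

# Confirm the dimensions: rows, columns, layers
print(dim(example_array))

# Element in row 1, column 2, layer 2
print(example_array[1, 2, 2])

# Leaving an index blank selects everything along that dimension,
# so this returns the first layer as a 2 x 2 matrix
print(example_array[, , 1])
```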

### Lists in R

Lists are versatile data structures in R that can hold a mix of objects of different types and sizes. You can create a list using the `list()` function.

```r
my_list <- list(name = "DataCamp", year = 2024, scores = c(80, 90, 85), active = TRUE)

print(my_list)
```

The contents of a list in R

Once you’ve created the list, you can access its elements either by position, using double square brackets, or by name, using the `$` operator.

```r
# Access the third element by position
print(my_list[[3]])

# Access elements by name
print(my_list$name)
print(my_list$scores)
```

Accessing the elements of a list in R
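Single and double square brackets behave differently on lists: `[` returns a smaller list, while `[[` returns the element itself. The sketch below shows the difference and adds a new element by name; `example_list` and the `passed` element are hypothetical.

```r
example_list <- list(name = "DataCamp", scores = c(80, 90, 85))

# Single brackets return a list containing the selected element
sub_list <- example_list["scores"]
print(class(sub_list))

# Double brackets return the element itself, here a numeric vector
score_vec <- example_list[["scores"]]
print(class(score_vec))
print(mean(score_vec))

# Assigning to a new name adds an element to the list
example_list$passed <- TRUE
print(names(example_list))
```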

### Factors in R

Unlike vectors, matrices, or lists, factors are less a way of storing data than a description of how data should be treated. Internally, a factor is an integer vector with a `levels` attribute that maps each integer code to a category label, so you can think of it as a data type akin to integer or character.

I included factors in the list because they are crucial for handling categorical data within R’s ecosystem, especially in statistical techniques where the distinction between categorical and continuous variables matters. R's modeling functions, such as `lm()`, encode factors automatically. This differs from many Python machine learning libraries, such as scikit-learn, which typically expect categorical features to be converted to numerical ones first through dummy or one-hot encoding.

You can create factors using the `factor()` function as shown below:

```r
gender <- factor(c("male", "female", "female", "male"))
print(gender)
```

Factors in R

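A factor stores its distinct categories as levels, which can be ordered or unordered. Assigning to `levels()` renames the existing levels in order, so here `female` becomes `Female` and `male` becomes `Male`.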

```r
levels(gender) <- c("Female", "Male")
print(gender)
```

Factor levels in R
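To illustrate an ordered factor, the sketch below uses a hypothetical `sizes` variable whose levels have a natural ranking. Setting `ordered = TRUE` lets you compare values, and `as.integer()` reveals the integer codes stored behind the labels.

```r
# A hypothetical ordered factor with an explicit level order
sizes <- factor(
  c("small", "large", "medium", "small"),
  levels = c("small", "medium", "large"),
  ordered = TRUE
)
print(sizes)

# Integer codes follow the level order: small = 1, medium = 2, large = 3
print(as.integer(sizes))

# Ordered factors support comparisons against a level
print(sizes < "large")
```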

To see how the values of a factor are distributed, you can use the `summary()` function. The output shows two entries each for `Female` and `Male`.

```r
summary(gender)
```

Summary in R

### Data frames in R

Data frames are the most popular and widely used data structure in R. They are especially convenient because different columns can hold different types of data, although each individual column holds a single type. You can think of them as similar to database tables or CSV files, since they store and manage data in a two-dimensional, rectangular format.

You can create a data frame with the `data.frame()` function.

```r
data_frame <- data.frame(
  Names = c("Kiran", "Ajey", "Carol"),
  Age = c(25, 30, 35),
  Gender = c("Female", "Male", "Female")
)

# Printing the data frame
print(data_frame)
```

A data frame in R

You can access the contents of the data frame in several ways, either by columns or by rows.

```r
# Prints all names - use the $ sign
print(data_frame$Names)

# Access the first row
print(data_frame[1, ])

# Access the second column
print(data_frame[, 2])
```

The elements of a data frame in R
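Beyond reading values, you will often want to inspect a data frame's shape and add derived columns. The following is a minimal sketch using a hypothetical data frame `people`; the `AgeInMonths` column is invented for illustration.

```r
people <- data.frame(
  Names = c("Kiran", "Ajey", "Carol"),
  Age = c(25, 30, 35)
)

# Number of rows and columns
print(dim(people))

# Compact overview of column names and types
str(people)

# Add a new column computed from an existing one
people$AgeInMonths <- people$Age * 12
print(people)
```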

Data frames are super useful in data wrangling and analysis because they support a range of operations like subsetting and sorting, and they provide an easy way to print descriptive statistics.

#### Subsetting a data frame in R

```r
subset_female <- data_frame[data_frame$Gender == 'Female', ]
print(subset_female)
```

A subset of a data frame in R
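Conditions can be combined with `&` (and) or `|` (or), and base R's `subset()` function offers a shorter way to write the same filter. The sketch below rebuilds the example data under the hypothetical name `people` so that it runs on its own.

```r
people <- data.frame(
  Names = c("Kiran", "Ajey", "Carol"),
  Age = c(25, 30, 35),
  Gender = c("Female", "Male", "Female")
)

# Rows where both conditions hold
older_females <- people[people$Gender == "Female" & people$Age > 30, ]
print(older_females)

# The same filter written with subset()
same_result <- subset(people, Gender == "Female" & Age > 30)
print(same_result)

# Filter rows and keep only selected columns
selected_cols <- people[people$Age >= 30, c("Names", "Age")]
print(selected_cols)
```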

#### Sorting a data frame in R

``````# Sorting
data_frame <- data_frame[order(data_frame$Age), ]
print(data_frame)``````

A sorted data frame in R
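The `order()` function sorts in ascending order by default. As a sketch of two common variations, the code below sorts in descending order and then by more than one column, again using a hypothetical copy of the data named `people`.

```r
people <- data.frame(
  Names = c("Kiran", "Ajey", "Carol"),
  Age = c(25, 30, 35),
  Gender = c("Female", "Male", "Female")
)

# Descending order by age
by_age_desc <- people[order(people$Age, decreasing = TRUE), ]
print(by_age_desc)

# Sort by gender first, then by age from oldest to youngest
# (negating a numeric column reverses its sort direction)
by_gender_then_age <- people[order(people$Gender, -people$Age), ]
print(by_gender_then_age)
```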

#### Summarizing a data frame in R

```r
# Statistical summary
summary(data_frame)
```

Descriptive or statistical summary in R
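You can also summarize a single column or compute a statistic per group. The sketch below is one possible approach using base R's `tapply()`, applied to a hypothetical copy of the data named `people`.

```r
people <- data.frame(
  Names = c("Kiran", "Ajey", "Carol"),
  Age = c(25, 30, 35),
  Gender = c("Female", "Male", "Female")
)

# Summary statistics for one numeric column
age_summary <- summary(people$Age)
print(age_summary)

# Mean age within each gender group
mean_age_by_gender <- tapply(people$Age, people$Gender, mean)
print(mean_age_by_gender)
```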

You can see from the above outputs how easy it is to perform data frame operations in R. As you delve deeper into data analysis, you'll find that data frames are incredibly versatile and powerful for data manipulation, analysis, and visualization. If you are interested in exploring more functionalities, check out our Data Frames in R tutorial.

## Continuing with R

This tutorial has provided you with insights into the various data structures in R and their application in real-world data analysis situations. Mastering these structures will improve your analytical skills, enabling you to effectively manage and analyze data.

As you continue your journey, you will discover that R is a unique programming language renowned for its robust statistical capabilities and extensive libraries. It is a worthwhile investment for anyone curious about the world of data analysis.

Author
Vikash Singh