Scalable Data Processing in R

Advanced

Updated 12/2024

Learn how to write scalable code for working with big data in R using the bigmemory and iotools packages.

Course Description

Datasets are often larger than available RAM, which causes problems for R programmers because R stores all variables in memory by default. You'll learn tools for processing, exploring, and analyzing data directly from disk. You'll also implement the split-apply-combine approach and learn how to write scalable code using the bigmemory and iotools packages. In this course, you'll use the Federal Housing Finance Agency's data, a publicly available dataset chronicling all mortgages that were held or securitized by both the Federal National Mortgage Association (Fannie Mae) and the Federal Home Loan Mortgage Corporation (Freddie Mac) from 2009 to 2015.
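
To make the split-apply-combine idea concrete, here is a minimal sketch in base R that reads a CSV file from disk in small chunks, so only a few rows are in memory at a time. It does not use the bigmemory or iotools APIs. The toy file, the column names `year` and `amount`, and the function name `mean_by_year_chunked` are all hypothetical stand-ins for illustration, not the course's data or code.

```r
# Hypothetical toy data standing in for a file too large to load at once.
path <- tempfile(fileext = ".csv")
toy <- data.frame(
  year   = rep(2009:2011, each = 4),
  amount = c(100, 200, 300, 400, 150, 250, 350, 450, 120, 220, 320, 420)
)
write.csv(toy, path, row.names = FALSE)

# Compute the mean amount per year while holding at most `chunk_rows`
# data rows in memory at any time.
mean_by_year_chunked <- function(path, chunk_rows = 5L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  header <- readLines(con, n = 1L)
  partials <- list()
  repeat {
    lines <- readLines(con, n = chunk_rows)
    if (length(lines) == 0L) break
    chunk <- read.csv(text = c(header, lines))
    # Split: group this chunk's amounts by year.
    groups <- split(chunk$amount, chunk$year)
    # Apply: per-group partial sum and row count.
    partials[[length(partials) + 1L]] <- data.frame(
      year  = names(groups),
      total = vapply(groups, function(x) sum(as.numeric(x)), numeric(1)),
      n     = vapply(groups, function(x) as.numeric(length(x)), numeric(1))
    )
  }
  # Combine: add up the partial sums and counts across all chunks.
  combined <- do.call(rbind, partials)
  total <- vapply(split(combined$total, combined$year), sum, numeric(1))
  n     <- vapply(split(combined$n, combined$year), sum, numeric(1))
  data.frame(
    year        = as.integer(names(total)),
    mean_amount = unname(total / n)
  )
}

result <- mean_by_year_chunked(path)
print(result)
```

The sketch carries sums and counts through the combine step instead of per-chunk means, because a mean of chunk means would be wrong whenever a group is spread unevenly across chunks.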

Prerequisites

Writing Efficient R Code

Working with increasingly large data sets

What is Scalable Data Processing?

Earn Statement of Accomplishment
