Machine Learning with caret in R

Intermediate

4.5+

Updated 12/2024

This course teaches the big ideas in machine learning, such as how to build and evaluate predictive models.

R | Machine Learning | 4 hours | 24 videos | 88 exercises | 6,200 XP | 58,225 | Statement of Accomplishment

Course Description

Machine learning is the study and application of algorithms that learn from data and make predictions on it. From search results to self-driving cars, it has found its way into all areas of our lives and is one of the most exciting and fastest-growing fields of research in data science. This course teaches the big ideas in machine learning: how to build and evaluate predictive models, how to tune them for optimal performance, how to preprocess data for better results, and much more. The popular caret R package, which provides a consistent interface to all of R's most powerful machine learning facilities, is used throughout the course.
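To give a flavor of the "build and evaluate" workflow described above, here is a minimal sketch rather than material taken from the course itself. It assumes the caret package is installed, and it uses the built-in `mtcars` data set and the predictors `wt` and `hp` purely as illustrative choices. It fits a linear regression with 10-fold cross-validation through caret's `train()` and `trainControl()`, then compares the cross-validated RMSE with the in-sample RMSE computed by hand.

```r
# Assumption: caret is installed, e.g. via install.packages("caret")
library(caret)

set.seed(42)  # make the random fold assignment reproducible

# Resampling scheme: 10-fold cross-validation
ctrl <- trainControl(method = "cv", number = 10)

# Linear regression of fuel economy on weight and horsepower.
# mtcars ships with base R; the predictors are an illustrative choice.
model <- train(
  mpg ~ wt + hp,
  data = mtcars,
  method = "lm",
  trControl = ctrl
)

# Cross-validated (out-of-sample) RMSE estimate reported by caret
cv_rmse <- model$results$RMSE

# In-sample RMSE computed by hand on the training data
pred <- predict(model, newdata = mtcars)
in_sample_rmse <- sqrt(mean((mtcars$mpg - pred)^2))

print(c(cv = cv_rmse, in_sample = in_sample_rmse))
```

Swapping `method = "lm"` for another supported method string such as `"glmnet"` or `"ranger"` (each needs its own package installed) leaves the rest of the call unchanged, which is the "consistent interface" the description refers to.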

Prerequisites

Introduction to Regression in R

1. Regression Models: Fitting and Evaluating Their Performance

Welcome to the course

In-sample RMSE for linear regression

In-sample RMSE for linear regression on diamonds

Out-of-sample error measures

Out-of-sample RMSE for linear regression

Randomly order the data frame

Try an 80/20 split

Predict on test set

Calculate test set RMSE by hand

Comparing out-of-sample RMSE to in-sample RMSE

Cross-validation

Advantage of cross-validation

10-fold cross-validation

5-fold cross-validation

5 x 5-fold cross-validation

Making predictions on new data

2. Classification Models: Fitting and Evaluating Their Performance

3. Tuning Model Parameters to Improve Performance

Random forests and wine

Random forests vs. linear models

Fit a random forest

Explore a wider model space

Advantage of a longer tune length

Try a longer tune length

Custom tuning grids

Advantages of a custom tuning grid

Fit a random forest with custom tuning

Introducing glmnet

Advantage of glmnet

Make a custom trainControl

Fit glmnet with custom trainControl

glmnet with custom tuning grid

Why a custom tuning grid?

glmnet with custom trainControl and tuning

Interpreting glmnet plots

4. Preprocessing Data

Median imputation

Median imputation vs. omitting rows

Apply median imputation

KNN imputation

Comparing KNN imputation to median imputation

Use KNN imputation

Compare KNN and median imputation

Multiple preprocessing methods

Order of operations

Combining preprocessing methods

Handling low-information predictors

Why remove near zero variance predictors?

Remove near zero variance predictors

preProcess() and nearZeroVar()

Fit model on reduced blood-brain data

Principal components analysis (PCA)

Using PCA as an alternative to nearZeroVar()

5. Selecting Models: A Case Study in Churn Prediction

Reusing a trainControl

Why reuse a trainControl?

Make custom train/test indices

Reintroducing glmnet

glmnet as a baseline model

Fit the baseline model

Reintroducing random forest

Random forest drawback

Random forest with custom trainControl

Comparing models

Matching train/test indices

Create a resamples object

More on resamples

Create a box-and-whisker plot

Create a scatterplot

Ensembling models

Don’t just take our word for it

4.5 from 18 reviews (rating breakdown: 72%, 17%, 6%, 6%, 0%)

Andrex M.

30 days

Fantastic! It was better than my graduate course.

Lorenzo G.

11 months

The best among the modules I have followed since I subscribed to DataCamp.

Elvis T.

12 months

Concepts well explained and supported by practical examples in R. Beginners will find the course very helpful. Caret is a great package.

Emily P.

about 1 year

Awesome. Lots of great info.

PAUL P.

over 1 year

Great course. The explanations are clear.
