
Dimensionality Reduction in R

Learn dimensionality reduction techniques in R and master feature selection and extraction for your own data and models.

4 hours, 16 videos, 56 exercises


Course Description

Do you ever work with datasets with an overwhelming number of features? Do you need all those features? Which ones are the most important? In this course, you will learn dimensionality reduction techniques that will help you simplify your data and the models that you build with your data while maintaining the information in the original data and good predictive performance.

Why learn dimensionality reduction?



We live in the information age, an era of information overload, and extracting the essential information from data is a marketable skill. Models train faster on reduced data, and in production smaller models respond faster. Perhaps most importantly, smaller data and models are often easier to understand. Dimensionality reduction is your Occam’s razor in data science.

What will you learn in this course?



You will learn the difference between feature selection and feature extraction. Using R, you will learn how to identify and remove features with low or redundant information, keeping the features with the most information. That’s feature selection. You will also learn how to combine features into condensed components that retain as much information as possible. That’s feature extraction.
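To make the distinction concrete, here is a minimal base R sketch. The toy data, column names, and thresholds (near-zero variance, correlation above 0.95) are illustrative assumptions, not the course's own material. Selection keeps a subset of the original columns, while extraction builds new columns (principal components) from all of them.

```r
# Sketch only: toy data and thresholds are illustrative assumptions.
set.seed(42)
n <- 200
toy <- data.frame(
  height_cm = rnorm(n, mean = 170, sd = 10),
  weight_kg = rnorm(n, mean = 70, sd = 12),
  constant  = rep(1, n)                  # carries no information
)
toy$height_in <- toy$height_cm / 2.54    # redundant copy of height_cm

# --- Feature selection: keep a subset of the original columns ---
# 1) Drop columns with (near-)zero variance
variances <- vapply(toy, var, numeric(1))
informative <- toy[, variances > 1e-8, drop = FALSE]

# 2) Drop the later column of any pair with |correlation| above 0.95
cors <- abs(cor(informative))
cors[lower.tri(cors, diag = TRUE)] <- 0
redundant <- colnames(cors)[apply(cors, 2, max) > 0.95]
selected <- informative[, setdiff(colnames(informative), redundant), drop = FALSE]

# --- Feature extraction: combine the original columns into new ones ---
pca <- prcomp(informative, center = TRUE, scale. = TRUE)
extracted <- as.data.frame(pca$x[, 1:2])

names(selected)    # original features that survived selection
names(extracted)   # new components: "PC1" "PC2"
```

Note that the selected columns keep their original meaning, whereas each principal component is a weighted mix of every input column, which is usually the trade-off to weigh when choosing between the two approaches.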

But most importantly, using R’s new tidymodels package, you will use real-world data to build models with fewer features without sacrificing significant performance.
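As a rough illustration of that workflow, the sketch below compares a linear model fitted on three principal components with one fitted on all predictors. It assumes the tidymodels meta-package is installed, uses the built-in `mtcars` data rather than any course dataset, and picks `num_comp = 3` arbitrarily; treat it as one possible setup, not the course's own code.

```r
# Sketch only: assumes tidymodels is installed; mtcars and num_comp = 3
# are illustrative choices, not taken from the course.
library(tidymodels)

set.seed(123)
split <- initial_split(mtcars, prop = 0.8)
train <- training(split)
test  <- testing(split)

# Preprocessing: normalize the predictors, then keep 3 principal components
pca_recipe <- recipe(mpg ~ ., data = train) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_pca(all_numeric_predictors(), num_comp = 3)

# Model on the reduced feature set
reduced_fit <- workflow() %>%
  add_recipe(pca_recipe) %>%
  add_model(linear_reg()) %>%
  fit(data = train)

# Baseline model on all 10 original predictors
full_fit <- workflow() %>%
  add_formula(mpg ~ .) %>%
  add_model(linear_reg()) %>%
  fit(data = train)

# Compare test-set RMSE of the two models
rmse_reduced <- rmse_vec(test$mpg, predict(reduced_fit, test)$.pred)
rmse_full    <- rmse_vec(test$mpg, predict(full_fit, test)$.pred)
c(reduced = rmse_reduced, full = rmse_full)
```

Because the PCA step lives inside the recipe, it is estimated on the training data only and then applied unchanged to the test data when predicting.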

In the following Tracks

Machine Learning Scientist in R

  1. Foundations of Dimensionality Reduction

    Free

    Prepare to simplify large data sets! You will learn about information, how to assess feature importance, and practice identifying low-information features. By the end of the chapter, you will understand the difference between feature selection and feature extraction—the two approaches to dimensionality reduction.

    Introduction to dimensionality reduction (50 xp)
    Dimensionality and feature information (100 xp)
    Mutual information features (100 xp)
    Information and feature importance (50 xp)
    Calculating root entropy (100 xp)
    Calculating child entropies (100 xp)
    Calculating information gain of color (100 xp)
    The Importance of Dimensionality Reduction in Data and Model Building (50 xp)
    Calculate possible combinations (100 xp)
    Curse of dimensionality, overfitting, and bias (100 xp)
  4. Feature Extraction and Model Performance

    In this final chapter, you'll gain a strong intuition of feature extraction by understanding how principal components extract and combine the most important information from different features. Then learn about and apply three types of feature extraction — principal component analysis (PCA), t-SNE, and UMAP. Discover how you can use these feature extraction methods as a preprocessing step in the tidymodels model-building process.

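The entropy and information-gain exercises named in the first chapter can be sketched in base R as follows. The helper functions `entropy()` and `information_gain()` and the toy data (including the `color` and `size` columns) are hypothetical, written for illustration only. Information gain here is the entropy of the labels before a split (the root) minus the weighted average entropy of the groups after the split (the children).

```r
# Sketch only: helper names and toy data are hypothetical.

# Shannon entropy (in bits) of a vector of class labels
entropy <- function(labels) {
  p <- table(labels) / length(labels)
  p <- p[p > 0]                      # skip empty classes to avoid log2(0)
  -sum(p * log2(p))
}

# Information gain = root entropy - weighted average of child entropies
information_gain <- function(feature, labels) {
  children <- split(labels, feature)
  weights <- lengths(children) / length(labels)
  child_entropies <- vapply(children, entropy, numeric(1))
  entropy(labels) - sum(weights * child_entropies)
}

toy <- data.frame(
  color = c("red", "red", "red", "red", "green", "green", "green", "green"),
  size  = c("big", "small", "big", "small", "big", "small", "big", "small"),
  label = c("yes", "yes", "yes", "no", "no", "no", "no", "yes")
)

entropy(toy$label)                       # root entropy: 1 bit (4 "yes", 4 "no")
information_gain(toy$color, toy$label)   # about 0.189
information_gain(toy$size, toy$label)    # 0: size says nothing about the label
```

In this toy data, `color` has the higher information gain, so it would be the feature to keep, and `size` would be a candidate for removal.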
collaborators

George Boorman
Jasmin Ludolf
Izzy Weber

prerequisites

Modeling with tidymodels in R
Matt Pickard

Owner, Pickard Predictives, LLC

Matt is an Associate Professor of Data and Analytics at Northern Illinois University. On the side, he does data analytics consulting and training as the owner of Pickard Predictives, LLC. He's happily married with four girls and a boy poodle.