
Dimensionality Reduction in Python

4.4+
11 reviews
Intermediate

Understand the concept of reducing dimensionality in your data, and master the techniques to do so in Python.

4 hours, 16 videos, 58 exercises
29,842 learners, Statement of Accomplishment


Course Description

High-dimensional datasets can be overwhelming and leave you not knowing where to start. Typically, you’d visually explore a new dataset first, but when you have too many dimensions the classical approaches will seem insufficient. Fortunately, there are visualization techniques designed specifically for high-dimensional data, and you’ll be introduced to these in this course. After exploring the data, you’ll often find that many features hold little information because they don’t show any variance or because they are duplicates of other features. You’ll learn how to detect these features and drop them from the dataset so that you can focus on the informative ones. As a next step, you might want to build a model on these features, and it may turn out that some don’t have any effect on the thing you’re trying to predict. You’ll learn how to detect and drop these irrelevant features too, in order to reduce dimensionality and thus complexity. Finally, you’ll learn how feature extraction techniques can reduce dimensionality for you through the calculation of uncorrelated principal components.
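As a rough illustration of the feature-dropping step described above, here is a minimal sketch using pandas. The data is synthetic, the column names are hypothetical, and the two thresholds (`1e-8` for variance, `0.95` for correlation) are arbitrary assumptions rather than values taken from the course.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only; column names are hypothetical.
rng = np.random.default_rng(42)
height = rng.normal(170, 10, size=200)
df = pd.DataFrame({
    "height_cm": height,
    "height_in": height / 2.54,               # same information, different unit
    "weight_kg": rng.normal(70, 12, size=200),
    "n_legs": np.full(200, 2.0),              # constant column: no variance
})

# 1. Drop features whose variance is (near) zero.
variances = df.var()
low_variance = variances[variances < 1e-8].index.tolist()
reduced = df.drop(columns=low_variance)

# 2. Drop one feature from each strongly correlated pair.
#    Only the upper triangle of the correlation matrix is inspected,
#    so each pair is considered once.
corr = reduced.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
redundant = [col for col in upper.columns if (upper[col] > 0.95).any()]
reduced = reduced.drop(columns=redundant)

print(low_variance)           # ['n_legs']
print(redundant)              # ['height_in']
print(list(reduced.columns))  # ['height_cm', 'weight_kg']
```

In practice you would usually normalize features before comparing their variances, since raw variance depends on the unit of measurement.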

In the following Tracks

Machine Learning Scientist in Python

  1. Exploring High Dimensional Data

    Free

    You'll be introduced to the concept of dimensionality reduction and will learn when and why this is important. You'll learn the difference between feature selection and feature extraction and will apply both techniques for data exploration. The chapter ends with a lesson on t-SNE, a powerful feature extraction technique that will allow you to visualize a high-dimensional dataset.

    Introduction
    50 xp
    Finding the number of dimensions in a dataset
    50 xp
    Removing features without variance
    100 xp
    Feature selection vs. feature extraction
    50 xp
    Visually detecting redundant features
    100 xp
    Advantage of feature selection
    50 xp
    t-SNE visualization of high-dimensional data
    50 xp
    t-SNE intuition
    50 xp
    Fitting t-SNE to the ANSUR data
    100 xp
    t-SNE visualisation of dimensionality
    100 xp
  2. Feature Selection I - Selecting for Feature Information

    In this first of two chapters on feature selection, you'll learn about the curse of dimensionality and how dimensionality reduction can help you overcome it. You'll be introduced to a number of techniques to detect and remove features that bring little added value to the dataset, either because they have little variance or too many missing values, or because they are strongly correlated with other features.

  4. Feature Extraction

    This chapter is a deep dive into the most frequently used dimensionality reduction algorithm, Principal Component Analysis (PCA). You'll build intuition for how this algorithm works and why it is so powerful, and will apply it both for data exploration and for data pre-processing in a modeling pipeline. You'll end with a cool image compression use case.
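To make the idea of PCA as a pre-processing step more concrete, here is a minimal sketch with scikit-learn. The data is synthetic (six features generated from two hidden factors), and the choice of two components is an assumption made for this example, not something prescribed by the course.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data: 6 observed features driven by 2 hidden factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.05 * rng.normal(size=(300, 6))

# Scale first: PCA is sensitive to the variance of each feature.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("pca", PCA(n_components=2)),
])
components = pipe.fit_transform(X)

# Share of the total variance captured by the retained components.
explained = pipe.named_steps["pca"].explained_variance_ratio_
print(components.shape)
print(explained.sum())

# The principal components are uncorrelated with each other,
# so the off-diagonal entries here are zero up to rounding.
component_corr = np.corrcoef(components, rowvar=False)
print(component_corr.round(6))
```

A model step could be appended to the same `Pipeline` so that scaling and PCA are fitted only on the training data.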

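For the t-SNE lesson that closes the first chapter, the following is a minimal sketch of fitting t-SNE with scikit-learn. It uses synthetic data as a stand-in (not the ANSUR dataset), and the perplexity value is an assumption chosen for this small sample.

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic stand-in for a high-dimensional dataset:
# two groups of 50 samples each, 20 numeric features.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=(50, 20))
group_b = rng.normal(loc=5.0, scale=1.0, size=(50, 20))
X = np.vstack([group_a, group_b])

# Perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=30, random_state=0)
embedding = tsne.fit_transform(X)

# One 2D point per original sample, ready for a scatter plot.
print(embedding.shape)
```

The two embedding columns can then be plotted against each other and colored by a categorical feature to look for structure in the data.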

datasets

ANSUR Female, ANSUR Male, Diabetes, Grocery store sales, Boston Public Schools, Pokemon

collaborators

Hadrien Lacroix
Hillary Green-Lerman
Chester Ismay

Jeroen Boeye

Machine Learning Engineer @ Faktion

Jeroen is a machine learning engineer working at Faktion, an AI company from Belgium. He uses both R and Python for his analyses and has a PhD background in computational biology. His experience mostly lies in working with structured data, produced by sensors or digital processes.

Don’t just take our word for it

4.4 from 11 reviews
5 stars: 73%
4 stars: 0%
3 stars: 27%
2 stars: 0%
1 star: 0%
  • Freddy C.
    7 months

    It was a great course, I would use part of the content during my data analytics classes.

  • Bryce Y.
    about 1 year

    Very practical course with the right balance of breadth vs detail

  • HARPREET S.
    about 1 year

    concepts delivered

  • Ankush B.
    over 1 year

    Topics are very well explained in the course.

  • Swee M.
    almost 2 years

    Great
