
Building Recommendation Engines with PySpark

Learn tools and techniques to leverage your own big data to facilitate positive experiences for your users.

4 hours | 15 videos | 56 exercises | 12,213 learners | Statement of Accomplishment

Course Description

This course shows you how to build recommendation engines using Alternating Least Squares (ALS) in PySpark. Using the popular MovieLens dataset and the Million Songs Dataset, it takes you step by step through the intuition behind the ALS algorithm and the code to train, test, and implement ALS models on various types of customer data.
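
To give a rough sense of what the algorithm does, here is a minimal NumPy sketch of alternating least squares on a toy ratings matrix. It is not Spark's implementation and is not taken from the course: the `als` and `rmse` helpers, the ratings values, and the `rank`, `reg`, and `n_iters` settings are all assumptions made for illustration.

```python
import numpy as np


def als(ratings, mask, rank=2, reg=0.1, n_iters=20, seed=0):
    """Factor `ratings` (users x items) as U @ V.T, fitting observed entries only."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    U = rng.random((n_users, rank))   # user latent features
    V = rng.random((n_items, rank))   # item latent features
    ridge = reg * np.eye(rank)        # regularization term added to each solve

    for _ in range(n_iters):
        # Hold item factors fixed and solve a small ridge regression per user.
        for u in range(n_users):
            seen = mask[u]
            Vs = V[seen]
            U[u] = np.linalg.solve(Vs.T @ Vs + ridge, Vs.T @ ratings[u, seen])
        # Hold user factors fixed and solve a small ridge regression per item.
        for i in range(n_items):
            seen = mask[:, i]
            Us = U[seen]
            V[i] = np.linalg.solve(Us.T @ Us + ridge, Us.T @ ratings[seen, i])
    return U, V


def rmse(ratings, mask, U, V):
    """Root-mean-square error over the observed ratings only."""
    predicted = U @ V.T
    return float(np.sqrt(np.mean((ratings[mask] - predicted[mask]) ** 2)))


# Hypothetical explicit ratings; 0 marks a (user, item) pair with no rating.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 0, 1, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)
mask = ratings > 0

U, V = als(ratings, mask)
print("RMSE on observed ratings:", rmse(ratings, mask, U, V))
print("Predicted ratings, including unrated pairs:")
print(np.round(U @ V.T, 1))
```

In PySpark, this work is handled by `pyspark.ml.recommendation.ALS`, whose `rank`, `maxIter`, and `regParam` parameters are roughly analogous to `rank`, `n_iters`, and `reg` in the sketch.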

In the following Tracks

Big Data with PySpark
  1. Recommendations Are Everywhere

    Free

    This chapter shows how powerful recommendation engines can be and explains the key distinctions between collaborative-filtering and content-based engines, as well as the implicit and explicit data types that recommendation engines can use. You will also learn a powerful way to uncover hidden features (latent features) that you may not even know exist in customer datasets.

    Why learn how to build recommendation engines?
    50 xp
    See the power of a recommendation engine
    100 xp
    Power of recommendation engines
    50 xp
    Recommendation engine types and data types
    50 xp
    Collaborative vs content-based filtering
    50 xp
    Collaborative vs content-based filtering part II
    50 xp
    Implicit vs explicit data
    100 xp
    Ratings data types
    100 xp
    Uses for recommendation engines
    50 xp
    Alternate uses of recommendation engines.
    50 xp
    Confirm understanding of latent features
    100 xp
  2. How does ALS work?

    In this chapter you will review basic concepts of matrix multiplication and matrix factorization, and dive into how the Alternating Least Squares algorithm works and what arguments and hyperparameters it uses to return the best recommendations possible. You will also learn important techniques for properly preparing your data for ALS in Spark.
  4. What if you don't have customer ratings?

    In most real-life situations, you won't have "perfect" customer data available to build an ALS model. This chapter teaches you how to use customer behavior data to infer customer ratings, and how to use those inferred ratings to build an ALS recommendation engine. Using the Million Songs Dataset as well as another version of the MovieLens dataset, it shows you how to build a recommendation engine with ALS from the data available to you and evaluate its performance.
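
As a rough illustration of inferring ratings from behavior, the sketch below turns a hypothetical listening log into a full user-item grid. It uses one common formulation of implicit feedback (a binary preference plus a confidence that grows with the play count); the `plays` data, the `infer_ratings` helper, and the `alpha` value are assumptions for illustration, not the course's own code.

```python
from collections import Counter

# Hypothetical listening log: one (user, song) pair per play event.
plays = [
    ("ana", "song_a"), ("ana", "song_a"), ("ana", "song_b"),
    ("ben", "song_b"), ("ben", "song_c"), ("ben", "song_c"), ("ben", "song_c"),
]


def infer_ratings(events, alpha=40.0):
    """Build one row per (user, item) pair, including pairs with zero plays."""
    counts = Counter(events)
    users = sorted({user for user, _ in counts})
    items = sorted({item for _, item in counts})
    rows = []
    for user in users:
        for item in items:
            n = counts.get((user, item), 0)
            rows.append({
                "user": user,
                "item": item,
                "plays": n,
                # Preference: did the user interact with the item at all?
                "preference": 1 if n > 0 else 0,
                # Confidence: more plays means more trust in the preference.
                "confidence": 1.0 + alpha * n,
            })
    return rows


for row in infer_ratings(plays):
    print(row)
```

Unplayed pairs are kept with a preference of 0 and a low confidence, so a missing interaction counts as weak evidence of disinterest rather than as a missing value. In PySpark, `pyspark.ml.recommendation.ALS` exposes `implicitPrefs` and `alpha` parameters for this kind of data.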

collaborators

Lore Dirick
Nick Solomon
Adrián Soto

prerequisites

Introduction to PySpark
Supervised Learning with scikit-learn

Jamen Long

Data Scientist
