Course

Building Recommendation Engines with PySpark

Advanced

Updated 12/2024

Learn tools and techniques to leverage your own big data to facilitate positive experiences for your users.

Spark | Machine learning | 4 hours | 15 videos | 56 exercises | 4,550 XP | 12,287 | Statement of Accomplishment

Course description

This course will show you how to build recommendation engines using Alternating Least Squares in PySpark. Using the popular MovieLens dataset and the Million Songs dataset, this course will take you step by step through the intuition of the Alternating Least Squares algorithm as well as the code to train, test and implement ALS models on various types of customer data.
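As a rough illustration of the alternating idea mentioned in the description, here is a minimal pure-Python sketch. It is not the course's code and not Spark's implementation: it assumes a rank-1 model (one latent feature per user and per item) so that each least-squares step reduces to a single division. The function names `als_rank1` and `rmse`, the toy ratings, and the values of `reg` and `n_iters` are hypothetical choices made for this example.

```python
import math


def rmse(ratings, u, v):
    """Root mean squared error over the observed (user, item, rating) triples."""
    squared = [(r - u[i] * v[j]) ** 2 for (i, j, r) in ratings]
    return math.sqrt(sum(squared) / len(squared))


def als_rank1(ratings, n_users, n_items, reg=0.01, n_iters=20):
    """Rank-1 alternating least squares with L2 regularization (illustrative only)."""
    u = [1.0] * n_users  # one latent feature per user
    v = [1.0] * n_items  # one latent feature per item
    history = []
    for _ in range(n_iters):
        # Hold the item factors fixed and solve for each user factor.
        for i in range(n_users):
            num = sum(r * v[j] for (a, j, r) in ratings if a == i)
            den = reg + sum(v[j] ** 2 for (a, j, r) in ratings if a == i)
            u[i] = num / den
        # Hold the user factors fixed and solve for each item factor.
        for j in range(n_items):
            num = sum(r * u[i] for (i, b, r) in ratings if b == j)
            den = reg + sum(u[i] ** 2 for (i, b, r) in ratings if b == j)
            v[j] = num / den
        history.append(rmse(ratings, u, v))
    return u, v, history


# Hypothetical toy data: 3 users, 3 items, two ratings left unobserved.
observed = [
    (0, 0, 5.0), (0, 1, 4.0), (0, 2, 2.0),
    (1, 0, 2.5), (1, 1, 2.0),
    (2, 1, 3.2), (2, 2, 1.6),
]

u, v, history = als_rank1(observed, n_users=3, n_items=3)
print("RMSE after first and last iteration:", history[0], history[-1])
print("Predicted rating for user 1, item 2:", u[1] * v[2])
```

In PySpark itself, this loop is handled by the `ALS` estimator in `pyspark.ml.recommendation`, which supports higher ranks and distributed data.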

Prerequisites

Introduction to PySpark
Supervised Learning with scikit-learn

1

Recommendations Are Everywhere

Why learn how to build recommendation engines?

See the power of a recommendation engine

Power of recommendation engines

Recommendation engine types and data types

Collaborative vs content-based filtering

Collaborative vs content based filtering part II

Implicit vs explicit data

Ratings data types

Uses for recommendation engines

Alternate uses of recommendation engines.

Confirm understanding of latent features

2

How does ALS work?

Overview of matrix multiplication

Matrix multiplication

Matrix multiplication part II

Overview of matrix factorization

Matrix factorization

Non-negative matrix factorization

How ALS alternates to generate predictions

Estimating recommendations

RMSE as ALS alternates

Data preparation for Spark ALS

Correct format and distinct users

Assigning integer id's to movies

ALS parameters and hyperparameters

Build out an ALS model

Build RMSE evaluator

3

Recommending Movies

Introduction to the MovieLens dataset

Viewing the MovieLens Data

Calculate sparsity

The GroupBy and Filter methods

MovieLens Summary Statistics

View Schema

ALS model buildout on MovieLens Data

Create test/train splits and build your ALS model

Tell Spark how to tune your ALS model

Build your cross validation pipeline

Best Model and Best Model Parameters

Model Performance Evaluation

Generate predictions and calculate RMSE

Interpreting the RMSE

Do recommendations make sense

4

What if you don't have customer ratings?

Introduction to the Million Songs Dataset

Confirm understanding of implicit rating concepts

MSD summary statistics

Grouped summary statistics

Evaluating implicit ratings models

Specify ALS hyperparameters

Build implicit models

Running a cross-validated implicit ALS model

Extracting parameters

Overview of binary, implicit ratings

Binary model performance

Recommendations from binary data

Course recap
