Supervised Learning in R: Regression
In this course you will learn how to predict future events using linear regression, generalized additive models, random forests, and xgboost.
4 hours, 19 videos, 65 exercises, 41,787 learners, Statement of Accomplishment
Course Description
From a machine learning perspective, regression is the task of predicting numerical outcomes from various inputs. In this course, you'll learn about different regression models, how to train these models in R, how to evaluate the models you train and use them to make predictions.
In the following tracks
Machine Learning Fundamentals in R
Machine Learning Scientist in R
1. What is Regression?
Free chapter. We introduce the concept of regression from a machine learning point of view and present the fundamental regression method: linear regression. We show how to fit a linear regression model and how to make predictions from it.
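As a rough illustration of the fit-then-predict workflow this chapter describes, here is a minimal base R sketch. The data frame `df` and its columns `x` and `y` are made-up toy values, not course data; `lm()` and `predict()` are the standard base R functions.

```r
# Hypothetical toy data (an assumption for illustration, not from the course)
df <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2.1, 3.9, 6.2, 7.8, 10.1)
)

# Fit a one-variable linear regression: y as a linear function of x
model <- lm(y ~ x, data = df)

# Inspect the fitted intercept and slope
coef(model)

# Predict on new inputs; newdata must contain the same input column names
newdata <- data.frame(x = c(6, 7))
pred <- predict(model, newdata = newdata)
pred
```

The same `predict(model, newdata = ...)` call pattern carries over to models with several inputs, such as `lm(y ~ x1 + x2, data = df)`.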
Welcome and Introduction (50 xp)
Identify the regression tasks (50 xp)
Linear regression - the fundamental method (50 xp)
Code a simple one-variable regression (100 xp)
Examining a model (100 xp)
Predicting once you fit a model (50 xp)
Predicting from the unemployment model (100 xp)
Multivariate linear regression (Part 1) (100 xp)
Multivariate linear regression (Part 2) (100 xp)
Wrapping up linear regression (50 xp)
2. Training and Evaluating Regression Models
Now that we have learned how to fit basic linear regression models, we will learn how to evaluate how well our models perform. We will review evaluating a model graphically, and look at two basic metrics for regression models. We will also learn how to train a model that will perform well in the wild, not just on training data. Although we will demonstrate these techniques using linear regression, all these concepts apply to models fit with any regression algorithm.
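The two metrics and the held-out evaluation mentioned above can be sketched in base R as follows. This is only an illustration on simulated data: the helper names `rmse` and `r_squared`, the 75/25 split, and the 3-fold loop are assumptions made here, and the course exercises may use different tooling for the cross-validation plan.

```r
set.seed(42)

# Simulated data (assumption): y is linear in x plus noise
n <- 100
df <- data.frame(x = runif(n, 0, 10))
df$y <- 3 + 2 * df$x + rnorm(n)

# Root mean squared error: typical size of a prediction error, in units of y
rmse <- function(pred, actual) sqrt(mean((pred - actual)^2))

# R-squared: 1 minus (residual sum of squares / total sum of squares)
r_squared <- function(pred, actual) {
  1 - sum((actual - pred)^2) / sum((actual - mean(actual))^2)
}

# Random test/train split: roughly 75% of rows go to training
is_train <- runif(n) < 0.75
train <- df[is_train, ]
test  <- df[!is_train, ]

model <- lm(y ~ x, data = train)
pred_train <- predict(model, newdata = train)
pred_test  <- predict(model, newdata = test)

# Compare performance on training data and on held-out data
c(train = rmse(pred_train, train$y), test = rmse(pred_test, test$y))
c(train = r_squared(pred_train, train$y), test = r_squared(pred_test, test$y))

# Manual 3-fold cross-validation: each row is predicted by a model
# that did not see that row during fitting
k <- 3
fold <- sample(rep(1:k, length.out = n))
cv_pred <- numeric(n)
for (i in 1:k) {
  m <- lm(y ~ x, data = df[fold != i, ])
  cv_pred[fold == i] <- predict(m, newdata = df[fold == i, ])
}
rmse(cv_pred, df$y)
```

A test RMSE much larger than the training RMSE suggests the model is overfit; cross-validation gives a similar out-of-sample estimate when there is too little data to hold out a separate test set.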
Evaluating a model graphically (50 xp)
Graphically evaluate the unemployment model (100 xp)
The gain curve to evaluate the unemployment model (100 xp)
Root Mean Squared Error (RMSE) (50 xp)
Calculate RMSE (100 xp)
R-squared (50 xp)
Calculate R-squared (100 xp)
Correlation and R-squared (100 xp)
Properly Training a Model (50 xp)
Generating a random test/train split (100 xp)
Train a model using test/train split (100 xp)
Evaluate a model using test/train split (100 xp)
Create a cross validation plan (100 xp)
Evaluate a modeling procedure using n-fold cross-validation (100 xp)
3. Issues to Consider
Before moving on to more sophisticated regression techniques, we will look at some other modeling issues: modeling with categorical inputs, interactions between variables, and when you might consider transforming inputs and outputs before modeling. While more sophisticated regression techniques manage some of these issues automatically, it's important to be aware of them, in order to understand which methods best handle various issues -- and which issues you must still manage yourself.
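To make these issues concrete, the sketch below shows how base R encodes a categorical input, how an interaction term expands a formula, and how a log-transformed response can be modeled and scored with relative error. The data frame and its columns (`color`, `size`, `price`) are invented for illustration only.

```r
# Hypothetical toy data (assumption): a categorical input, a numeric input,
# and a positive monetary outcome
df <- data.frame(
  color = factor(c("red", "blue", "green", "blue", "red", "green")),
  size  = c(1, 2, 3, 4, 5, 6),
  price = c(12, 20, 33, 41, 55, 70)
)

# Categorical inputs: a factor with 3 levels becomes 2 indicator columns
# (the first level, "blue", is the reference level)
mm <- model.matrix(~ color + size, data = df)
colnames(mm)

# Interactions: color * size expands to color + size + color:size
mm_int <- model.matrix(~ color * size, data = df)
colnames(mm_int)

# Transforming the response: fit on log(price), then exponentiate the
# predictions to return to the original monetary units
m_log <- lm(log(price) ~ size, data = df)
pred <- exp(predict(m_log, newdata = df))

# Relative error and root-mean-squared relative error
rel_err <- (pred - df$price) / df$price
sqrt(mean(rel_err^2))
```

Fitting on the log scale tends to reduce relative error rather than absolute error, which is often the more meaningful target when outcomes span several orders of magnitude.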
Categorical inputs (50 xp)
Examining the structure of categorical inputs (100 xp)
Modeling with categorical inputs (100 xp)
Interactions (50 xp)
Modeling an interaction (100 xp)
Modeling an interaction (2) (100 xp)
Transforming the response before modeling (50 xp)
Relative error (100 xp)
Modeling log-transformed monetary output (100 xp)
Comparing RMSE and root-mean-squared Relative Error (100 xp)
Transforming inputs before modeling (50 xp)
Input transforms: the "hockey stick" (100 xp)
Input transforms: the "hockey stick" (2) (100 xp)
4. Dealing with Non-Linear Responses
Now that we have mastered linear models, we will begin to look at techniques for modeling situations that don't meet the assumptions of linearity. This includes predicting probabilities and frequencies (values bounded between 0 and 1); predicting counts (nonnegative integer values, and associated rates); and responses that have a non-linear but additive relationship to the inputs. These algorithms are variations on the standard linear model.
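As a hedged sketch of two of these variations, the base R function `glm()` fits both logistic regression (for probabilities) and Poisson regression (for counts) by changing the `family` argument. The simulated columns `x`, `survived`, and `count` below are assumptions for illustration and are not the course's sparrow or bike rental data.

```r
set.seed(1)

# Simulated data (assumption): one numeric input, a 0/1 outcome, and a count
n <- 200
x <- runif(n, -2, 2)
survived <- rbinom(n, size = 1, prob = plogis(0.5 + 1.5 * x))
count <- rpois(n, lambda = exp(1 + 0.5 * x))
df <- data.frame(x, survived, count)

# Logistic regression: models the probability that survived == 1
m_logit <- glm(survived ~ x, data = df, family = binomial)
# type = "response" returns probabilities instead of log-odds
p <- predict(m_logit, newdata = df, type = "response")
range(p)

# Poisson regression: models the expected count, which is never negative
m_pois <- glm(count ~ x, data = df, family = poisson)
# type = "response" returns expected counts instead of log counts
mu <- predict(m_pois, newdata = df, type = "response")
range(mu)

# When the variance of the counts is much larger than their mean,
# family = quasipoisson is the usual alternative
m_qpois <- glm(count ~ x, data = df, family = quasipoisson)
```

Without `type = "response"`, `predict()` on a `glm` returns values on the link scale (log-odds or log counts), which is a common source of confusing predictions.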
Logistic regression to predict probabilities (50 xp)
Fit a model of sparrow survival probability (100 xp)
Predict sparrow survival (100 xp)
Poisson and quasipoisson regression to predict counts (50 xp)
Poisson or quasipoisson (50 xp)
Fit a model to predict bike rental counts (100 xp)
Predict bike rentals on new data (100 xp)
Visualize the bike rental predictions (100 xp)
GAM to learn non-linear transforms (50 xp)
Writing formulas for GAM models (50 xp)
Writing formulas for GAM models (2) (50 xp)
Model soybean growth with GAM (100 xp)
Predict with the soybean model on test data (100 xp)
5. Tree-Based Methods
In this chapter we will look at modeling algorithms that do not assume linearity or additivity, and that can learn limited types of interactions among input variables. These algorithms are *tree-based* methods that work by combining ensembles of *decision trees* that are learned from the training data.
The intuition behind tree-based methods (50 xp)
Predicting with a decision tree (50 xp)
Random forests (50 xp)
Build a random forest model for bike rentals (100 xp)
Predict bike rentals with the random forest model (100 xp)
Visualize random forest bike model predictions (100 xp)
One-Hot-Encoding Categorical Variables (50 xp)
vtreat on a small example (100 xp)
Novel levels (100 xp)
vtreat the bike rental data (100 xp)
Gradient boosting machines (50 xp)
Find the right number of trees for a gradient boosting machine (100 xp)
Fit an xgboost bike rental model and predict (100 xp)
Evaluate the xgboost bike rental model (100 xp)
Visualize the xgboost bike rental model (100 xp)
Prerequisites: Introduction to Regression in R
Nina Zumel
Co-founder, Principal Consultant at Win-Vector, LLC
John Mount
Co-founder, Principal Consultant at Win-Vector, LLC