Machine Learning with caret in R
This course teaches the big ideas in machine learning, such as how to build and evaluate predictive models.
4 hours | 24 videos | 88 exercises | 58,083 learners | Statement of Accomplishment
Course Description
Machine learning is the study and application of algorithms that learn from and make predictions on data. From search results to self-driving cars, it now touches many areas of our lives and is one of the most exciting and fastest-growing fields of research in data science. This course teaches the big ideas in machine learning: how to build and evaluate predictive models, how to tune them for optimal performance, how to preprocess data for better results, and much more. The popular caret R package, which provides a consistent interface to all of R's most powerful machine learning facilities, is used throughout the course.
In the following tracks
Machine Learning Fundamentals in R
Machine Learning Scientist in R

1. Regression Models: Fitting and Evaluating Their Performance (Free)

In the first chapter of this course, you'll fit regression models with train() and evaluate their out-of-sample performance using cross-validation and root-mean-square error (RMSE).

- Welcome to the course (50 xp)
- In-sample RMSE for linear regression (50 xp)
- In-sample RMSE for linear regression on diamonds (100 xp)
- Out-of-sample error measures (50 xp)
- Out-of-sample RMSE for linear regression (50 xp)
- Randomly order the data frame (100 xp)
- Try an 80/20 split (100 xp)
- Predict on test set (100 xp)
- Calculate test set RMSE by hand (100 xp)
- Comparing out-of-sample RMSE to in-sample RMSE (50 xp)
- Cross-validation (50 xp)
- Advantage of cross-validation (50 xp)
- 10-fold cross-validation (100 xp)
- 5-fold cross-validation (100 xp)
- 5 x 5-fold cross-validation (100 xp)
- Making predictions on new data (100 xp)
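
The hold-out exercises in this chapter (random ordering, an 80/20 split, predicting on the test set, RMSE by hand) could look roughly like the base-R sketch below. It is not the course's own solution: the built-in mtcars data, the formula mpg ~ wt + hp, and all variable names are assumptions chosen so the block runs without extra packages.

```r
# Assumption: mtcars (ships with R) stands in for the course's data set.
set.seed(42)

# Randomly order the rows so the split does not depend on the original ordering
rows <- sample(nrow(mtcars))
shuffled <- mtcars[rows, ]

# 80/20 split: the first 80% of the shuffled rows train, the rest test
n_train <- round(nrow(shuffled) * 0.80)
train_set <- shuffled[1:n_train, ]
test_set <- shuffled[(n_train + 1):nrow(shuffled), ]

# Fit a linear regression on the training rows only
model <- lm(mpg ~ wt + hp, data = train_set)

# Predict on the held-out rows and compute RMSE by hand
p <- predict(model, newdata = test_set)
error <- p - test_set$mpg
rmse_out <- sqrt(mean(error^2))

# In-sample RMSE of the same model, for comparison
rmse_in <- sqrt(mean(residuals(model)^2))

print(c(in_sample = rmse_in, out_of_sample = rmse_out))
```

With so few test rows, the out-of-sample figure depends heavily on which rows land in the test set, which is the usual motivation for moving on to cross-validation.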

2. Classification Models: Fitting and Evaluating Their Performance

In this chapter, you'll fit classification models with train() and evaluate their out-of-sample performance using cross-validation and area under the curve (AUC).

- Logistic regression on sonar (50 xp)
- Why a train/test split? (50 xp)
- Try a 60/40 split (100 xp)
- Fit a logistic regression model (100 xp)
- Confusion matrix (50 xp)
- Confusion matrix takeaways (50 xp)
- Calculate a confusion matrix (100 xp)
- Calculating accuracy (50 xp)
- Calculating true positive rate (50 xp)
- Calculating true negative rate (50 xp)
- Class probabilities and predictions (50 xp)
- Probabilities and classes (50 xp)
- Try another threshold (100 xp)
- From probabilities to confusion matrix (100 xp)
- Introducing the ROC curve (50 xp)
- What's the value of a ROC curve? (50 xp)
- Plot an ROC curve (100 xp)
- Area under the curve (AUC) (50 xp)
- Model, ROC, and AUC (50 xp)
- Customizing trainControl (100 xp)
- Using custom trainControl (100 xp)
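
As a rough illustration of the path from class probabilities to a confusion matrix covered in this chapter, the base-R sketch below fits a logistic regression on a 60/40 split and thresholds the predicted probabilities at 0.5. The simulated two-class data, the class labels "M" and "R", and the choice of "M" as the positive class are assumptions made for the example; the course works with the sonar data instead, and caret offers its own confusionMatrix() helper, which is not used here.

```r
# Assumption: simulated data stands in for the sonar data used in the course.
set.seed(42)
n <- 200
x <- rnorm(n)
# Two classes; "M" is listed second, so glm() models P(Class == "M")
Class <- factor(ifelse(x + rnorm(n) > 0, "M", "R"), levels = c("R", "M"))
dat <- data.frame(x = x, Class = Class)

# 60/40 split after shuffling the rows
rows <- sample(n)
n_train <- round(n * 0.60)
train_set <- dat[rows[1:n_train], ]
test_set <- dat[rows[(n_train + 1):n], ]

# Logistic regression fitted on the training rows
model <- glm(Class ~ x, family = binomial, data = train_set)

# Predicted probability of class "M" on the test rows
p <- predict(model, newdata = test_set, type = "response")

# Turn probabilities into classes with a 0.5 threshold
p_class <- factor(ifelse(p > 0.5, "M", "R"), levels = c("R", "M"))

# Confusion matrix: rows are predictions, columns are the true classes
cm <- table(predicted = p_class, actual = test_set$Class)
print(cm)

# Metrics derived by hand, treating "M" as the positive class
accuracy <- sum(diag(cm)) / sum(cm)
tpr <- cm["M", "M"] / sum(cm[, "M"])  # true positive rate
tnr <- cm["R", "R"] / sum(cm[, "R"])  # true negative rate
print(c(accuracy = accuracy, tpr = tpr, tnr = tnr))
```

Changing the 0.5 cutoff trades true positives against true negatives; an ROC curve summarizes that trade-off across all thresholds.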

3. Tuning Model Parameters to Improve Performance

In this chapter, you will use the train() function to tweak model parameters through cross-validation and grid search.

- Random forests and wine (50 xp)
- Random forests vs. linear models (50 xp)
- Fit a random forest (100 xp)
- Explore a wider model space (50 xp)
- Advantage of a longer tune length (50 xp)
- Try a longer tune length (100 xp)
- Custom tuning grids (50 xp)
- Advantages of a custom tuning grid (50 xp)
- Fit a random forest with custom tuning (100 xp)
- Introducing glmnet (50 xp)
- Advantage of glmnet (50 xp)
- Make a custom trainControl (100 xp)
- Fit glmnet with custom trainControl (100 xp)
- glmnet with custom tuning grid (50 xp)
- Why a custom tuning grid? (50 xp)
- glmnet with custom trainControl and tuning (100 xp)
- Interpreting glmnet plots (50 xp)
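
A minimal sketch of grid search through train() might look like the following. It assumes the caret package is installed. The k-nearest-neighbours method, the mtcars data, and the grid values are stand-ins chosen to keep dependencies small; the chapter itself tunes random forests and glmnet models, whose grids would have different columns.

```r
# Assumption: caret is installed; knn and mtcars are illustrative stand-ins.
library(caret)
set.seed(42)

# 5-fold cross-validation, used to score every candidate in the grid
my_control <- trainControl(method = "cv", number = 5)

# Custom tuning grid: column names must match the method's tuning parameters
my_grid <- expand.grid(k = c(3, 5, 7, 9))

model <- train(
  mpg ~ wt + hp + disp,
  data = mtcars,
  method = "knn",
  tuneGrid = my_grid,
  trControl = my_control,
  preProcess = c("center", "scale")
)

# Cross-validated RMSE per candidate, and the value train() selected
print(model$results[, c("k", "RMSE")])
print(model$bestTune)
```

Passing tuneLength instead of tuneGrid lets train() pick the candidate values itself, which is the simpler route when no specific grid is needed.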

4. Preprocessing Data

In this chapter, you will practice using train() to preprocess data before fitting models, improving your ability to make accurate predictions.

- Median imputation (50 xp)
- Median imputation vs. omitting rows (50 xp)
- Apply median imputation (100 xp)
- KNN imputation (50 xp)
- Comparing KNN imputation to median imputation (50 xp)
- Use KNN imputation (100 xp)
- Compare KNN and median imputation (50 xp)
- Multiple preprocessing methods (50 xp)
- Order of operations (50 xp)
- Combining preprocessing methods (100 xp)
- Handling low-information predictors (50 xp)
- Why remove near zero variance predictors? (50 xp)
- Remove near zero variance predictors (100 xp)
- preProcess() and nearZeroVar() (50 xp)
- Fit model on reduced blood-brain data (100 xp)
- Principal components analysis (PCA) (50 xp)
- Using PCA as an alternative to nearZeroVar() (100 xp)
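
Preprocessing inside train() can be sketched as below, assuming the caret package is installed. The mtcars predictors and the artificially introduced missing values are assumptions made for the example, not the course's data; the preProcess argument and the "medianImpute", "center", and "scale" method names are caret's own.

```r
# Assumption: caret is installed; mtcars with injected NAs is a stand-in data set.
library(caret)
set.seed(42)

x <- mtcars[, c("wt", "hp", "disp")]
y <- mtcars$mpg

# Knock out five hp values at random to simulate missing data
x[sample(nrow(x), 5), "hp"] <- NA

# The x/y interface is used because the predictors contain missing values;
# train() re-estimates the preprocessing within each resample
model <- train(
  x = x,
  y = y,
  method = "lm",
  preProcess = c("medianImpute", "center", "scale"),
  trControl = trainControl(method = "cv", number = 5)
)

print(model)
```

Swapping "medianImpute" for "knnImpute" is one way to compare the two imputation strategies on the same folds.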

5. Selecting Models: A Case Study in Churn Prediction

In the final chapter of this course, you'll learn how to use resamples() to compare multiple models and select (or ensemble) the best one(s).

- Reusing a trainControl (50 xp)
- Why reuse a trainControl? (50 xp)
- Make custom train/test indices (100 xp)
- Reintroducing glmnet (50 xp)
- glmnet as a baseline model (50 xp)
- Fit the baseline model (100 xp)
- Reintroducing random forest (50 xp)
- Random forest drawback (50 xp)
- Random forest with custom trainControl (100 xp)
- Comparing models (50 xp)
- Matching train/test indices (50 xp)
- Create a resamples object (100 xp)
- More on resamples (50 xp)
- Create a box-and-whisker plot (100 xp)
- Create a scatterplot (100 xp)
- Ensembling models (100 xp)
- Summary (50 xp)
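
Comparing models with resamples() depends on every model being evaluated on the same train/test indices. The sketch below shows one way to set that up, assuming the caret package is installed. It uses a regression problem on mtcars with lm and knn purely as stand-ins; the chapter's case study is a churn classification problem with glmnet and random forest models.

```r
# Assumption: caret is installed; mtcars, lm, and knn are illustrative stand-ins.
library(caret)
set.seed(42)

# Shared folds: returnTrain = TRUE yields the training-row indices for each fold
my_folds <- createFolds(mtcars$mpg, k = 5, returnTrain = TRUE)

# Reusing one trainControl guarantees both models see identical splits
my_control <- trainControl(method = "cv", number = 5, index = my_folds)

model_lm <- train(
  mpg ~ wt + hp,
  data = mtcars,
  method = "lm",
  trControl = my_control
)

model_knn <- train(
  mpg ~ wt + hp,
  data = mtcars,
  method = "knn",
  preProcess = c("center", "scale"),
  trControl = my_control
)

# Collect the per-fold metrics of both models side by side
resamps <- resamples(list(lm = model_lm, knn = model_knn))
print(summary(resamps))
```

From here, bwplot(resamps, metric = "RMSE") draws the box-and-whisker comparison of the two models across folds.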

Prerequisites
Introduction to Regression in R

Zachary Deane-Mayer
VP, Data Science at DataRobot

Max Kuhn
Software Engineer at RStudio and creator of caret