Feature Engineering in R
Learn the principles of feature engineering for machine learning models and how to implement them using the R tidymodels framework.
Start Course for Free · 4 hours · 14 videos · 58 exercises
Course Description
Discover Feature Engineering for Machine Learning
In this course, you’ll learn about feature engineering, which lies at the heart of many types of machine learning models. As the performance of any model is a direct consequence of the features it’s fed, feature engineering places domain knowledge at the center of the process. You’ll become acquainted with the principles of sound feature engineering: reducing the number of variables where possible, making learning algorithms run faster, improving interpretability, and preventing overfitting.
Implement Feature Engineering Techniques in R
You will learn how to implement feature engineering techniques using the R tidymodels framework, with an emphasis on the recipes package, which allows you to create, extract, transform, and select the best features for your model.
Engineer Features and Build Better ML Models
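As a taste of what a recipes-based workflow looks like in practice, here is a minimal sketch. The `hotels` data frame and `children` outcome are placeholder names for illustration, not necessarily the course's exact dataset:

```r
library(tidymodels)

# A recipe declares preprocessing steps without executing them
hotel_rec <- recipe(children ~ ., data = hotels) %>%
  step_normalize(all_numeric_predictors()) %>%  # center and scale numeric features
  step_dummy(all_nominal_predictors())          # one-hot encode factor features

# prep() estimates the steps from the data; bake() applies them
hotel_baked <- hotel_rec %>% prep() %>% bake(new_data = NULL)
```

Separating the declaration (`recipe`) from the estimation (`prep`) is what lets the same preprocessing be reapplied safely to new data.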
When faced with a new dataset, you will be able to identify and select relevant features and disregard non-informative ones, making your model run faster without sacrificing accuracy. You will also become comfortable applying transformations and creating new features to make your models more efficient, interpretable, and accurate!
In the following Tracks
Machine Learning Scientist in R
1. Introducing Feature Engineering
Free
Raw data does not always come in its best shape for analysis. In this opening chapter, you will get a first look at how to transform and create features that enhance your model's performance and interpretability.
- What is feature engineering? (50 xp)
- A tentative model (100 xp)
- Manually engineering a feature (100 xp)
- Creating new features using domain knowledge (50 xp)
- Setting up your data for analysis (100 xp)
- Building a workflow (100 xp)
- Increasing the information content of raw data (50 xp)
- Identifying missing values (100 xp)
- Imputing missing values and creating dummy variables (100 xp)
- Fitting and assessing the model (100 xp)
- Predicting hotel bookings (100 xp)
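The imputation, dummy-variable, and workflow steps this chapter covers can be sketched roughly as follows. The `bookings_train`/`bookings_test` data frames and the `is_canceled` outcome are assumed names for illustration:

```r
library(tidymodels)

booking_rec <- recipe(is_canceled ~ ., data = bookings_train) %>%
  step_impute_mean(all_numeric_predictors()) %>%  # fill numeric NAs with the mean
  step_impute_mode(all_nominal_predictors()) %>%  # fill factor NAs with the mode
  step_dummy(all_nominal_predictors())            # convert factors to indicator columns

# A workflow bundles the recipe with a model specification
booking_wf <- workflow() %>%
  add_recipe(booking_rec) %>%
  add_model(logistic_reg() %>% set_engine("glm"))

booking_fit <- fit(booking_wf, data = bookings_train)
augment(booking_fit, new_data = bookings_test)  # predictions bound to the test data
```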
2. Transforming Features
In this chapter, you’ll learn that, beyond manually transforming features, you can leverage tools from the tidyverse to engineer new variables programmatically. You’ll explore how this approach improves your models' reproducibility and is especially useful when handling datasets with many features.
- Why transform existing features? (50 xp)
- Glancing at your data (50 xp)
- Normalizing and log-transforming (100 xp)
- Fit and augment (100 xp)
- Customize your model assessment (100 xp)
- Common feature transformations (50 xp)
- Common transformations (50 xp)
- Plain recipe (100 xp)
- Box-Cox transformation (100 xp)
- Yeo-Johnson transformation (100 xp)
- Advanced transformations (50 xp)
- Baseline (100 xp)
- step_poly() (100 xp)
- step_percentile() (100 xp)
- Who's staying? (100 xp)
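A sketch of the kinds of programmatic transformations this chapter covers, using recipes steps. The column names (`avg_price`, `lead_time`, `arrival_week`) and the `hotels_train` data frame are hypothetical:

```r
library(tidymodels)

transform_rec <- recipe(avg_price ~ ., data = hotels_train) %>%
  step_log(lead_time, offset = 1) %>%            # log-transform a right-skewed predictor
  step_YeoJohnson(all_numeric_predictors()) %>%  # like Box-Cox, but handles zero/negative values
  step_normalize(all_numeric_predictors()) %>%   # center and scale
  step_poly(arrival_week, degree = 2)            # polynomial expansion of one feature

transform_rec %>% prep() %>% bake(new_data = NULL)
```

Because each step targets selectors such as `all_numeric_predictors()` rather than hand-picked column names, the same recipe scales to datasets with many features.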
3. Extracting Features
You’ll now learn how models often benefit from reducing dimensionality and extracting features from high-dimensional data, including converting text data into numeric values, encoding categorical data, and ranking the predictive power of variables. You’ll explore methods including principal component analysis, kernel principal component analysis, numerical extraction from text, categorical encodings, and variable importance scores.
- Reducing dimensionality (50 xp)
- Prepping the stage (100 xp)
- Digging into the structure (50 xp)
- Percent of variance explained (100 xp)
- Visualizing variance explained (100 xp)
- Feature hashing (50 xp)
- Investigating education field (100 xp)
- Into the matrix (100 xp)
- Exploring the hashing (50 xp)
- Visualizing the hashing (100 xp)
- Encoding categorical data using supervised learning (50 xp)
- Setting up your workflow (100 xp)
- Fitting, augmenting, and assessing (100 xp)
- Binding models together (100 xp)
- Variable importance (50 xp)
- Create a workflow (100 xp)
- Fit and augment (100 xp)
- Which is the main predictor? (100 xp)
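Principal component analysis in recipes can be sketched as below. The `employees` data frame and `attrition` outcome are assumed names, not necessarily the course dataset:

```r
library(tidymodels)

pca_rec <- recipe(attrition ~ ., data = employees) %>%
  step_normalize(all_numeric_predictors()) %>%         # PCA requires comparable scales
  step_pca(all_numeric_predictors(), threshold = 0.9)  # keep components explaining 90% of variance

pca_prep <- prep(pca_rec)
tidy(pca_prep, number = 2, type = "variance")  # per-component variance explained
bake(pca_prep, new_data = NULL)                # data projected onto the retained components
```

Setting `threshold` lets the recipe decide how many components to keep from the variance explained, instead of fixing `num_comp` by hand.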
4. Selecting Features
You’ll wrap up the course by learning feature selection techniques. You’ll begin by focusing on the problems associated with using all available features in a model, and on identifying and removing irrelevant and redundant features. Next, you’ll explore shrinkage methods such as lasso, ridge, and elastic-net — embedded methods that regularize feature weights and can select features by setting coefficients to zero. Finally, you’ll build an end-to-end feature engineering workflow, reviewing and practicing the previously learned concepts and functions in a small project.
- Reducing the model's features (50 xp)
- Sifting through variable importance (100 xp)
- Assessing model performance using all available predictors (100 xp)
- Building a reduced model (100 xp)
- Shrinkage methods (50 xp)
- Manual regularization with lasso (100 xp)
- Tuning the penalty (100 xp)
- Finalizing the model (100 xp)
- Putting it all together (50 xp)
- Prep and split (100 xp)
- Preprocess (100 xp)
- Model (100 xp)
- Assess (100 xp)
- Congratulations! (50 xp)
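Tuning a lasso penalty in tidymodels can be sketched roughly as follows. The `bookings_train` data frame and `is_canceled` outcome are assumed names for illustration:

```r
library(tidymodels)

# Lasso: mixture = 1; the penalty (regularization strength) is tuned over a grid
lasso_spec <- logistic_reg(penalty = tune(), mixture = 1) %>%
  set_engine("glmnet")

lasso_wf <- workflow() %>%
  add_formula(is_canceled ~ .) %>%
  add_model(lasso_spec)

folds <- vfold_cv(bookings_train, v = 5)        # 5-fold cross-validation
grid  <- grid_regular(penalty(), levels = 20)   # 20 penalty values on a log scale

lasso_res <- tune_grid(lasso_wf, resamples = folds, grid = grid)
best_pen  <- select_best(lasso_res, metric = "roc_auc")

final_fit <- finalize_workflow(lasso_wf, best_pen) %>%
  fit(bookings_train)
```

As the penalty grows, the lasso drives more coefficients exactly to zero, so tuning it doubles as automated feature selection.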
Collaborators
Jorge Zazueta
Research Professor
Jorge Zazueta is the Head of the Modeling Group at the School of Economics, UASLP.
Join over 15 million learners and start Feature Engineering in R today!