Machine Learning with Tree-Based Models in Python
4.5 · 32 reviews · Intermediate
5 hours · 15 videos · 57 exercises · 72,626 learners
Course Description
Decision trees are supervised learning models used for classification and regression problems. Tree models offer high flexibility, but that flexibility comes at a price: on the one hand, trees can capture complex non-linear relationships; on the other, they are prone to memorizing the noise present in a dataset. By aggregating the predictions of trees that are trained differently, ensemble methods retain the flexibility of trees while reducing their tendency to memorize noise. Ensemble methods are used across a variety of fields and have a proven track record of winning many machine learning competitions.
In this course, you'll learn how to use Python to train decision trees and tree-based models with the user-friendly scikit-learn machine learning library. You'll understand the advantages and shortcomings of trees, and you'll see how ensembling can alleviate those shortcomings, all while practicing on real-world datasets. Finally, you'll learn how to tune the most influential hyperparameters to get the most out of your models.
- 1
Classification and Regression Trees
Classification and Regression Trees (CART) are a set of supervised learning models used for problems involving classification and regression. In this chapter, you'll be introduced to the CART algorithm.
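As a rough sketch of what this chapter covers, the snippet below trains and evaluates a classification tree with scikit-learn. The dataset and hyperparameter values are illustrative choices, not necessarily the ones used in the course.

```python
# Illustrative sketch of chapter 1: train and evaluate a classification
# tree. Dataset and hyperparameters are example choices, not the course's.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

# criterion can be "gini" (the default) or "entropy", the two impurity
# measures compared in this chapter.
dt = DecisionTreeClassifier(max_depth=4, criterion="gini", random_state=1)
dt.fit(X_train, y_train)

acc = accuracy_score(y_test, dt.predict(X_test))
print(f"Test set accuracy: {acc:.3f}")
```

The same `DecisionTreeRegressor` workflow applies for the regression exercises, with a regression metric (e.g. RMSE) in place of accuracy.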
- Decision tree for classification (50 xp)
- Train your first classification tree (100 xp)
- Evaluate the classification tree (100 xp)
- Logistic regression vs classification tree (100 xp)
- Classification tree learning (50 xp)
- Growing a classification tree (50 xp)
- Using entropy as a criterion (100 xp)
- Entropy vs Gini index (100 xp)
- Decision tree for regression (50 xp)
- Train your first regression tree (100 xp)
- Evaluate the regression tree (100 xp)
- Linear regression vs regression tree (100 xp)
- 2
The Bias-Variance Tradeoff
The bias-variance tradeoff is one of the fundamental concepts in supervised machine learning. In this chapter, you'll learn how to diagnose the problems of overfitting and underfitting. You'll also be introduced to ensembling, in which the predictions of several models are aggregated to produce predictions that are more robust.
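The ensembling idea described here can be sketched with scikit-learn's `VotingClassifier`, which aggregates the predictions of several models by majority vote. The base models and dataset below are illustrative choices, not the course's own.

```python
# Illustrative sketch: compare individual classifiers against a hard-voting
# ensemble of the three.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=1
)

# Three diverse base models; errors they make individually can cancel out
# when their predictions are aggregated.
estimators = [
    ("lr", LogisticRegression(max_iter=5000)),
    ("knn", KNeighborsClassifier()),
    ("dt", DecisionTreeClassifier(max_depth=4, random_state=1)),
]
for name, model in estimators:
    model.fit(X_train, y_train)
    print(name, f"{accuracy_score(y_test, model.predict(X_test)):.3f}")

vc = VotingClassifier(estimators=estimators)  # hard (majority) voting
vc.fit(X_train, y_train)
vc_acc = accuracy_score(y_test, vc.predict(X_test))
print("voting:", f"{vc_acc:.3f}")
```

The 10-fold CV error mentioned in the exercises would be computed with `cross_val_score(model, X_train, y_train, cv=10)` on any of these estimators.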
- Generalization error (50 xp)
- Complexity, bias and variance (50 xp)
- Overfitting and underfitting (50 xp)
- Diagnose bias and variance problems (50 xp)
- Instantiate the model (100 xp)
- Evaluate the 10-fold CV error (100 xp)
- Evaluate the training error (100 xp)
- High bias or high variance? (50 xp)
- Ensemble learning (50 xp)
- Define the ensemble (100 xp)
- Evaluate individual classifiers (100 xp)
- Better performance with a Voting Classifier (100 xp)
- 3
Bagging and Random Forests
Bagging is an ensemble method involving training the same algorithm many times using different subsets sampled from the training data. In this chapter, you'll understand how bagging can be used to create a tree ensemble. You'll also learn how the random forests algorithm can lead to further ensemble diversity through randomization at the level of each split in the trees forming the ensemble.
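A minimal sketch of this chapter's idea in scikit-learn, on an illustrative dataset: bagging trains each tree on a bootstrap sample of the rows, and a random forest additionally randomizes the features considered at each split.

```python
# Illustrative sketch of chapter 3: a random forest for regression.
# Dataset and hyperparameters are example choices, not the course's.
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2)

# bootstrap=True resamples the training rows for each tree (bagging);
# max_features adds per-split randomization, which is what distinguishes
# a random forest from plain bagging of trees.
rf = RandomForestRegressor(
    n_estimators=400, max_features="sqrt", bootstrap=True, random_state=2
)
rf.fit(X_train, y_train)

rmse = mean_squared_error(y_test, rf.predict(X_test)) ** 0.5
print(f"Test RMSE: {rmse:.2f}")
```

Plain bagging of any base estimator is available separately as `sklearn.ensemble.BaggingRegressor` / `BaggingClassifier`.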
- 4
Boosting
Boosting refers to an ensemble method in which several models are trained sequentially, with each model learning from the errors of its predecessors. In this chapter, you'll be introduced to two boosting methods: AdaBoost and Gradient Boosting.
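Both methods are available in scikit-learn; the sketch below uses illustrative datasets and settings. AdaBoost reweights training examples after each weak learner, gradient boosting fits each new tree to its predecessors' residual errors, and setting `subsample` below 1.0 gives the stochastic variant covered later in the chapter.

```python
# Illustrative sketch of chapter 4: AdaBoost for classification and
# (stochastic) gradient boosting for regression. Settings are examples.
from sklearn.datasets import load_breast_cancer, load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingRegressor
from sklearn.metrics import roc_auc_score, mean_squared_error

# AdaBoost: sequential weak learners, each focusing on the examples its
# predecessors misclassified.
Xc, yc = load_breast_cancer(return_X_y=True)
Xc_tr, Xc_te, yc_tr, yc_te = train_test_split(Xc, yc, stratify=yc, random_state=1)
ada = AdaBoostClassifier(n_estimators=100, random_state=1)
ada.fit(Xc_tr, yc_tr)
auc = roc_auc_score(yc_te, ada.predict_proba(Xc_te)[:, 1])
print(f"AdaBoost test ROC AUC: {auc:.3f}")

# Stochastic Gradient Boosting: subsample < 1.0 trains each tree on a
# random fraction of the training rows.
Xr, yr = load_diabetes(return_X_y=True)
Xr_tr, Xr_te, yr_tr, yr_te = train_test_split(Xr, yr, random_state=1)
sgb = GradientBoostingRegressor(
    n_estimators=300, max_depth=2, subsample=0.8, random_state=1
)
sgb.fit(Xr_tr, yr_tr)
rmse = mean_squared_error(yr_te, sgb.predict(Xr_te)) ** 0.5
print(f"SGB test RMSE: {rmse:.2f}")
```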
- AdaBoost (50 xp)
- Define the AdaBoost classifier (100 xp)
- Train the AdaBoost classifier (100 xp)
- Evaluate the AdaBoost classifier (100 xp)
- Gradient Boosting (GB) (50 xp)
- Define the GB regressor (100 xp)
- Train the GB regressor (100 xp)
- Evaluate the GB regressor (100 xp)
- Stochastic Gradient Boosting (SGB) (50 xp)
- Regression with SGB (100 xp)
- Train the SGB regressor (100 xp)
- Evaluate the SGB regressor (100 xp)
- 5
Model Tuning
The hyperparameters of a machine learning model are parameters that are not learned from data; they must be set before fitting the model to the training set. In this chapter, you'll learn how to tune the hyperparameters of a tree-based model using grid search cross-validation.
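The workflow amounts to defining a grid of candidate hyperparameter values and handing it to scikit-learn's `GridSearchCV`; the grid values and dataset below are illustrative, not the course's own.

```python
# Illustrative sketch of chapter 5: grid search cross-validation over a
# classification tree's hyperparameters. Grid values are example choices.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Candidate values for two influential tree hyperparameters; GridSearchCV
# evaluates every combination with 5-fold cross-validation.
params_dt = {
    "max_depth": [2, 3, 4],
    "min_samples_leaf": [0.04, 0.06, 0.08],  # fractions of the training set
}
grid_dt = GridSearchCV(
    estimator=DecisionTreeClassifier(random_state=1),
    param_grid=params_dt,
    scoring="accuracy",
    cv=5,
)
grid_dt.fit(X, y)

print("Best hyperparameters:", grid_dt.best_params_)
print(f"Best CV accuracy: {grid_dt.best_score_:.3f}")
```

Tuning a random forest works the same way, with a grid over parameters such as `n_estimators`, `max_features`, and `min_samples_leaf`.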
- Tuning a CART's hyperparameters (50 xp)
- Tree hyperparameters (50 xp)
- Set the tree's hyperparameter grid (100 xp)
- Search for the optimal tree (100 xp)
- Evaluate the optimal tree (100 xp)
- Tuning an RF's hyperparameters (50 xp)
- Random forests hyperparameters (50 xp)
- Set the hyperparameter grid of RF (100 xp)
- Search for the optimal forest (100 xp)
- Evaluate the optimal forest (100 xp)
- Congratulations! (50 xp)
In the following tracks:
- Data Scientist with Python
- Data Scientist Professional with Python
- Machine Learning Scientist with Python
Prerequisites
- Supervised Learning with scikit-learn

Instructor: Elie Kawerk, Senior Data Scientist
Elie is a data scientist with a background in computational quantum physics. His experience spans several industries, including brick-and-mortar retail, e-commerce, entertainment, and quick commerce. He uses a variety of tools and techniques, such as machine learning, experimentation, and causal inference, to drive business value. His work on a Word2vec-based recommender system has been featured on the Amazon Web Services blog. As a meetup organizer, Elie is passionate about teaching data science and mentoring newcomers to the field. Elie holds a PhD in physics from Sorbonne University.
Don’t just take our word for it
4.5 average from 32 reviews
- Joern B. (17 days ago)
This course is well organized and keeps both classification and regression in focus. If you download the files, you can reproduce the same results as doing it online. I've never had this before! Thanks to Elie Kawerk, who set this course up. This course is crystal clear and transparent.
- Jonathan W. (about 1 month ago)
Great course. I especially like that the instructor used pictures to explain the structure and flow of different kinds of models; it made things a lot clearer. One suggestion: it would help if the instructor said more about which situations call for which kinds of models, since such a variety of models can seem confusing without knowing when to use each one.
- Kyaw A. (about 1 month ago)
Really, really great course.
- Sergio M. (about 2 months ago)
Excellent course!!!
- David R. (about 2 months ago)
Clear and helpful