Course

Preprocessing for Machine Learning in Python

Advanced beginner

Updated 12.2024

Learn how to clean and prepare your data for machine learning!

Python · Machine Learning · 4 hours · 20 videos · 62 exercises · 4,700 XP · 51,940 learners · Statement of Accomplishment

Course Description

This course covers the basics of how and when to preprocess data, the essential step in any machine learning project in which you get your data ready for modeling. Preprocessing comes into play after you import and clean your data and before you fit your machine learning model. You'll learn how to standardize your data so that it's in the right form for your model, create new features that make the most of the information in your dataset, and select the features that most improve your model fit. Finally, you'll practice preprocessing by getting a dataset of UFO sightings ready for modeling.

Prerequisites

Cleaning Data in Python

Supervised Learning with scikit-learn

1

Introduction to Data Preprocessing

Introduction to preprocessing

Exploring missing data

Dropping missing data

Working with data types

Exploring data types

Converting a column type

Training and test sets

Class imbalance

Stratified sampling
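
The topics in this chapter (dropping missing data, converting a column type, and stratified sampling) can be illustrated with a small standard-library sketch. The records, the `hours` and `label` field names, and the `stratified_split` helper are all hypothetical and are not taken from the course, which most likely relies on library functions for the same steps.

```python
import random
from collections import defaultdict

# Hypothetical toy records; None marks a missing value.
rows = [{"hours": str(h), "label": "a"} for h in range(8)]
rows += [{"hours": str(h), "label": "b"} for h in range(4)]
rows.append({"hours": None, "label": "b"})

# Dropping missing data: keep only rows where every field is present.
complete = [r for r in rows if all(v is not None for v in r.values())]

# Converting a column type: "hours" arrives as text and becomes an int.
complete = [{**r, "hours": int(r["hours"])} for r in complete]


def stratified_split(records, key, test_fraction, seed=0):
    """Split records so each class keeps about the same share in both sets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for rec in records:
        by_class[rec[key]].append(rec)
    train, test = [], []
    for group in by_class.values():
        shuffled = group[:]
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return train, test


train, test = stratified_split(complete, key="label", test_fraction=0.25)
print(len(train), len(test))
```

Splitting within each class keeps the 2:1 ratio of `a` to `b` in both the training and the test set, which is the point of stratifying when classes are imbalanced.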

2

Standardizing Data

Standardization

When to standardize

Modeling without normalizing

Log normalization

Checking the variance

Log normalization in Python

Scaling data for feature comparison

Scaling data - investigating columns

Scaling data - standardizing columns

Standardized data and modeling

KNN on non-scaled data

KNN on scaled data
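
As a rough illustration of the ideas named in this chapter, the sketch below checks the variance of two columns, applies log normalization to the high-variance one, and then standardizes both to zero mean and unit variance. The `income` and `age` columns and the `standardize` helper are invented for this example; it uses only the standard library to show the arithmetic, whereas in practice a library class such as scikit-learn's `StandardScaler` would normally do the scaling.

```python
import math
import statistics

# Hypothetical columns: "income" varies far more than "age".
income = [20_000.0, 35_000.0, 50_000.0, 400_000.0]
age = [22.0, 35.0, 41.0, 58.0]

# Checking the variance: a very large gap suggests one feature could dominate
# distance-based models such as KNN.
print(statistics.pvariance(income), statistics.pvariance(age))

# Log normalization: the natural log compresses the large values.
income_log = [math.log(x) for x in income]


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


age_scaled = standardize(age)
income_scaled = standardize(income_log)
print(age_scaled)
print(income_scaled)
```

After scaling, both columns live on a comparable scale, so neither one dominates a distance computation simply because of its units.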

3

Feature Engineering

Feature engineering

Feature engineering knowledge test

Identifying areas for feature engineering

Encoding categorical variables

Encoding categorical variables - binary

Encoding categorical variables - one-hot

Engineering numerical features

Aggregating numerical features

Extracting datetime components

Engineering text features

Extracting string patterns

Vectorizing text

Text classification using TF-IDF vectors
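
Several of the feature engineering steps listed in this chapter can be sketched in a few lines: binary and one-hot encoding, aggregating numerical features, extracting a datetime component, and extracting a number from a string. The records, field names, and the `engineer` helper below are hypothetical; the sketch uses only the standard library and leaves text vectorization out.

```python
import re
from datetime import datetime
from statistics import fmean

# Hypothetical toy records; every field name and value is invented.
records = [
    {"city": "Oslo", "member": "y", "joined": "2023-03-14",
     "note": "ran 5.2 miles", "runs": [3.0, 4.0, 5.0]},
    {"city": "Lima", "member": "n", "joined": "2022-11-02",
     "note": "ran 10.5 miles", "runs": [6.0, 8.0, 10.0]},
    {"city": "Oslo", "member": "y", "joined": "2024-07-30",
     "note": "rest day", "runs": [2.0, 2.0, 2.0]},
]

cities = sorted({r["city"] for r in records})  # categories for one-hot encoding
number = re.compile(r"\d+\.\d+")               # matches a decimal number


def engineer(record):
    """Turn one raw record into a flat dict of numeric features."""
    joined = datetime.strptime(record["joined"], "%Y-%m-%d")
    match = number.search(record["note"])
    features = {
        "member": 1 if record["member"] == "y" else 0,        # binary encoding
        "run_mean": fmean(record["runs"]),                    # aggregation
        "joined_month": joined.month,                         # datetime component
        "miles": float(match.group()) if match else None,     # string pattern
    }
    for city in cities:                                       # one-hot encoding
        features[f"city_{city}"] = 1 if record["city"] == city else 0
    return features


engineered = [engineer(r) for r in records]
for row in engineered:
    print(row)
```

The third record has no number in its note, so `miles` stays `None` there; how to handle such gaps (drop, impute, or flag) is a separate preprocessing decision.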

4

Selecting Features for Modeling

Feature selection

When to use feature selection

Identifying areas for feature selection

Removing redundant features

Selecting relevant features

Checking for correlated features

Selecting features using text vectors

Exploring text vectors, part 1

Exploring text vectors, part 2

Training Naive Bayes with feature selection

Dimensionality reduction

Training a model with PCA
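
One way to picture "checking for correlated features" and "removing redundant features" is to compute pairwise Pearson correlations and keep only one feature from each strongly correlated pair. The sketch below assumes a made-up feature table, a hypothetical `drop_correlated` helper, and an arbitrary threshold of 0.95; it does not cover PCA, and it is not necessarily how the course approaches the task.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical features: the same length in two units, plus an unrelated column.
features = {
    "length_cm": [10.0, 20.0, 30.0, 40.0, 50.0],
    "length_in": [3.94, 7.87, 11.81, 15.75, 19.69],
    "weight": [5.0, 3.0, 8.0, 1.0, 6.0],
}


def drop_correlated(columns, threshold=0.95):
    """Keep a feature only if it is not strongly correlated with one already kept."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


selected = drop_correlated(features)
print(selected)
```

Here `length_in` carries almost the same information as `length_cm`, so only the first of the two is kept, while `weight` is retained.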

5

Putting It All Together

UFOs and preprocessing

Checking column types

Dropping missing data

Categorical variables and standardization

Extracting numbers from strings

Identifying features for standardization

Engineering new features

Encoding categorical variables

Features from dates

Text vectorization

Feature selection and modeling

Selecting the ideal dataset

Modeling the UFO dataset, part 1

Modeling the UFO dataset, part 2

Congratulations!
