Working with Categorical Data in Python
Learn how to manipulate and visualize categorical data using pandas and seaborn.
4 hours, 15 videos, 52 exercises, 22,223 learners, Statement of Accomplishment
Course Description
Being able to understand, use, and summarize non-numerical data—such as a person’s blood type or marital status—is a vital component of being a data scientist. In this course, you’ll learn how to manipulate and visualize categorical data using pandas and seaborn. Through hands-on exercises, you’ll get to grips with pandas' categorical data type, including how to create, delete, and update categorical columns. You’ll also work with a wide range of datasets including the characteristics of adoptable dogs, Las Vegas trip reviews, and census data to develop your skills at working with categorical data.
- 1
Introduction to Categorical Data
Free
Almost every dataset contains categorical information, and it is often an unexplored goldmine. In this chapter, you’ll learn how pandas handles categorical columns using the category data type. You’ll also discover how to group data by category to compute summary statistics.
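As a rough illustration of the ideas in this chapter, the sketch below converts a string column to the category dtype, compares memory use, and groups by the categorical column. The `dogs` DataFrame and its values are invented for this example and are not the course dataset.

```python
import pandas as pd

# Hypothetical data: sizes and ages of adoptable dogs (not the course dataset).
dogs = pd.DataFrame({
    "size": ["small", "large", "medium", "small", "large", "small"] * 1000,
    "age": [2.0, 7.5, 4.0, 1.0, 9.0, 3.0] * 1000,
})

# Memory used by the column while it is stored as strings.
bytes_as_strings = dogs["size"].memory_usage(deep=True)

# Convert to the category dtype: each distinct label is stored once,
# and the rows hold small integer codes instead.
dogs["size"] = dogs["size"].astype("category")
bytes_as_category = dogs["size"].memory_usage(deep=True)

# Group by the categorical column and summarize a numerical one.
mean_age_by_size = dogs.groupby("size", observed=True)["age"].mean()

print(dogs["size"].dtype)
print(bytes_as_category < bytes_as_strings)
print(mean_age_by_size)
```

The saving comes from repetition: a column with only a few distinct labels repeated over many rows benefits most, while a column of mostly unique strings may not shrink at all.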
  - Course introduction (50 xp)
  - Categorical vs. numerical (100 xp)
  - Exploring a target variable (100 xp)
  - Ordinal categorical variables (100 xp)
  - Categorical data in pandas (50 xp)
  - Setting dtypes and saving memory (100 xp)
  - Creating a categorical pandas Series (100 xp)
  - Setting dtype when reading data (100 xp)
  - Grouping data by category in pandas (50 xp)
  - Create lots of groups (50 xp)
  - Setting up a .groupby() statement (100 xp)
  - Using pandas functions effectively (100 xp)
- 2
Categorical pandas Series
Now it’s time to learn how to set, add, and remove categories from a Series. You’ll also explore how to update, rename, collapse, and reorder categories, before applying your new skills to clean and access other data within your DataFrame.
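The operations named above map onto methods of the pandas `.cat` accessor. The following is a minimal sketch, assuming an invented `coat` Series with illustrative labels; collapsing is shown with `replace()`, which is one common approach and not necessarily the only one the course uses.

```python
import pandas as pd

# Hypothetical coat-length values; the labels are illustrative only.
coat = pd.Series(["short", "long", "medium", "short", "long"], dtype="category")

# Set: declare the allowed categories explicitly, with an order.
coat = coat.cat.set_categories(["short", "medium", "long"], ordered=True)

# Add a category that no row uses yet, then remove it again.
coat = coat.cat.add_categories(["wirehaired"])
coat = coat.cat.remove_categories(["wirehaired"])

# Rename: map an old label to a new one.
coat = coat.cat.rename_categories({"medium": "mid"})

# Reorder: change the category order without touching the values.
coat = coat.cat.reorder_categories(["long", "mid", "short"], ordered=True)

# Collapse: merge "mid" into "short" by replacing values on a string copy,
# then convert the result back to the category dtype.
collapsed = coat.astype(str).replace({"mid": "short"}).astype("category")

print(list(coat.cat.categories))
print(list(collapsed.cat.categories))
```

Each `.cat` method returns a new Series, so the result has to be assigned back, as done here.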
  - Setting category variables (50 xp)
  - Setting categories (100 xp)
  - Adding categories (100 xp)
  - Removing categories (100 xp)
  - Updating categories (50 xp)
  - Collapsing categories knowledge check (50 xp)
  - Renaming categories (100 xp)
  - Collapsing categories (100 xp)
  - Reordering categories (50 xp)
  - Reordering categories in a Series (100 xp)
  - Using .groupby() after reordering (100 xp)
  - Cleaning and accessing data (50 xp)
  - Cleaning variables (100 xp)
  - Accessing and filtering data (100 xp)
- 3
Visualizing Categorical Data
In this chapter, you’ll use the seaborn Python library to create informative visualizations of categorical data, including box plots, bar plots, point plots, and count plots, all available through seaborn's catplot() function. You’ll then learn how to visualize categorical columns and split data across categorical columns to visualize summary statistics of numerical columns.
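A minimal sketch of how these plot types can be produced with `sns.catplot()`, assuming a small invented `reviews` DataFrame (the column names and values are hypothetical, not the Las Vegas dataset). The `kind` argument selects the plot type.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display

import pandas as pd
import seaborn as sns

# Hypothetical review data; column names and values are illustrative only.
reviews = pd.DataFrame({
    "traveler_type": ["Couples", "Families", "Solo", "Couples",
                      "Families", "Solo", "Couples", "Families"],
    "score": [5, 4, 3, 4, 5, 2, 3, 4],
})
reviews["traveler_type"] = reviews["traveler_type"].astype("category")

# Box plot: distribution of a numerical column within each category.
box_grid = sns.catplot(data=reviews, x="traveler_type", y="score", kind="box")
box_grid.set_axis_labels("Traveler type", "Review score")

# Count plot: number of rows per category, so only the categorical column is needed.
count_grid = sns.catplot(data=reviews, x="traveler_type", kind="count")
```

Changing `kind` to `"bar"` or `"point"` in the first call would plot a per-category summary of `score` instead of its distribution.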
  - Introduction to categorical plots using Seaborn (50 xp)
  - Boxplot understanding (50 xp)
  - Creating a box plot (100 xp)
  - Seaborn bar plots (50 xp)
  - Creating a bar plot (100 xp)
  - Ordering categories (100 xp)
  - Bar plot using hue (100 xp)
  - Point and count plots (50 xp)
  - Creating a point plot (100 xp)
  - Creating a count plot (100 xp)
  - Review catplot() types (100 xp)
  - Additional catplot() options (50 xp)
  - One visualization per group (100 xp)
  - Updating categorical plots (100 xp)
- 4
Pitfalls and Encoding
Lastly, you’ll learn how to overcome the common pitfalls of using categorical data. You’ll also grow your data encoding skills as you are introduced to label encoding and one-hot encoding—perfect for helping you prepare your data for use in machine learning algorithms.
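The two encodings mentioned here can be sketched with plain pandas. The example below assumes an invented `cars` DataFrame; it uses `.cat.codes` for label encoding and `pd.get_dummies()` for one-hot encoding, and also shows one pitfall: string methods return a Series that is no longer categorical.

```python
import pandas as pd

# Hypothetical data; column names and values are illustrative only.
cars = pd.DataFrame({
    "color": ["red", "blue", "red", "green"],
    "price": [9500, 12000, 8700, 10100],
})
cars["color"] = cars["color"].astype("category")

# Label encoding: one integer code per category (alphabetical order by default).
cars["color_code"] = cars["color"].cat.codes

# Save the code -> label mapping so codes can be translated back later.
code_to_color = dict(enumerate(cars["color"].cat.categories))

# One-hot encoding: one indicator column per category, for the chosen column only.
one_hot = pd.get_dummies(cars[["color", "price"]], columns=["color"])

# Pitfall: .str methods do not preserve the category dtype.
upper = cars["color"].str.upper()

print(cars["color_code"].tolist())
print(code_to_color)
print(list(one_hot.columns))
print(upper.dtype)
```

Label codes imply an ordering that may not exist in the data, which is one reason one-hot encoding is often preferred for unordered categories.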
  - Categorical pitfalls (50 xp)
  - Memory usage knowledge check (50 xp)
  - Overcoming pitfalls: string issues (100 xp)
  - Overcoming pitfalls: using NumPy arrays (100 xp)
  - Label encoding (50 xp)
  - Create a label encoding and map (100 xp)
  - Using saved mappings (100 xp)
  - Creating a Boolean encoding (100 xp)
  - One-hot encoding (50 xp)
  - One-hot knowledge check (50 xp)
  - One-hot encoding specific columns (100 xp)
  - Wrap-up video (50 xp)
Prerequisites
Data Manipulation with pandas
Kasey Jones
Research Data Scientist