Working with Categorical Data in Python
Learn how to manipulate and visualize categorical data using pandas and seaborn.
4 hours | 15 videos | 52 exercises | 22,225 learners | Statement of Accomplishment
Course Description
Being able to understand, use, and summarize non-numerical data—such as a person’s blood type or marital status—is a vital component of being a data scientist. In this course, you’ll learn how to manipulate and visualize categorical data using pandas and seaborn. Through hands-on exercises, you’ll get to grips with pandas' categorical data type, including how to create, delete, and update categorical columns. You’ll also work with a wide range of datasets including the characteristics of adoptable dogs, Las Vegas trip reviews, and census data to develop your skills at working with categorical data.
- 1
Introduction to Categorical Data
Free
Almost every dataset contains categorical information, and it is often an unexplored goldmine. In this chapter, you’ll learn how pandas handles categorical columns using the category data type. You’ll also discover how to group data by category to produce useful summary statistics.
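As a rough illustration of what this chapter covers, the sketch below converts a string column to the category dtype and then groups by it. The `dogs` DataFrame and its column names are made up for this example and are not taken from the course datasets.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
dogs = pd.DataFrame({
    "size": ["small", "large", "medium", "small", "large", "small"],
    "age": [2.0, 5.0, 3.0, 1.0, 7.0, 4.0],
})

# Convert the string column to the "category" dtype.
dogs["size"] = dogs["size"].astype("category")

# Group by the categorical column and summarise a numeric column.
# observed=True restricts the result to categories that occur in the data.
mean_age = dogs.groupby("size", observed=True)["age"].mean()

print(dogs["size"].dtype)
print(mean_age)
```

Passing `observed=True` is a choice made here to keep the output limited to categories that actually appear; whether that is what you want depends on the analysis.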
  Course introduction (50 xp)
  Categorical vs. numerical (100 xp)
  Exploring a target variable (100 xp)
  Ordinal categorical variables (100 xp)
  Categorical data in pandas (50 xp)
  Setting dtypes and saving memory (100 xp)
  Creating a categorical pandas Series (100 xp)
  Setting dtype when reading data (100 xp)
  Grouping data by category in pandas (50 xp)
  Create lots of groups (50 xp)
  Setting up a .groupby() statement (100 xp)
  Using pandas functions effectively (100 xp)
- 2
Categorical pandas Series
Now it’s time to learn how to set, add, and remove categories from a Series. You’ll also explore how to update, rename, collapse, and reorder categories, before applying your new skills to clean and access other data within your DataFrame.
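The operations named above map onto methods of the pandas `.cat` accessor. The following is a minimal sketch using an invented `coat` Series; it shows adding, removing, renaming, and reordering categories, and it is not the course's own exercise code.

```python
import pandas as pd

# Hypothetical Series of coat types; the values are illustrative only.
coat = pd.Series(["short", "long", "short", "wirehaired"], dtype="category")

# Add a category that does not yet occur in the data.
coat = coat.cat.add_categories(["medium"])

# Remove a category; values that used it become missing (NaN).
coat = coat.cat.remove_categories(["wirehaired"])

# Rename a category with a mapping of old label -> new label.
coat = coat.cat.rename_categories({"long": "long-haired"})

# Reorder the categories and mark them as ordered.
coat = coat.cat.reorder_categories(
    ["short", "medium", "long-haired"], ordered=True
)

print(coat.cat.categories)
print(coat)
```

Each `.cat` method returns a new Series, so the result is reassigned at every step.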
  Setting category variables (50 xp)
  Setting categories (100 xp)
  Adding categories (100 xp)
  Removing categories (100 xp)
  Updating categories (50 xp)
  Collapsing categories knowledge check (50 xp)
  Renaming categories (100 xp)
  Collapsing categories (100 xp)
  Reordering categories (50 xp)
  Reordering categories in a Series (100 xp)
  Using .groupby() after reordering (100 xp)
  Cleaning and accessing data (50 xp)
  Cleaning variables (100 xp)
  Accessing and filtering data (100 xp)
- 3
Visualizing Categorical Data
In this chapter, you’ll use the seaborn Python library to create informative visualizations of categorical data with catplot(), including box plots, bar plots, point plots, and count plots. You’ll then learn how to visualize categorical columns and split data across categorical columns to visualize summary statistics of numerical columns.
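For orientation, here is a small sketch of a categorical box plot drawn with seaborn's `catplot()`. The `reviews` DataFrame and its columns are invented for this example, and the non-interactive `Agg` backend is assumed only so the snippet can run without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless environment, no window needed

import pandas as pd
import seaborn as sns

# Hypothetical review data; the column names are illustrative only.
reviews = pd.DataFrame({
    "traveler_type": ["Couples", "Families", "Solo",
                      "Couples", "Families", "Solo"],
    "score": [5, 4, 3, 4, 5, 2],
})
reviews["traveler_type"] = reviews["traveler_type"].astype("category")

# kind selects the plot type, e.g. "box", "bar", "point", or "count".
g = sns.catplot(data=reviews, x="traveler_type", y="score", kind="box")
g.set_axis_labels("Traveler type", "Review score")
```

Changing `kind` to `"bar"` or `"point"` reuses the same call; `kind="count"` needs only one of `x` or `y`.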
  Introduction to categorical plots using Seaborn (50 xp)
  Boxplot understanding (50 xp)
  Creating a box plot (100 xp)
  Seaborn bar plots (50 xp)
  Creating a bar plot (100 xp)
  Ordering categories (100 xp)
  Bar plot using hue (100 xp)
  Point and count plots (50 xp)
  Creating a point plot (100 xp)
  Creating a count plot (100 xp)
  Review catplot() types (100 xp)
  Additional catplot() options (50 xp)
  One visualization per group (100 xp)
  Updating categorical plots (100 xp)
- 4
Pitfalls and Encoding
Lastly, you’ll learn how to overcome the common pitfalls of using categorical data. You’ll also grow your data encoding skills as you are introduced to label encoding and one-hot encoding—perfect for helping you prepare your data for use in machine learning algorithms.
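The two encodings mentioned above can be sketched with plain pandas. The `df` DataFrame and the `color` column are hypothetical, and this is one common way to do it rather than necessarily the approach taught in the course.

```python
import pandas as pd

# Hypothetical data; the column name is illustrative only.
df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})
df["color"] = df["color"].astype("category")

# Label encoding: replace each category with its integer code.
df["color_code"] = df["color"].cat.codes

# Save the code -> label mapping so the codes can be decoded later.
code_to_label = dict(enumerate(df["color"].cat.categories))

# One-hot encoding: one indicator column per category.
one_hot = pd.get_dummies(df["color"], prefix="color")

print(code_to_label)
print(one_hot.columns.tolist())
```

Label codes imply an order that may not exist in the data, which is one reason one-hot encoding is often preferred for nominal variables.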
  Categorical pitfalls (50 xp)
  Memory usage knowledge check (50 xp)
  Overcoming pitfalls: string issues (100 xp)
  Overcoming pitfalls: using NumPy arrays (100 xp)
  Label encoding (50 xp)
  Create a label encoding and map (100 xp)
  Using saved mappings (100 xp)
  Creating a Boolean encoding (100 xp)
  One-hot encoding (50 xp)
  One-hot knowledge check (50 xp)
  One-hot encoding specific columns (100 xp)
  Wrap-up video (50 xp)
Contributors
Kasey Jones, Research Data Scientist
Prerequisites
Data Manipulation with pandas
Join 15 million learners and start Working with Categorical Data in Python today!