Exploratory Data Analysis in Python
Learn how to explore, visualize, and extract insights from data using exploratory data analysis (EDA) in Python.
4 hours, 14 videos, 49 exercises
Course Description
So you’ve got some interesting data - where do you begin your analysis? This course will cover the process of exploring and analyzing data, from understanding what’s included in a dataset to incorporating exploration findings into a data science workflow.
Using data on unemployment figures and plane ticket prices, you’ll leverage Python to summarize and validate data, calculate, identify and replace missing values, and clean both numerical and categorical values. Throughout the course, you’ll create beautiful Seaborn visualizations to understand variables and their relationships.
For example, you’ll examine how alcohol use and student performance are related. Finally, the course will show how exploratory findings feed into data science workflows by creating new features, balancing categorical features, and generating hypotheses from findings.
By the end of this course, you’ll have the confidence to perform your own exploratory data analysis (EDA) in Python. You’ll be able to explain your findings visually to others and suggest the next steps for gathering insights from your data!
- 1
Getting to Know a Dataset
What's the best way to approach a new dataset? Learn to validate and summarize categorical and numerical data and create Seaborn visualizations to communicate your findings.
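As a rough illustration of the validate-and-summarize steps this chapter describes, here is a minimal pandas sketch. The DataFrame, its column names (`country`, `continent`, `rate_2021`), and its values are all made up for this example and are not the course's dataset.

```python
import pandas as pd

# Hypothetical data: names and values are invented for illustration only.
unemployment = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E"],
    "continent": ["Africa", "Asia", "Asia", "Europe", "Europe"],
    "rate_2021": [12.0, 4.5, 5.5, 7.0, 3.0],
})

# Initial exploration: column data types and counts per category.
dtypes = unemployment.dtypes
continent_counts = unemployment["continent"].value_counts()

# Validation: find rows whose category is outside an expected set,
# and check the range of the numeric column.
expected = ["Africa", "Asia", "Europe", "North America", "Oceania", "South America"]
unexpected = unemployment[~unemployment["continent"].isin(expected)]
rate_min = unemployment["rate_2021"].min()
rate_max = unemployment["rate_2021"].max()

# Summarization: named aggregations per continent with .groupby() and .agg().
summary = unemployment.groupby("continent").agg(
    mean_rate=("rate_2021", "mean"),
    max_rate=("rate_2021", "max"),
)
print(summary)
```

If Seaborn is installed, a call such as `sns.barplot(data=unemployment, x="continent", y="rate_2021")` is one way to visualize a categorical summary like this.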
  - Initial exploration (50 xp)
  - Functions for initial exploration (100 xp)
  - Counting categorical values (100 xp)
  - Global unemployment in 2021 (100 xp)
  - Data validation (50 xp)
  - Detecting data types (100 xp)
  - Validating continents (100 xp)
  - Validating range (100 xp)
  - Data summarization (50 xp)
  - Summaries with .groupby() and .agg() (100 xp)
  - Named aggregations (100 xp)
  - Visualizing categorical summaries (100 xp)
- 2
Data Cleaning and Imputation
Exploring and analyzing data often means dealing with missing values, incorrect data types, and outliers. In this chapter, you’ll learn techniques to handle these issues and streamline your EDA processes!
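The three issues named above (missing values, incorrect data types, and outliers) can be sketched in pandas as follows. This is only one possible approach under stated assumptions: the `planes` DataFrame and its columns are hypothetical, median imputation per group is one strategy among several, and the 1.5 × IQR rule is a common convention rather than the only way to define an outlier.

```python
import pandas as pd

# Hypothetical data: names and values are invented for illustration only.
planes = pd.DataFrame({
    "airline": ["X", "X", "X", "Y", "Y", "Y"],
    "duration": ["2h", "3h", "2h", "5h", "4h", "30h"],
    "price": [100.0, None, 120.0, 300.0, 320.0, None],
})

# Missing data: count missing values per column before changing anything.
missing = planes.isna().sum()

# Imputation: fill missing prices with the median price of the same airline.
airline_median = planes.groupby("airline")["price"].transform("median")
planes["price"] = planes["price"].fillna(airline_median)

# Incorrect data type: convert the duration strings to numbers.
planes["duration_hours"] = (
    planes["duration"].str.replace("h", "", regex=False).astype(float)
)

# Outliers: keep rows within 1.5 * IQR of the quartiles.
q1 = planes["duration_hours"].quantile(0.25)
q3 = planes["duration_hours"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
no_outliers = planes[planes["duration_hours"].between(lower, upper)]
print(no_outliers)
```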
  - Addressing missing data (50 xp)
  - Dealing with missing data (100 xp)
  - Strategies for remaining missing data (100 xp)
  - Imputing missing plane prices (100 xp)
  - Converting and analyzing categorical data (50 xp)
  - Finding the number of unique values (100 xp)
  - Flight duration categories (100 xp)
  - Adding duration categories (100 xp)
  - Working with numeric data (50 xp)
  - Flight duration (100 xp)
  - Adding descriptive statistics (100 xp)
  - Handling outliers (50 xp)
  - What to do with outliers (100 xp)
  - Identifying outliers (100 xp)
  - Removing outliers (100 xp)
- 3
Relationships in Data
Variables in datasets don't exist in a vacuum; they have relationships with each other. In this chapter, you'll look at relationships across numerical, categorical, and even DateTime data, exploring the direction and strength of these relationships as well as ways to visualize them.
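To make "direction and strength" concrete, the sketch below parses a date column and computes pairwise Pearson correlations with pandas. The `records` DataFrame and its columns are hypothetical, and the values are chosen to be perfectly linear so the result is easy to read.

```python
import pandas as pd

# Hypothetical data: names and values are invented for illustration only.
records = pd.DataFrame({
    "date": ["2021-01-15", "2021-02-15", "2021-03-15", "2021-04-15"],
    "study_hours": [1.0, 2.0, 3.0, 4.0],
    "grade": [2.0, 4.0, 6.0, 8.0],
    "absences": [4.0, 3.0, 2.0, 1.0],
})

# DateTime data: convert the string column, then extract a component.
records["date"] = pd.to_datetime(records["date"])
records["month"] = records["date"].dt.month

# Correlation: the sign gives the direction, the magnitude gives the strength.
corr = records[["study_hours", "grade", "absences"]].corr()
print(corr)
```

In this toy data, `study_hours` and `grade` move together (correlation near 1), while `study_hours` and `absences` move in opposite directions (correlation near -1). If Seaborn is installed, `sns.heatmap(corr, annot=True)` is one way to display the matrix.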
  - Patterns over time (50 xp)
  - Importing DateTime data (100 xp)
  - Updating data type to DateTime (100 xp)
  - Visualizing relationships over time (100 xp)
  - Correlation (50 xp)
  - Interpreting a heatmap (50 xp)
  - Visualizing variable relationships (100 xp)
  - Visualizing multiple variable relationships (100 xp)
  - Factor relationships and distributions (50 xp)
  - Categorical data in scatter plots (100 xp)
  - Exploring with KDE plots (100 xp)
- 4
Turning Exploratory Analysis into Action
Exploratory data analysis is a crucial step in the data science workflow, but it isn't the end! Now it's time to learn techniques and considerations you can use to successfully move forward with your projects after you've finished exploring!
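A few follow-up steps of this kind can be sketched in pandas: checking class balance, cross-tabulating two categorical columns, and turning a numeric column into a categorical feature using percentiles. The `salaries` DataFrame, its columns, and the tier labels are hypothetical choices made for this example.

```python
import pandas as pd

# Hypothetical data: names and values are invented for illustration only.
salaries = pd.DataFrame({
    "experience": ["Entry", "Entry", "Senior", "Senior", "Senior", "Mid"],
    "company_size": ["S", "L", "L", "L", "S", "M"],
    "salary": [40, 50, 100, 120, 90, 70],
})

# Class imbalance: relative frequency of each category.
class_share = salaries["experience"].value_counts(normalize=True)

# Cross-tabulation: counts for each combination of two categories.
table = pd.crosstab(salaries["experience"], salaries["company_size"])

# New feature: bin salaries into tiers bounded by the quartiles.
q25 = salaries["salary"].quantile(0.25)
q50 = salaries["salary"].quantile(0.50)
q75 = salaries["salary"].quantile(0.75)
bins = [0, q25, q50, q75, salaries["salary"].max()]
labels = ["low", "mid", "high", "top"]
salaries["salary_tier"] = pd.cut(salaries["salary"], bins=bins, labels=labels)
print(salaries)
```

A heavily skewed `class_share` is a signal that conclusions drawn about the smaller classes rest on few observations.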
  - Considerations for categorical data (50 xp)
  - Checking for class imbalance (100 xp)
  - Cross-tabulation (100 xp)
  - Generating new features (50 xp)
  - Extracting features for correlation (100 xp)
  - Calculating salary percentiles (100 xp)
  - Categorizing salaries (100 xp)
  - Generating hypotheses (50 xp)
  - Comparing salaries (100 xp)
  - Choosing a hypothesis (100 xp)
  - Congratulations (50 xp)
Collaborators
George Boorman
Curriculum Manager, DataCamp
George is a Curriculum Manager at DataCamp. He holds a PGDip in Exercise for Health and BSc (Hons) in Sports Science and has experience in project management across public health, applied research, and not-for-profit sectors. George is passionate about sports, tech for good, and all things data science.
Izzy Weber
Data Coach at iO-Sphere
Izzy is a Data Coach at iO-Sphere. She discovered a love for data during her seven years as an accounting professor at the University of Washington. She holds a master's degree in Taxation and is a Certified Public Accountant. Her passion is making learning technical topics fun.