course

Cleaning Data in Python

Intermediate
4.4+
61 reviews
Updated 12/2024
Learn to diagnose and treat dirty data and develop the skills needed to transform your raw data into accurate insights!
Python | Data Preparation | 4 hours | 13 videos | 44 exercises | 3,500 XP | 122,093 | Statement of Accomplishment

Course Description

Discover How to Clean Data in Python

It's commonly said that data scientists spend 80% of their time cleaning and manipulating data and only 20% of their time analyzing it. Data cleaning is an essential step for every data scientist, as analyzing dirty data can lead to inaccurate conclusions.

In this course, you will learn how to identify, diagnose, and treat various data cleaning problems in Python, ranging from simple to advanced. You will deal with improper data types, check that your data is in the correct range, handle missing data, perform record linkage, and more!
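As a rough illustration of the diagnose-then-treat idea for missing data, here is a minimal pandas sketch. It is not taken from the course materials: the `readings` table, its column names, and the choice of a mean fill are all assumptions made up for this example.

```python
import pandas as pd

# Hypothetical sensor readings, invented for illustration; None marks a missing value.
readings = pd.DataFrame({
    "city": ["Oslo", "Lima", "Pune", "Kyiv"],
    "temp_c": [4.0, None, 31.0, None],
})

# Diagnose: count the missing values in each column.
missing_counts = readings.isna().sum()

# Treat, option 1: drop the rows that have no temperature.
dropped = readings.dropna(subset=["temp_c"])

# Treat, option 2: fill the gaps with the mean of the observed temperatures.
filled = readings.fillna({"temp_c": readings["temp_c"].mean()})

print(missing_counts)
print(filled)
```

Whether to drop or fill depends on why the values are missing, so the two options here are alternatives to weigh, not steps to run in sequence.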

Learn How to Clean Different Data Types

The first chapter of the course explores common data problems and how you can fix them. You will first understand basic data types and how to deal with them individually. After that, you'll apply range constraints and remove duplicated data points.
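The three problems named above (a wrong data type, an out-of-range value, and a duplicated row) could look something like the following in pandas. This is only a sketch under stated assumptions: the `rides` table, its columns, and the 1 to 5 rating scale are hypothetical and do not come from the course.

```python
import pandas as pd

# Hypothetical ride data, invented for illustration only.
rides = pd.DataFrame({
    "ride_id": [1, 2, 2, 3],
    "duration": ["12 minutes", "7 minutes", "7 minutes", "95 minutes"],
    "rating": [4, 6, 6, 5],  # valid ratings are assumed to run from 1 to 5
})

# Data type fix: strip the unit text, then convert the column to integers.
rides["duration"] = (
    rides["duration"].str.replace(" minutes", "", regex=False).astype(int)
)

# Range constraint: cap any rating above the assumed maximum of 5.
rides.loc[rides["rating"] > 5, "rating"] = 5

# Duplicates: drop rows that repeat an earlier row in every column.
rides = rides.drop_duplicates().reset_index(drop=True)

print(rides)
```

Capping is one of several ways to treat out-of-range values; dropping the rows or marking them as missing are equally reasonable, depending on the data.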

The last chapter explores record linkage, a powerful tool to merge multiple datasets. You'll learn how to link records by calculating the similarity between strings. Finally, you'll use your new skills to join two restaurant review datasets into one clean master dataset.
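To make the idea of linking records by string similarity concrete, here is a small sketch that uses only Python's standard-library `difflib`. It is a stand-in for illustration, not the tooling the course itself uses: the helper names `similarity` and `link_records`, the 0.8 threshold, and the restaurant names are all assumptions.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a case-insensitive similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_records(left, right, threshold=0.8):
    """Pair each left name with its most similar right name, if close enough."""
    links = []
    for name in left:
        # Pick the candidate with the highest similarity score.
        best = max(right, key=lambda candidate: similarity(name, candidate))
        if similarity(name, best) >= threshold:
            links.append((name, best))
    return links


# Hypothetical restaurant names from two review sources.
left = ["Arnie Morton's of Chicago", "Campanile"]
right = ["arnie mortons of chicago", "spago", "campanile restaurant"]

links = link_records(left, right)
print(links)
```

With this threshold, only the first name is linked; "Campanile" and "campanile restaurant" score below 0.8 because of the extra word, which shows why the threshold and the similarity measure both need tuning on real data.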

Gain Confidence in Cleaning Data

By the end of the course, you will have the confidence to clean data of various types and to use record linkage to merge multiple datasets. Cleaning data is an essential skill for data scientists. If you want to learn more about cleaning data in Python and its applications, check out the following tracks: Data Scientist with Python and Importing & Cleaning Data with Python.

Prerequisites

Python Toolbox, Joining Data with pandas

Chapters

1. Common data problems
2. Text and categorical data problems
3. Advanced data problems
4. Record linkage

Don’t just take our word for it

4.4 from 61 reviews
Rating breakdown: 64%, 23%, 5%, 7%, 2%
  • Sue D.
    3 days

    A stunning course and awesome instructor!

  • JUDE A.
    16 days

    The course trash out all corners and methods for data cleaning with practical examples and exercises.

  • Vu H.
    about 1 month

    Some quite complicated techniques. Really enjoyed the course.

  • Ileana R.
    3 months

    Great course! Super clear and it has given me the mentality to check out the data first for all the little bits and pieces that might cause problems along the way.

  • LAURENT N.
    4 months

    t