Cleaning an Orders Dataset with PySpark
Advanced
Updated 07/2024
Included with Premium or Teams
1 Task | 1,500 XP | 852
Project Description
Data cleaning is an essential skill for any data professional.
In this project, you will step into the role of a data engineer at an e-commerce company and use PySpark, the Python API for Apache Spark, to clean an orders dataset.
This hands-on experience will sharpen your ability to format, extract and amend data for further analysis.
Project Tasks
- Task 1
Rufat Mustafaev
Data Scientist, Booking.com
Rufat is a data scientist at the global travel-tech leader. He has a background in Economics and has applied data science to solve complex problems in various industries including management consulting, credit risk, fintech and foodtech.