track
Professional Data Engineer in Python
Included with Premium or Teams
Track Description
Professional Data Engineer in Python
Prerequisites
Data Engineer
Course
Discover modern data architecture's key components, from ingestion and serving to governance and orchestration.
Course
The Unix command line helps users combine existing programs in new ways, automate repetitive tasks, and run programs on clusters and clouds.
Course
Learn the essentials of VMs, containers, Docker, and Kubernetes. Understand the differences to get started!
Course
This course introduces dbt for data modeling, transformations, testing, and building documentation.
Course
Discover the fundamental concepts of object-oriented programming (OOP), building custom classes and objects!
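As a minimal sketch of what a custom class and an object built from it can look like (the `Dataset` class, its attributes, and the sample values are hypothetical illustrations, not taken from the course):

```python
class Dataset:
    """Hypothetical custom class wrapping a named list of numeric values."""

    def __init__(self, name, values):
        # Instance attributes hold the state of each individual object.
        self.name = name
        self.values = list(values)

    def mean(self):
        """Return the arithmetic mean of the stored values."""
        return sum(self.values) / len(self.values)

    def __repr__(self):
        # A readable representation used by print() and the REPL.
        return f"Dataset(name={self.name!r}, n={len(self.values)})"


# Create an object (an instance of the class) and call its method.
orders = Dataset("orders", [10, 20, 30])
print(orders)
print(orders.mean())
```

The class bundles data (`name`, `values`) with behavior (`mean`), which is the core idea behind object-oriented programming.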
Course
Conquer NoSQL and supercharge data workflows. Learn Snowflake to work with big data, Postgres JSON for handling document data, and Redis for key-value data.
Course
In this Introduction to DevOps, you’ll master the DevOps basics and learn the key concepts, tools, and techniques to improve productivity.
Course
Master Python testing: Learn methods, create checks, and ensure error-free code with pytest and unittest.
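A small sketch of what such checks might look like with the standard-library `unittest` module (the `normalize` function and the test names are hypothetical examples, not course material):

```python
import unittest


def normalize(values):
    """Hypothetical function under test: scale values so they sum to 1."""
    total = sum(values)
    if total == 0:
        raise ValueError("values must not sum to zero")
    return [v / total for v in values]


class TestNormalize(unittest.TestCase):
    def test_sums_to_one(self):
        # The normalized values should add up to 1 (within float tolerance).
        self.assertAlmostEqual(sum(normalize([1, 2, 3])), 1.0)

    def test_zero_total_raises(self):
        # An all-zero input cannot be normalized and should raise an error.
        with self.assertRaises(ValueError):
            normalize([0, 0])


# Load and run the test case programmatically instead of calling
# unittest.main(), which would exit the interpreter when it finishes.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestNormalize)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

pytest can also collect and run `unittest.TestCase` classes like this one, so the same checks work under either tool.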
Project
Sometimes, things that once worked perfectly suddenly hit a snag. Practice your knowledge of DataFrames to find the problem and fix it!
Course
Gain an introduction to Docker and discover its importance in the data professional’s toolkit. Learn about Docker containers, images, and more.
Chapter
In this chapter, you'll learn how Spark manages data and how you can read and write tables from Python.
Chapter
Bonus: Manipulating data
In this chapter, you'll learn about the pyspark.sql module, which provides optimized data queries to your Spark session.
Chapter
This chapter introduces the exciting world of Big Data, as well as the various concepts and different frameworks for processing Big Data. You will understand why Apache Spark is considered the best framework for Big Data.
Chapter
The main abstraction Spark provides is the resilient distributed dataset (RDD), the engine's fundamental data type. This chapter introduces RDDs and shows how they can be created and processed using RDD Transformations and Actions.
Chapter
In this chapter, you'll learn about Spark SQL, a Spark module for structured data processing. It provides a programming abstraction called DataFrames and can also act as a distributed SQL query engine. This chapter shows how Spark SQL allows you to use DataFrames in Python.
Project
Step into a data engineer's shoes and master data cleaning with PySpark on an e-commerce orders dataset!
Chapter
In this chapter, we learn how to download data files from web servers via the command line. In the process, we also learn about documentation manuals, option flags, and multi-file processing.
Chapter
In the last chapter, we bridge the connection between command line and other data science languages and learn how they can work together. Using Python as a case study, we learn to execute Python on the command line, to install dependencies using the package manager pip, and to build an entire model pipeline using the command line.
Course
Learn about the difference between batching and streaming, scaling streaming systems, and real-world applications.
Course
Master Apache Kafka! From core concepts to advanced architecture, learn to create, manage, and troubleshoot Kafka for real-world data streaming challenges!
Course
In this course, you will learn the fundamentals of Kubernetes and deploy and orchestrate containers using Manifests and kubectl instructions.
Resource
Understand how data engineering can impact your business.
Complete
Earn Statement of Accomplishment
Add this credential to your LinkedIn profile, resume, or CV
Share it on social media and in your performance review