Statistical Thinking in Python (Part 1)

Intermediate

4.6+

Updated 12/2024

Build the foundation you need to think statistically and to speak the language of your data.

Python | Probability & Statistics | 3 hours | 18 videos | 61 exercises | 4,550 XP | 181,089 | Statement of Accomplishment

Course Description

After all the hard work of acquiring data and getting it into a form you can work with, you ultimately want to draw clear, succinct conclusions from it. This crucial last step of a data analysis pipeline hinges on the principles of statistical inference. In this course, you will start building the foundation you need to think statistically, speak the language of your data, and understand what your data is telling you. The foundations of statistical thinking took decades to build, but with the help of computers they can be grasped much faster today. Using Python-based tools, you will get up to speed quickly and begin thinking statistically by the end of this course.

1

Graphical Exploratory Data Analysis

Introduction to Exploratory Data Analysis

What is the goal of statistical inference?

Advantages of graphical EDA

Plotting a histogram

Plotting a histogram of iris data

Axis labels!

Adjusting the number of bins in a histogram

Plot all of your data: Bee swarm plots

Bee swarm plot

Interpreting a bee swarm plot

Plot all of your data: ECDFs

Computing the ECDF

Plotting the ECDF

Comparison of ECDFs

Onward toward the whole story!
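
To give a flavor of the ECDF lessons in this chapter, here is a minimal sketch of how an empirical cumulative distribution function might be computed with NumPy. The function name `ecdf` and the sample values are illustrative assumptions, not taken from the course materials.

```python
import numpy as np

def ecdf(data):
    """Return x (sorted data) and y (cumulative fraction) for an ECDF."""
    n = len(data)
    x = np.sort(data)            # values in ascending order
    y = np.arange(1, n + 1) / n  # fraction of points <= each sorted value
    return x, y

# Made-up measurements, for illustration only
sample = np.array([4.7, 4.5, 4.9, 4.0, 4.6])
x, y = ecdf(sample)
```

Plotting `y` against `x` as unconnected points shows every measurement at once, with no binning choices to make.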

2

Quantitative Exploratory Data Analysis

Introduction to summary statistics: The sample mean and median

Means and medians

Computing means

Percentiles, outliers, and box plots

Computing percentiles

Comparing percentiles to ECDF

Box-and-whisker plot

Variance and standard deviation

Computing the variance

The standard deviation and the variance

Covariance and the Pearson correlation coefficient

Scatter plots

Variance and covariance by looking

Computing the covariance

Computing the Pearson correlation coefficient
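
As a rough illustration of the covariance and correlation topics listed above, the sketch below uses NumPy's `np.cov` and `np.corrcoef`. The helper name `pearson_r` and the data are hypothetical and chosen only to make the result easy to check by hand.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    corr_mat = np.corrcoef(x, y)  # 2x2 correlation matrix
    return corr_mat[0, 1]         # off-diagonal entry is r(x, y)

# Made-up, perfectly linear data (y = 2x)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

# np.cov returns the covariance matrix; by default it divides by n - 1
cov_xy = np.cov(x, y)[0, 1]
r = pearson_r(x, y)
```

Because the covariance carries the units of both variables, the dimensionless Pearson coefficient is usually the easier quantity to compare across data sets.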

3

Thinking Probabilistically: Discrete Variables

Probabilistic logic and statistical inference

What is the goal of statistical inference?

Why do we use the language of probability?

Random number generators and hacker statistics

Generating random numbers using the np.random module

The np.random module and Bernoulli trials

How many defaults might we expect?

Will the bank fail?

Probability distributions and stories: The Binomial distribution

Sampling out of the Binomial distribution

Plotting the Binomial PMF

Poisson processes and the Poisson distribution

Relationship between Binomial and Poisson distributions

How many no-hitters in a season?

Was 2015 anomalous?
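
The "hacker statistics" idea in this chapter, simulating a story many times instead of deriving a formula, can be sketched as follows. This example assumes NumPy's `default_rng` generator rather than whichever `np.random` calls the course itself uses, and the loan scenario numbers (100 loans, default probability 0.05) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded for reproducibility

def perform_bernoulli_trials(n, p, rng):
    """Count successes in n Bernoulli trials, each with success probability p."""
    return int(np.sum(rng.random(n) < p))

# Hypothetical scenario: 100 loans, each defaulting with probability 0.05.
# Repeat the experiment 1000 times to see the spread of outcomes.
n_defaults = np.array(
    [perform_bernoulli_trials(100, 0.05, rng) for _ in range(1000)]
)
mean_defaults = n_defaults.mean()
```

The simulated counts follow the Binomial story (number of successes in `n` trials), so their mean should land near `n * p`.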

4

Thinking Probabilistically: Continuous Variables

Probability density functions

Interpreting PDFs

Interpreting CDFs

Introduction to the Normal distribution

The Normal PDF

The Normal CDF

The Normal distribution: Properties and warnings

Gauss and the 10 Deutschmark banknote

Are the Belmont Stakes results Normally distributed?

What are the chances of a horse matching or beating Secretariat's record?

The Exponential distribution

Matching a story and a distribution

Waiting for the next Secretariat

If you have a story, you can simulate it!

Distribution of no-hitters and cycles

Final thoughts
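
For the continuous case, a similar simulation can estimate a tail probability under a Normal model. The sketch below is only an illustration: the mean, standard deviation, and threshold are made-up values, not figures from the course's data sets.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility

# Hypothetical Normal model parameters
mu, sigma = 150.0, 2.0

# Draw many samples from the assumed Normal distribution
samples = rng.normal(mu, sigma, size=100_000)

# Estimate the probability of a value at or below the threshold,
# which here sits two standard deviations below the mean
threshold = 146.0
prob = np.mean(samples <= threshold)
```

The estimate should sit close to the theoretical Normal CDF value at two standard deviations below the mean, roughly 0.023.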

Don’t just take our word for it

4.6 from 31 reviews

Rating breakdown: 71%, 23%, 6%, 0%, 0%

Miguel B.

12 days

Good

Abe A.

4 months

Amazing course, one of the best on DataCamp.

Vlad P.

12 months

The course provides fundamentals of key operations within EDA, using raw NumPy functions.

Rachel Z.

about 1 year

Great introductory course in statistics!

Vitalis A.

about 1 year

Great content