Foundations of Inference in R

Intermediate

Updated 12/2024

Learn how to draw conclusions about a population from a sample of data via a process known as statistical inference.

R | Probability & Statistics | 4 hours | 17 videos | 58 exercises | 4,350 XP | 35,531 | Statement of Accomplishment

Course Description

One of the foundational aspects of statistical analysis is inference: the process of drawing conclusions about a larger population from a sample of data. Although counterintuitive, the standard practice is to attempt to disprove a claim that is not of interest (the null hypothesis). For example, to show that one medical treatment is better than another, we assume that the two treatments lead to equal survival rates and then check whether the data contradict that assumption. We also introduce the p-value, which measures the degree of disagreement between the data and the null hypothesis: the probability of observing data at least as extreme as ours if the null hypothesis were true. Finally, we dive into confidence intervals, which estimate the magnitude of the effect of interest (e.g., how much better one treatment is than another).
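As a rough illustration of the treatment example, here is a minimal sketch of a randomization test in base R. The data are made up for this sketch (35 of 50 patients surviving under a hypothetical treatment A, 25 of 50 under treatment B), and the helper `diff_in_props()` is a hypothetical name, not something taken from the course materials.

```r
# Hypothetical data: survival (1 = survived) under two treatments.
set.seed(42)
treatment <- rep(c("A", "B"), each = 50)
survived  <- c(rep(c(1, 0), times = c(35, 15)),   # treatment A: 35 of 50 survive
               rep(c(1, 0), times = c(25, 25)))   # treatment B: 25 of 50 survive

# Statistic of interest: difference in survival proportions (A minus B).
# diff_in_props() is an illustrative helper, not a course function.
diff_in_props <- function(outcome, group) {
  mean(outcome[group == "A"]) - mean(outcome[group == "B"])
}
observed <- diff_in_props(survived, treatment)

# Null distribution: shuffling the treatment labels breaks any link between
# treatment and survival, which mimics the "equal survival rates" assumption.
n_perm <- 2000
null_stats <- replicate(n_perm, diff_in_props(survived, sample(treatment)))

# One-sided p-value: share of shuffled statistics at least as large as observed.
p_value <- mean(null_stats >= observed)
p_value
```

A small p-value here would indicate that a difference as large as the observed one rarely arises when the labels are shuffled, which is the sense in which the data "disagree" with the null hypothesis.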

Prerequisites

Introduction to Regression in R, Hypothesis Testing in R

1

Introduction to ideas of inference

Welcome to the course!

Hypotheses (1)

Hypotheses (2)

Randomized distributions

Working with the NHANES data

Calculating statistic of interest

Randomized data under null model of independence

Randomized statistics and dotplot

Randomization density

Using the randomization distribution

Do the data come from the population?

What can you conclude?

Study conclusions

2

Completing a randomization test: gender discrimination

Example: gender discrimination

Gender discrimination hypotheses

Summarizing gender discrimination

Step-by-step through the permutation

Randomizing gender discrimination

Distribution of statistics

Reflecting on analysis

Critical region

Two-sided critical region

How does sample size affect results?

Sample size in randomization distribution

Sample size for critical region

What is a p-value?

Calculating the p-values

Practice calculating p-values

Calculating two-sided p-values

Summary of gender discrimination

3

Hypothesis testing errors: opportunity cost

Example: opportunity cost

Summarizing opportunity cost (1)

Plotting opportunity cost

Randomizing opportunity cost

Summarizing opportunity cost (2)

Opportunity cost conclusion

Errors and their consequences

Different choice of error rate

Errors for two-sided hypotheses

p-value for two-sided hypotheses: opportunity costs

Summary of opportunity costs

4

Confidence intervals

Parameters and confidence intervals

What is the parameter?

Hypothesis test or confidence interval?

Bootstrapping

Resampling from a sample

Visualizing the variability of p-hat

Always resample the original number of observations

Variability in p-hat

Empirical Rule

Bootstrap t-confidence interval

Bootstrap percentile interval

Interpreting CIs and technical conditions

Sample size effects on bootstrap CIs

Sample proportion value effects on bootstrap CIs

Percentile effects on bootstrap CIs

Summary of statistical inference
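
The bootstrap topics in chapter 4 can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the sample (60 successes out of 100) is invented, and the variable names are hypothetical rather than taken from the course exercises. It resamples the original number of observations with replacement, then forms a percentile interval and an approximate interval of p-hat plus or minus two bootstrap standard errors.

```r
# Hypothetical sample: 60 successes out of 100 observations.
set.seed(1)
sample_data <- rep(c(1, 0), times = c(60, 40))
p_hat <- mean(sample_data)

# Bootstrap: resample the original number of observations with replacement
# and recompute the sample proportion each time.
n_boot <- 2000
boot_p_hats <- replicate(
  n_boot,
  mean(sample(sample_data, size = length(sample_data), replace = TRUE))
)

# Percentile interval: middle 95% of the bootstrap proportions.
percentile_ci <- quantile(boot_p_hats, probs = c(0.025, 0.975))

# Approximate interval using the bootstrap standard error:
# p-hat plus or minus two standard errors (empirical rule).
se_ci <- p_hat + c(-2, 2) * sd(boot_p_hats)

percentile_ci
se_ci
```

With a reasonably symmetric bootstrap distribution, the two intervals should be close to each other; both widen as the sample gets smaller.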
