Case Study: Exploratory Data Analysis in R
Use data manipulation and visualization skills to explore the historical voting of the United Nations General Assembly.
4 hours · 15 videos · 58 exercises · 54,157 learners · Statement of Accomplishment
Course Description
Once you've started learning tools for data manipulation and visualization like dplyr and ggplot2, this course gives you a chance to use them in action on a real dataset. You'll explore the historical voting of the United Nations General Assembly, including analyzing differences in voting between countries, across time, and among international issues. In the process you'll gain more practice with the dplyr and ggplot2 packages, learn about the broom package for tidying model output, and experience the kind of start-to-finish exploratory analysis common in data science.
In the following tracks
Data Manipulation in R

Chapter 1
Data cleaning and summarizing with dplyr
Free
The best way to learn data wrangling skills is to apply them to a specific case study. Here you'll learn how to clean and filter the United Nations voting dataset using the dplyr package, and how to summarize it into smaller, interpretable units.
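As a rough illustration of the clean, filter, and summarize workflow described above, here is a minimal sketch on a hypothetical toy data frame. The column names, the vote coding (1 = yes, 2 = abstain, 3 = no, 8 = absent), and the session-to-year offset are assumptions made for this example, not details confirmed by this page.

```r
library(dplyr)

# Hypothetical toy data standing in for the UN voting dataset.
# Assumed coding: 1 = yes, 2 = abstain, 3 = no, 8 = absent.
votes <- tibble(
  session = c(1, 1, 1, 2, 2, 2),
  country = c("A", "B", "C", "A", "B", "C"),
  vote    = c(1, 3, 8, 1, 1, 2)
)

votes_processed <- votes %>%
  filter(vote <= 3) %>%            # keep only yes / abstain / no
  mutate(year = session + 1945)    # assumed mapping from session to year

# Collapse to one row per year: number of votes and share of "yes" votes
by_year <- votes_processed %>%
  group_by(year) %>%
  summarize(total = n(), percent_yes = mean(vote == 1))

# Sort the summary so the years with the most agreement come first
by_year_sorted <- by_year %>%
  arrange(desc(percent_yes))
```

The same `group_by()` and `summarize()` pattern could be applied per country instead of per year by swapping the grouping column.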
The United Nations Voting Dataset (50 xp)
Filtering rows (100 xp)
Adding a year column (100 xp)
Adding a country column (100 xp)
Grouping and summarizing (50 xp)
Summarizing the full dataset (100 xp)
Summarizing by year (100 xp)
Summarizing by country (100 xp)
Sorting and filtering summarized data (50 xp)
Sorting by percentage of "yes" votes (100 xp)
Filtering summarized output (100 xp)

Chapter 2
Data visualization with ggplot2
Once you've cleaned and summarized data, you'll want to visualize them to understand trends and extract insights. Here you'll use the ggplot2 package to explore trends in United Nations voting within each country over time.
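A minimal sketch of this kind of plot, assuming a summarized table with one row per year and country. The data values and country labels below are made up for illustration.

```r
library(ggplot2)

# Hypothetical summary table: share of "yes" votes per country and year
by_year_country <- data.frame(
  year        = rep(c(1950, 1960, 1970, 1980), times = 2),
  country     = rep(c("Country A", "Country B"), each = 4),
  percent_yes = c(0.40, 0.50, 0.55, 0.65, 0.80, 0.70, 0.65, 0.50)
)

# One line per country, each in its own panel with an independent y-axis
p <- ggplot(by_year_country, aes(x = year, y = percent_yes)) +
  geom_line() +
  facet_wrap(~ country, scales = "free_y")

# print(p) renders the plot on the active graphics device
```

Faceting keeps each country's trend readable; mapping `country` to the `color` aesthetic instead would overlay the lines in a single panel.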
Visualization with ggplot2 (50 xp)
Choosing an aesthetic (50 xp)
Plotting a line over time (100 xp)
Other ggplot2 layers (100 xp)
Visualizing by country (50 xp)
Summarizing by year and country (100 xp)
Plotting just the UK over time (100 xp)
Plotting multiple countries (100 xp)
Faceting by country (50 xp)
Faceting the time series (100 xp)
Faceting with free y-axis (100 xp)
Choose your own countries (100 xp)

Chapter 3
Tidy modeling with broom
While visualization helps you understand one country at a time, statistical modeling lets you quantify trends across many countries and interpret them together. Here you'll learn to use the tidyr, purrr, and broom packages to fit linear models to each country, and understand and compare their outputs.
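The nest, fit, tidy, unnest pattern named above might look roughly like the following. This is a sketch on hypothetical data, not the course's own solution code; the helper name `fit_trend` and all values are invented for the example.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

# Hypothetical summary table: share of "yes" votes per country and year
by_year_country <- tibble(
  year        = rep(c(1950, 1960, 1970, 1980), times = 2),
  country     = rep(c("Country A", "Country B"), each = 4),
  percent_yes = c(0.40, 0.50, 0.55, 0.65, 0.80, 0.70, 0.65, 0.50)
)

# Hypothetical helper: linear trend of percent_yes over time for one country
fit_trend <- function(df) lm(percent_yes ~ year, data = df)

country_coefficients <- by_year_country %>%
  nest(data = -country) %>%              # one row per country, data in a list column
  mutate(model  = map(data, fit_trend),  # fit one model per country
         tidied = map(model, tidy)) %>%  # turn each model into a data frame of terms
  unnest(tidied)                         # one row per country and model term

# Keep only the slope term and sort from steepest decline to steepest rise
slopes <- country_coefficients %>%
  filter(term == "year") %>%
  select(country, estimate, p.value) %>%
  arrange(estimate)
```

With many countries, the `p.value` column can then be filtered (ideally after a multiple-testing correction such as `p.adjust()`) to keep only trends that are unlikely to be noise.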
Linear regression (50 xp)
Linear regression on the United States (100 xp)
Finding the slope of a linear regression (50 xp)
Finding the p-value of a linear regression (50 xp)
Tidying models with broom (50 xp)
Tidying a linear regression model (100 xp)
Combining models for multiple countries (100 xp)
Nesting for multiple models (50 xp)
Nesting a data frame (100 xp)
List columns (100 xp)
Unnesting (100 xp)
Fitting multiple models (50 xp)
Performing linear regression on each nested dataset (100 xp)
Tidy each linear regression model (100 xp)
Unnesting a data frame (100 xp)
Working with many tidy models (50 xp)
Filtering model terms (100 xp)
Filtering for significant countries (100 xp)
Sorting by slope (100 xp)

Chapter 4
Joining and tidying
In this chapter, you'll learn to combine multiple related datasets, such as incorporating information about each resolution's topic into your vote analysis. You'll also learn how to turn untidy data into tidy data, and see how tidy data can guide your exploration of topics and countries over time.
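To make the join-then-tidy idea concrete, here is a small sketch with two hypothetical tables linked by a resolution identifier. The column names (`rcid`, `colonialism`, `human_rights`) and the 0/1 topic indicators are assumptions for illustration only.

```r
library(dplyr)
library(tidyr)

# Hypothetical vote table: one row per resolution and country (1 = yes, 3 = no)
votes <- tibble(
  rcid    = c(1, 1, 2, 2),
  country = c("A", "B", "A", "B"),
  vote    = c(1, 3, 1, 1)
)

# Hypothetical description table: one 0/1 indicator column per topic (untidy)
descriptions <- tibble(
  rcid         = c(1, 2),
  colonialism  = c(1, 0),
  human_rights = c(0, 1)
)

# Attach the topic indicators to every vote
votes_joined <- inner_join(votes, descriptions, by = "rcid")

# Tidy: one row per vote and topic, keeping only topics that apply
votes_tidied <- votes_joined %>%
  gather(topic, has_topic, colonialism, human_rights) %>%
  filter(has_topic == 1)

# Share of "yes" votes per topic
by_topic <- votes_tidied %>%
  group_by(topic) %>%
  summarize(percent_yes = mean(vote == 1))
```

`gather()` matches the exercise titles in this chapter; in current tidyr releases it is superseded by `pivot_longer()`, which would serve the same purpose here.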
Joining datasets (50 xp)
Joining datasets with inner_join (100 xp)
Filtering the joined dataset (100 xp)
Visualizing colonialism votes (100 xp)
Tidy data (50 xp)
Tidy data observations (50 xp)
Using gather to tidy a dataset (100 xp)
Recoding the topics (100 xp)
Summarize by country, year, and topic (100 xp)
Visualizing trends in topics for one country (100 xp)
Tidy modeling by topic and country (50 xp)
Nesting by topic and country (100 xp)
Interpreting tidy models (100 xp)
Steepest trends by topic (50 xp)
Checking models visually (100 xp)
Conclusion (50 xp)
Collaborators
David Robinson
Principal Data Scientist at Heap