Introduction to the Tidyverse
Get started on the path to exploring and visualizing your own data with the tidyverse, a powerful and popular collection of data science tools within R.
4 Hours, 16 Videos, 50 Exercises, 282,831 Learners
This is an introduction to the programming language R, focused on a powerful set of tools known as the Tidyverse. You'll learn the intertwined processes of data manipulation and visualization using the tools dplyr and ggplot2. You'll learn to manipulate data by filtering, sorting, and summarizing a real dataset of historical country data in order to answer exploratory questions. You'll then learn to turn this processed data into informative line plots, bar plots, histograms, and more with the ggplot2 package. You’ll get a taste of the value of exploratory data analysis and the power of Tidyverse tools. This is a suitable introduction for those who have no previous experience in R and are interested in performing data analysis.
In this chapter, you'll learn to do three things with a table: filter for particular observations, arrange the observations in a desired order, and mutate to add or change a column. You'll see how each of these steps allows you to answer questions about your data.
- The gapminder dataset (50 xp)
- Loading the gapminder and dplyr packages (100 xp)
- Understanding a data frame (50 xp)
- The filter verb (50 xp)
- Filtering for one year (100 xp)
- Filtering for one country and one year (100 xp)
- The arrange verb (50 xp)
- Arranging observations by life expectancy (100 xp)
- Filtering and arranging (100 xp)
- The mutate verb (50 xp)
- Using mutate to change or create a column (100 xp)
- Combining filter, mutate, and arrange (100 xp)
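The three verbs from this chapter can be combined in a single pipeline. The following is a minimal sketch, not the course's own solution code. It assumes the gapminder and dplyr packages are installed and that the `gapminder` data frame has `year` and `lifeExp` columns; the names `result` and `lifeExpMonths` are illustrative choices.

```r
library(gapminder)
library(dplyr)

# Keep only the 2007 observations, add a derived column,
# then sort from highest to lowest life expectancy.
result <- gapminder %>%
  filter(year == 2007) %>%                  # filter: pick particular observations
  mutate(lifeExpMonths = 12 * lifeExp) %>%  # mutate: add a new column
  arrange(desc(lifeExpMonths))              # arrange: order the rows

head(result)
```

Each verb takes a data frame and returns a data frame, which is why the steps chain together with `%>%`.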
Often a better way to understand and present data is as a graph. In this chapter, you'll learn the essential skills of data visualization using the ggplot2 package, and you'll see how the dplyr and ggplot2 packages work closely together to create informative graphs.
- Visualizing with ggplot2 (50 xp)
- Variable assignment (100 xp)
- Comparing population and GDP per capita (100 xp)
- Comparing population and life expectancy (100 xp)
- Log scales (50 xp)
- Putting the x-axis on a log scale (100 xp)
- Putting the x- and y- axes on a log scale (100 xp)
- Additional aesthetics (50 xp)
- Adding color to a scatter plot (100 xp)
- Adding size and color to a plot (100 xp)
- Faceting (50 xp)
- Creating a subgraph for each continent (100 xp)
- Faceting by year (100 xp)
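As a rough illustration of how dplyr and ggplot2 fit together, the sketch below filters the data with dplyr and then passes the result to ggplot2. It assumes the gapminder package's `gdpPercap`, `lifeExp`, `pop`, and `continent` columns; the variable names `gapminder_2007`, `scatter`, and `faceted` are illustrative, not taken from the course.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

# Variable assignment: store the filtered data for reuse.
gapminder_2007 <- gapminder %>%
  filter(year == 2007)

# Scatter plot with a log-scaled x-axis and two additional aesthetics.
scatter <- ggplot(gapminder_2007,
                  aes(x = gdpPercap, y = lifeExp,
                      color = continent, size = pop)) +
  geom_point() +
  scale_x_log10()

# Faceting: one subgraph per continent.
faceted <- scatter + facet_wrap(~ continent)

print(scatter)
print(faceted)
```

Because a plot is built by adding layers with `+`, the faceted version can reuse the stored scatter plot rather than repeating its definition.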
Grouping and summarizing
So far you've been answering questions about individual country-year pairs, but you may be interested in aggregations of the data, such as the average life expectancy of all countries within each year. Here you'll learn to use the group_by and summarize verbs, which collapse large datasets into manageable summaries.
- The summarize verb (50 xp)
- Summarizing the median life expectancy (100 xp)
- Summarizing the median life expectancy in 1957 (100 xp)
- Summarizing multiple variables in 1957 (100 xp)
- The group_by verb (50 xp)
- Summarizing by year (100 xp)
- Summarizing by continent (100 xp)
- Summarizing by continent and year (100 xp)
- Visualizing summarized data (50 xp)
- Visualizing median life expectancy over time (100 xp)
- Visualizing median GDP per capita per continent over time (100 xp)
- Comparing median life expectancy and median GDP per continent in 2007 (100 xp)
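Grouping and summarizing might look like the following sketch, which assumes the gapminder and dplyr packages (dplyr 1.0 or later for the `.groups` argument). The summary column names `medianLifeExp` and `medianGdpPercap` are illustrative.

```r
library(gapminder)
library(dplyr)

# Collapse the country-level rows into one row per continent and year.
by_year_continent <- gapminder %>%
  group_by(continent, year) %>%
  summarize(medianLifeExp = median(lifeExp),
            medianGdpPercap = median(gdpPercap),
            .groups = "drop")  # return an ungrouped data frame

head(by_year_continent)
```

Without `group_by()`, the same `summarize()` call would collapse the entire dataset into a single row.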
Types of visualizations
In this chapter, you'll learn how to create line plots, bar plots, histograms, and boxplots. You'll see how each plot requires different methods of data manipulation and preparation, and you’ll understand how each of these plot types plays a different role in data analysis.
- Line plots (50 xp)
- Visualizing median GDP per capita over time (100 xp)
- Visualizing median GDP per capita by continent over time (100 xp)
- Bar plots (50 xp)
- Visualizing median GDP per capita by continent (100 xp)
- Visualizing GDP per capita by country in Oceania (100 xp)
- Histograms (50 xp)
- Visualizing population (100 xp)
- Visualizing population with x-axis on a log scale (100 xp)
- Boxplots (50 xp)
- Comparing GDP per capita across continents (100 xp)
- Adding a title to your graph (100 xp)
- Conclusion (50 xp)
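The four plot types differ mainly in how the data is prepared and which geom is used. The sketch below is one possible version of each, assuming the gapminder, dplyr, and ggplot2 packages; the object names, the bin count, and the plot title are illustrative choices rather than the course's own.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

# Line plot: one summarized value per year.
by_year <- gapminder %>%
  group_by(year) %>%
  summarize(medianGdpPercap = median(gdpPercap))
line_plot <- ggplot(by_year, aes(x = year, y = medianGdpPercap)) +
  geom_line() +
  expand_limits(y = 0)  # make sure the y-axis includes zero

gapminder_2007 <- gapminder %>%
  filter(year == 2007)

# Bar plot: one summarized value per category.
by_continent <- gapminder_2007 %>%
  group_by(continent) %>%
  summarize(medianGdpPercap = median(gdpPercap))
bar_plot <- ggplot(by_continent, aes(x = continent, y = medianGdpPercap)) +
  geom_col()

# Histogram: the distribution of a single variable, on a log scale.
hist_plot <- ggplot(gapminder_2007, aes(x = pop)) +
  geom_histogram(bins = 30) +
  scale_x_log10()

# Boxplot: compare a distribution across categories, with a title.
box_plot <- ggplot(gapminder_2007, aes(x = continent, y = gdpPercap)) +
  geom_boxplot() +
  scale_y_log10() +
  ggtitle("GDP per capita by continent, 2007")

print(line_plot)
print(bar_plot)
print(hist_plot)
print(box_plot)
```

Line and bar plots are drawn from summarized data, while histograms and boxplots take the unsummarized observations and compute their own statistics.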
In the following tracks
- Data Analyst with R
- Data Scientist with R
- Data Scientist Professional with R
- R Programmer
- Tidyverse Fundamentals with R
David Robinson
Principal Data Scientist at Heap
Dave is the Principal Data Scientist at Heap. He has worked as a data scientist at DataCamp and Stack Overflow, and received his PhD in Quantitative and Computational Biology from Princeton University. Follow him at @drob on Twitter or on his blog, Variance Explained.
Don’t just take our word for it
4.8 from 64 reviews
- Paweł K., 4 days
The course is very clear and useful!
- Maria D., 12 days
"Introduction to the tidyverse" was very important for me, I thought I could never learn the topics inside, but DataCamp made it easy and I completed the course in a couple of days
- Sarah N., 18 days
The pace and level of detail was excellent. I've now signed up for more courses and am so excited to learn new skills on my own schedule!
- John G., 21 days
Good course. I learned a lot
- Jana B., 23 days
Very good, highly understandable