String Manipulation with stringr in R
Learn how to pull character strings apart, put them back together and use the stringr package.
4 hours, 17 videos, 60 exercises, 30,841 learners, Statement of Accomplishment
Course Description
Character strings can turn up in all stages of a data science project. You might have to clean messy string input before analysis, extract data that is embedded in text or automatically turn numeric results into a sentence to include in a report. Perhaps the strings themselves are the data of interest, and you need to detect and match patterns within them. This course will help you master these tasks by teaching you how to pull strings apart, put them back together and use stringr to detect, extract, match and split strings using regular expressions, a powerful way to express patterns.
In the following Tracks
R Developer
Text Mining in R
1
String basics
Free. You'll start with some basics: how to enter strings in R, how to control how numbers are converted to strings, and finally how to combine strings to produce output that mixes text with nicely formatted numbers.
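As a rough illustration of these basics, the sketch below uses only base R functions that the chapter's exercise titles name (`format()`, `formatC()`) plus `writeLines()` and `paste()`. The sample values and variable names are made up for the example and are not taken from the course.

```r
# A string containing double quotes is easiest to enter inside single quotes
line <- 'She said "hi"'
writeLines(line)  # shows the string itself, without escape backslashes

# format() pads the elements of a numeric vector to a common width
padded <- format(c(1, 10, 100))

# formatC() gives C-style control: fixed notation, 2 decimals, comma separators
amount <- formatC(1234567.891, format = "f", digits = 2, big.mark = ",")

# paste() joins its arguments with a space by default
sentence <- paste("Total sales:", amount)
writeLines(sentence)
```

Printing `line` at the console would show escaped quotes, which is why `writeLines()` is used here to see the string as it really is.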
Welcome! (50 xp)
Quotes (100 xp)
What you see isn't always what you have (100 xp)
Escape sequences (100 xp)
Turning numbers into strings (50 xp)
Using format() with numbers (100 xp)
Controlling other aspects of the string (100 xp)
formatC() (100 xp)
Putting strings together (50 xp)
Annotation of numbers (100 xp)
A very simple table (100 xp)
Let's order pizza! (100 xp)
2
Introduction to stringr
Time to meet stringr! You'll start by learning about some stringr functions that are very similar to some base R functions, then how to detect specific patterns in strings, how to split strings apart and how to find and replace parts of strings.
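A minimal sketch of the kinds of stringr calls this chapter covers, assuming the stringr package is installed. The `fruits` vector and the other inputs are invented for illustration; `fixed()` is used so that the patterns are matched literally rather than as regular expressions.

```r
library(stringr)

fruits <- c("apple", "banana", "pear")

# Counterparts of base R's paste0(), nchar() and substr()
joined  <- str_c("I like ", fruits)
lengths <- str_length(fruits)
starts  <- str_sub(fruits, 1, 3)

# Detect, subset and count literal matches
has_an <- str_detect(fruits, fixed("an"))
with_p <- str_subset(fruits, fixed("p"))
n_a    <- str_count(fruits, fixed("a"))

# Split a string apart, then find and replace
pieces <- str_split("Tom & Jerry", fixed(" & "))
tidy   <- str_replace_all("a-b-c", fixed("-"), " ")
```

Note that `str_split()` returns a list with one element per input string, so `pieces[[1]]` holds the two names.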
Introducing stringr (50 xp)
Putting strings together with stringr (100 xp)
String length (100 xp)
Extracting substrings (100 xp)
Hunting for matches (50 xp)
Detecting matches (100 xp)
Subsetting strings based on match (100 xp)
Counting matches (100 xp)
Splitting strings (50 xp)
Parsing strings into variables (100 xp)
Some simple text statistics (100 xp)
Replacing matches in strings (50 xp)
Replacing to tidy strings (100 xp)
Review (100 xp)
Final challenges (100 xp)
3
Pattern matching with regular expressions
In this chapter you'll learn about regular expressions, a language for describing patterns in strings. By combining regular expressions with the stringr functions you'll greatly increase your power to manipulate strings.
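To give a flavour of combining regular expressions with stringr, here is a small sketch that assumes stringr is installed and writes the patterns as plain regex strings. The course itself may build patterns with other helpers; the inputs below are invented for the example.

```r
library(stringr)

words <- c("cat", "coat", "scat", "dog")

# Anchors: ^ matches the start of the string, $ matches the end
starts_c <- str_subset(words, "^c")
ends_at  <- str_subset(words, "at$")

# . matches any single character; | expresses alternation
c_any_t    <- str_detect(words, "c.t")
cat_or_dog <- str_detect(words, "cat|dog")

# A character class matches one character from a set
greys <- str_detect(c("grey", "gray", "groy"), "gr[ae]y")

# Shortcut \d (a digit) with repetition {n}; backslashes are doubled in R strings
phone <- str_extract("Call 555-1234 today", "\\d{3}-\\d{4}")
```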
Regular expressions (50 xp)
Matching the start or end of the string (100 xp)
Matching any character (100 xp)
Combining with stringr functions (100 xp)
More regular expressions (50 xp)
Alternation (100 xp)
Character classes (100 xp)
Repetition (100 xp)
Shortcuts (50 xp)
Hunting for phone numbers (100 xp)
Extracting age and gender from accident narratives (100 xp)
Parsing age and gender into pieces (100 xp)
4
More advanced matching and manipulation
Now for two advanced ways to use regular expressions along with stringr: selecting parts of a match (a.k.a. capturing) and referring back to parts of a match (a.k.a. back-referencing). You'll also learn to deal with strings or patterns that contain Unicode characters (e.g. é).
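The ideas of capturing, back-referencing and Unicode matching can be sketched as follows, assuming stringr is installed. The phone number and word list are invented for the example, and the name is simply reused from this page; the Unicode part relies on the ICU regex engine behind stringr, where `\X` stands for a single grapheme.

```r
library(stringr)

# Capturing: str_match() returns the full match plus one column per group
parts <- str_match("555-1234", "(\\d{3})-(\\d{4})")

# Backreference inside a pattern: \1 repeats whatever group 1 matched,
# so "(.)\\1" finds a doubled character
doubled <- str_subset(c("hello", "world", "book"), "(.)\\1")

# Backreferences in a replacement: swap the two captured words
swapped <- str_replace("Wickham, Charlotte", "(\\w+), (\\w+)", "\\2 \\1")

# Unicode: é can be one code point (U+00E9) or "e" plus a combining accent (U+0301)
composed   <- "\u00e9"
decomposed <- "e\u0301"
code_points <- str_length(c(composed, decomposed))
graphemes   <- str_count(c(composed, decomposed), "\\X")
```

The two forms of é look the same on screen but differ in their number of code points, which is why matching by grapheme can be more robust than matching a specific code point.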
Capturing (50 xp)
Capturing parts of a pattern (100 xp)
Pulling out parts of a phone number (100 xp)
Extracting age and gender again (100 xp)
Backreferences (50 xp)
Using backreferences in patterns (100 xp)
Replacing with regular expressions (100 xp)
Replacing with backreferences (100 xp)
Unicode and pattern matching (50 xp)
Matching a specific code point or code groups (100 xp)
Matching a single grapheme (100 xp)
5
Case studies
Practice your string manipulation skills on a couple of case studies. You'll also learn a few new skills: reading strings into R and handling problems of case (e.g. A versus a).
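One possible way to read strings into R and cope with differences in case is sketched below. It assumes base R's `readLines()` for the reading step (the course may use a different reader) and a small temporary file with made-up text, so the file path and its contents are purely hypothetical.

```r
library(stringr)

# Write a tiny file so the example is self-contained (made-up content)
path <- tempfile(fileext = ".txt")
writeLines(c("JACK. Good morning.", "Then Jack left.", "Nobody else spoke."), path)

# readLines() returns one string per line of the file
lines <- readLines(path)

# Option 1: convert everything to one case before matching
via_lower <- str_detect(str_to_lower(lines), "jack")

# Option 2: keep the text as is and make the pattern case-insensitive
via_regex <- str_detect(lines, regex("jack", ignore_case = TRUE))
```

Both options find the first two lines; the second leaves the original capitalisation untouched, which can matter if the matched text is extracted afterwards.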
Datasets
DNA sequences from the genome of Yersinia pestis
Narratives
Adverbs
Importance of being earnest
Cat-related accidents
Collaborators
Prerequisites
Intermediate R
Charlotte Wickham
Assistant Professor at Oregon State University
Charlotte is an Assistant Professor in the Department of Statistics at Oregon State University and an avid R programmer with a passion for teaching. Her interests lie in spatiotemporal data, statistical graphics and computing, and environmental statistics.