As a data scientist, you will need to clean data, wrangle and munge it, visualize it, build predictive models, and interpret those models. Before you can do any of that, however, you need to know how to get data into Python. In the prequel to this course, you learned many ways to import data into Python: from flat files such as .txt and .csv; from files native to other software, such as Excel spreadsheets and Stata, SAS, and MATLAB files; and from relational databases such as SQLite and PostgreSQL. In this course, you'll extend that knowledge by learning to import data from the web and to pull data from Application Programming Interfaces (APIs), such as the Twitter streaming API, which lets you stream tweets in real time.
Importing data from the Internet
The web is a rich source of data from which you can extract various types of insights and findings. In this chapter, you will learn how to get data from the web, whether it is stored in files or in HTML. You'll also learn the basics of scraping and parsing web data.

Importing flat files from the web (50 xp)
Importing flat files from the web: your turn! (100 xp)
Opening and reading flat files from the web (100 xp)
Importing non-flat files from the web (100 xp)
HTTP requests to import files from the web (50 xp)
Performing HTTP requests in Python using urllib (100 xp)
Printing HTTP request results in Python using urllib (100 xp)
Performing HTTP requests in Python using requests (100 xp)
Scraping the web in Python (50 xp)
Parsing HTML with BeautifulSoup (100 xp)
Turning a webpage into data using BeautifulSoup: getting the text (100 xp)
Turning a webpage into data using BeautifulSoup: getting the hyperlinks (100 xp)
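
To give a flavor of what this first chapter covers, here is a minimal sketch of the two core ideas: reading a flat file from a URL and pulling hyperlinks out of HTML. It uses only the standard library. The helper names `read_csv_from_url` and `LinkCollector` are made up for this illustration, `html.parser` stands in for BeautifulSoup to avoid a third-party dependency, and the `data:` URL and sample HTML are invented placeholders for a real web address and page.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import quote
from urllib.request import urlopen


def read_csv_from_url(url, delimiter=","):
    """Download a delimited text file and return its rows as dictionaries."""
    with urlopen(url) as response:
        text = response.read().decode("utf-8")
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Assumption: a data: URL stands in for a real http(s) address so the
# example runs without network access. The CSV content is invented.
sample_url = "data:text/csv," + quote("name,score\nada,90\nlin,85\n")
rows = read_csv_from_url(sample_url)

# Assumption: this HTML snippet is invented sample input.
sample_html = '<p><a href="https://example.com/a">A</a> and <a href="/b">B</a> <a>no href</a></p>'
collector = LinkCollector()
collector.feed(sample_html)
links = collector.links

print(rows)
print(links)
```

With a real page, the same `urlopen` call would take an `http://` or `https://` address, and the HTML passed to the parser would be the decoded response body.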
Interacting with APIs to import data from the web
In this chapter, you will gain a deeper understanding of how to import data from the web. You will learn the basics of extracting data from APIs, gain insight into the importance of APIs, and practice extracting data by diving into the OMDB and Library of Congress APIs.

Introduction to APIs and JSONs (50 xp)
Pop quiz: What exactly is a JSON? (50 xp)
Loading and exploring a JSON (100 xp)
Pop quiz: Exploring your JSON (50 xp)
APIs and interacting with the world wide web (50 xp)
Pop quiz: What's an API? (50 xp)
API requests (100 xp)
JSON–from the web to Python (100 xp)
Checking out the Wikipedia API (100 xp)
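
The general pattern behind an API request can be sketched as follows: build a query URL from parameters, fetch the response, and decode the JSON into a Python dictionary. This is an illustrative sketch, not any particular API's client. The endpoint `https://api.example.com/`, the parameter names `t` and `apikey`, and the sample payload are all assumptions made for the example, and a `data:` URL again stands in for a live endpoint so the code runs offline.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_query_url(base_url, params):
    """Append URL-encoded query parameters to a base URL."""
    return base_url + "?" + urlencode(params)


def fetch_json(url):
    """Fetch a URL and decode its body as JSON."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


# Assumption: hypothetical endpoint and parameter names, for illustration only.
query_url = build_query_url(
    "https://api.example.com/",
    {"t": "the social network", "apikey": "YOUR_KEY"},
)
print(query_url)

# Assumption: an invented JSON payload served through a data: URL,
# standing in for the response a real API would return.
sample_payload = json.dumps({"Title": "The Social Network", "Year": "2010"})
movie = fetch_json("data:application/json," + quote(sample_payload))

# A decoded JSON object is an ordinary dict, so it can be explored directly.
for key, value in movie.items():
    print(key + ":", value)
```

Against a live API, `fetch_json(query_url)` would replace the `data:` URL call, subject to that API's own authentication and parameter rules.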
Diving deep into the Twitter API
In this chapter, you will consolidate your knowledge of interacting with APIs in a deep dive into the Twitter streaming API. You'll learn how to stream real-time Twitter data, and how to analyze and visualize it.
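
As a rough idea of the analysis step, streamed tweets are commonly stored one JSON object per line, and a first pass often counts how many tweets mention each keyword. The sketch below assumes that layout and a `text` field on each tweet. The function name `count_keywords` and the sample lines are invented for illustration, and the code does not connect to Twitter.

```python
import json
import re
from collections import Counter


def count_keywords(json_lines, keywords):
    """Count how many tweets mention each keyword, ignoring case."""
    counts = Counter({keyword: 0 for keyword in keywords})
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between tweets
        text = json.loads(line).get("text", "")
        for keyword in keywords:
            if re.search(re.escape(keyword), text, flags=re.IGNORECASE):
                counts[keyword] += 1
    return counts


# Assumption: invented sample data in place of a captured tweet stream.
sample_stream = [
    '{"text": "Learning Python for data science"}',
    '{"text": "APIs return JSON, and python parses it"}',
    "",
    '{"text": "Streaming data is fun"}',
]
counts = count_keywords(sample_stream, ["python", "json", "stream"])
print(counts)
```

The resulting counts are a plain mapping from keyword to number of tweets, which is a convenient shape to hand to a bar chart.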
In the following tracks: Data Engineer, Data Scientist Professional with Python, Importing & Cleaning Data with Python
Prerequisites: Introduction to Importing Data in Python
Hugo Bowne-Anderson
Hugo is a data scientist, educator, writer, and podcaster, formerly at DataCamp. His main interests are promoting data and AI literacy, helping to spread data skills through organizations and society, and doing amateur stand-up comedy in NYC. If you want to know what he likes to talk about, check out DataFramed, the DataCamp podcast, which he hosted and produced.