Getting data from the web: scraping
Location
Rockefeller Hall 203
Overview
- Define HTML and CSS selectors
- Introduce the rvest package
- Demonstrate how to extract information from HTML pages
- Demonstrate how to extract tables and convert to data frames
- Practice scraping data
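
As a preview of these objectives, here is a minimal sketch of selecting elements with CSS selectors and converting an HTML table to a data frame. The page content, the selectors (#title, p.note, a.more), and the example.com link are all made up for illustration; the HTML is built inline so the code runs without network access. It assumes rvest 1.0 or later.

```r
library(rvest)

# Hypothetical page, written inline so no network access is needed
page <- read_html('
<html><body>
  <h1 id="title">Film ratings</h1>
  <p class="note">Ratings are out of 10.</p>
  <a class="more" href="https://example.com/films">Full list</a>
  <table>
    <tr><th>film</th><th>rating</th></tr>
    <tr><td>Alpha</td><td>7.5</td></tr>
    <tr><td>Beta</td><td>8.1</td></tr>
  </table>
</body></html>')

# "#title" selects by id, "p.note" by tag and class
title <- page %>% html_element("#title") %>% html_text2()
note  <- page %>% html_element("p.note") %>% html_text2()

# Attributes such as href are read with html_attr()
link <- page %>% html_element("a.more") %>% html_attr("href")

# html_table() turns a <table> element into a tibble (a data frame)
films <- page %>% html_element("table") %>% html_table()
```

On a real site, replace the inline string with the page URL in read_html() and find the selectors you need with your browser's inspector or a tool such as SelectorGadget.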
Before class
Class materials
- Web scraping
- rvest
  - Load the library (library(rvest))
  - demo("tripadvisor") - scraping a Trip Advisor page
  - demo("united") - how to scrape a web page which requires a login
- Scraping IMDB
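
For pages that require a login, the general pattern is to find the login form, fill in its fields, and submit it within a session. The sketch below is not the code from the united demo. It is a rough illustration using the form helpers in rvest 1.0 or later, and the form, field names, and credentials are hypothetical. The form is parsed from inline HTML so the fill-in step runs offline; the submission step needs a live site and is left as a comment.

```r
library(rvest)

# Hypothetical login page, written inline for illustration
login_page <- read_html('
<html><body>
  <form id="login" action="/login" method="post">
    <input type="text" name="username">
    <input type="password" name="password">
    <input type="submit" name="submit" value="Log in">
  </form>
</body></html>')

# html_form() returns a list of every form on the page; take the first
form <- html_form(login_page)[[1]]

# Fill in the fields by name (placeholder credentials)
filled <- html_form_set(form, username = "student", password = "not-a-real-password")

# Against a real site, you would start a session and submit the form:
# s <- session("https://example.com/login")   # hypothetical URL
# form <- html_form(s)[[1]]
# filled <- html_form_set(form, username = "...", password = "...")
# s <- session_submit(s, filled)
```

Check a site's terms of service before scraping behind a login, and keep real credentials out of your scripts.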
What you need to do after class

Lecturer in Information Science