Web Scraping Made Easy with Scrapy
This workshop demonstrates how web scraping tasks can be made easy with Scrapy, an open source web scraping framework written in Python. Scrapy lets developers focus on writing web crawlers without having to manage lower-level details such as HTTP request scheduling and concurrency. We will use Scrapy to extract data from toscrape.com, a web scraping sandbox that anyone can use to learn web scraping. Participants will learn web scraping step by step, starting with simple tasks such as extracting data from a single web page and moving on to more complex tasks such as extracting data from AJAX endpoints.
This workshop is intended for individuals with basic programming skills (not necessarily in Python) who understand the basic concepts of HTTP and HTML document structure.