Scrapy project
Repositories
-
scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
-
parsel
Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
-
itemadapter
Common interface for data container classes
-
scrapy-bench
A CLI for benchmarking Scrapy.
-
itemloaders
Library to populate items using XPath and CSS with a convenient API
-
queuelib
Collection of persistent (disk-based) queues
-
scrapyd
A service daemon to run Scrapy spiders
-
quotesbot
This is a sample Scrapy project for educational purposes
-
scrapy-itemloader
[Archived] Library to populate Scrapy items using XPath and CSS with a convenient API
-
protego
A pure-Python robots.txt parser with support for modern conventions.
-
scrapyd-client
Command line client for Scrapyd server
-
scrapely
A pure-Python HTML screen-scraping library
-
loginform
Fill HTML login forms automatically
-
url-chromium
url component from Chromium source code, forked from https://chromium.googlesource.com/chromium/src/url
-
base-chromium
base component forked from Chromium source https://chromium.googlesource.com/chromium/src/base/
-
dirbot
Scrapy project to scrape public web directories (educational) [DEPRECATED]
-
scrapy-bench-speedcenter
Forked from Parth-Vader/scrapy-bench-speedcenter. Codespeed for scrapy-bench
-
pypydispatcher
A fork of http://pydispatcher.sourceforge.net/ with PyPy support
-
gsoc2014-integration-tests
GSoC2014 - Scrapy Integration tests project