The Wayback Machine - https://web.archive.org/web/20201111224525/https://github.com/topics/webscraping

webscraping

Here are 2,564 public repositories matching this topic...

aeksco commented Apr 22, 2020

There's a warning note in README.md detailing:

Warning - the AnalyzeDocument process from AWS Textract costs $50 per 1,000 PDF pages. Be careful when deploying this CDK stack as you could unintentionally rack up an expensive AWS bill quickly if you're not paying attention.

This might not be enough - if a user finds this project and doesn't read the documentation, they could inadvertently

DotnetCrawler is a straightforward, lightweight web crawling/scraping library built on .NET Core that writes its output through Entity Framework Core. It is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extensible to fit your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
  • Updated Nov 13, 2019
  • C#
