# wikipedia-scraper
Here are 63 public repositories matching this topic...
A 🤖 that provides Wikipedia features such as summaries, title searches, and a location API.
python
heroku
flask
firebase
telegram
telegram-bot
wikipedia
webhook
chatbot
mit-license
telegram-bot-api
pytelegrambotapi
wikipedia-scraper
wikibot
telegram-userbot
rtdb
bot-commands
wikipedia-library
wiki-library
-
Updated
Jun 8, 2022 - Python
Java tool to get Wikipedia data
-
Updated
Feb 5, 2022 - Java
SpaceX Launches 🚀 and Starlink Satellites 🛰
firebase
serverless
wikipedia
nextjs
mustache
google-cloud-platform
spacex
wikipedia-scraper
spacex-launches
starlink
-
Updated
Jul 7, 2022 - JavaScript
Just Refs - extract just the references and related topics from any page on the English Wikipedia
-
Updated
May 18, 2020 - PHP
This project collects Wikipedia articles from a search term entered by the user and formats the data into a .docx (Word) document with images related to each section of the collected article.
api
open-source
automation
robot
google-custom-search
wikipedia
microsoft-word
scraping
wikipedia-api
docx
ibm
google-cloud-platform
ibm-watson
docx-generator
algorithmia
wikipedia-scraper
video-maker
microsoft-word-automation
filipe-deschamps
-
Updated
Jun 22, 2022 - JavaScript
A complete Python text analytics package that allows users to search for a Wikipedia article, scrape it, conduct basic text analytics, and integrate it into a data pipeline without writing excessive code.
wikipedia
wikipedia-api
text-analytics
wikipedia-article
wikipedia-search
wikipedia-corpus
wikipedia-scraper
-
Updated
Jun 22, 2022 - Python
An NLP model I developed to determine the similarity or relation between two documents or Wikipedia articles. Inspired by the cosine similarity algorithm and built from WordNet.
nlp
wikipedia
cosine-similarity
text-analytics
nlp-machine-learning
similarity-score
wikipedia-scraper
nlp-model
nltk-similarity
corpus-similarity
-
Updated
May 19, 2020 - Python
Collects a multimodal dataset of Wikipedia articles and their images
database
wikipedia
wikipedia-api
wikipedia-bot
data-collection
wikipedia-page
wikipedia-viewer
data-processing
multimodality
wikipedia-dump
data-cleaning
wikipedia-search
wikipedia-corpus
wikipedia-scraper
wikipedia-entries
multimodal-learning
multimodal
multimodal-datasets
multimodal-data
multimodal-representation
-
Updated
May 26, 2022 - Python
Taxonomic trees (cladograms) from Wikipedia-scraped data.
-
Updated
Dec 22, 2021 - Python
Linked Data Knowledge Base Population (KBP) framework built on top of Snorkel. The default configuration uses Wikipedia as text corpus and DBpedia as target.
nlp
docker
natural-language-processing
linked-data
information-extraction
weak-supervision
linked-data-quality-assessment
relation-extraction
weakly-supervised-learning
distant-supervision
wikipedia-scraper
knowledge-base-population
knowledge-base-construction
-
Updated
Nov 23, 2019 - Python
Wikipedia Entities Lexicon Extractor
-
Updated
Oct 26, 2017 - Python
Wikipedia Scraper written in PHP
-
Updated
Feb 3, 2018 - PHP
Music tagger with a GUI that parses Wikipedia for information. Can also download album art and lyrics.
gui
pyqt5
music-information-retrieval
python-3
pyside2
console-application
pyinstaller
album-art
music-tagger
wikipedia-scraper
music-tagging
lyrics-fetcher
lyrics-search
-
Updated
Jun 8, 2020 - Python
Extracts geodata from a Wikipedia dump
converter
json
geojson
mapping
wikipedia
conversion
geodata
geotagged-wikipedia-articles
wikipedia-dump
geotagging
wikipedia-scraper
-
Updated
Feb 12, 2020 - Go
Given a topic name, this project finds the most suitable parent topics for it by searching the Wikipedia Category Network (WCN) related to that topic with a statistical approach. It helps you form an autogenerated topic tree for a given topic.
javascript
css
python
html
jquery
php
wikipedia
rest-api
ajax
wikipedia-api
html-css
statistical
wikipedia-scraper
wcn
-
Updated
Feb 10, 2018 - JavaScript
A Python 3 spider bot that scrapes a given Wikipedia URL and stores the information from the article in a text file.
-
Updated
Jul 29, 2020 - Python
Scraping logos of world football clubs from Wikipedia
-
Updated
Apr 3, 2018 - Python
A Wikipedia web scraper that downloads all the text information into a .txt file.
-
Updated
Jun 3, 2019 - Python
A Haskell-powered Twitter bot that posts milestones and statistics of various Wikipedias.
-
Updated
Oct 27, 2020 - Haskell
Wikipedia Article Summarizer: a simple Python project based on NLP techniques
python
nlp
machine-learning
natural-language-processing
jupyter
jupyter-notebook
python3
nltk
summarization
nlp-machine-learning
wikipedia-scraper
nltk-python
article-summarization
-
Updated
May 29, 2022 - Jupyter Notebook
A Python personal assistant that performs various tasks: it searches Google and Wikipedia and reports stock and oil prices.
-
Updated
Sep 16, 2020 - Python
A minimally dependent Wikimedia CLI
-
Updated
Jul 1, 2022 - Python
A Node.js bot that automatically publishes, every day, a molecule of pharmacological interest, along with its structure, information, and a link to Wikipedia.
-
Updated
Jun 25, 2022 - JavaScript
-
Updated
Jul 25, 2021 - Python
A web extension that makes extracting, editing, and exporting Wikipedia references easy!
-
Updated
Dec 1, 2021 - JavaScript
Scraping Wikipedia using the Python wrapper of Wikipedia's MediaWiki API
-
Updated
Sep 15, 2020 - Jupyter Notebook
Scrape soccer data from Wikipedia across various European football leagues and perform interactive data visualizations on it.
python
wikipedia
json-data
numpy
league
python3
scraped-data
european-football-leagues
seasons
interactive-visualizations
wikipedia-scraper
beautifulsoup4
plotly-python
soccer-data
cefpython3
data-visualizations
soccerdata-scraper
-
Updated
Nov 7, 2019 - Python


The Wikipedia API gives each section an ID (numbered from 1 upwards); it would be nice if that were also included.
Example: https://en.wikipedia.org/w/api.php?format=xml&action=parse&prop=sections&page=License
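As a rough illustration of the request above, here is a minimal Python sketch that builds the same `action=parse&prop=sections` query (with `format=json` instead of `xml`, for easier parsing) and reads the section IDs from a response. The helper names `build_sections_url` and `section_ids` are hypothetical, and `SAMPLE` is a hand-written, trimmed stand-in for a response, not real API output. The sketch assumes the JSON response nests sections under `parse.sections` and that each entry carries the ID in an `index` field and the heading text in a `line` field.

```python
import json
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def build_sections_url(page, fmt="json"):
    """Build the parse/sections query URL for a page title (hypothetical helper)."""
    params = {"format": fmt, "action": "parse", "prop": "sections", "page": page}
    return f"{API_URL}?{urlencode(params)}"


def section_ids(response_text):
    """Return (section id, heading) pairs from a JSON response body.

    Assumes each section entry has "index" (the id) and "line" (the heading).
    """
    data = json.loads(response_text)
    return [(s["index"], s["line"]) for s in data["parse"]["sections"]]


# Hand-written stand-in for a response body; the headings are placeholders.
SAMPLE = json.dumps({
    "parse": {
        "title": "License",
        "sections": [
            {"index": "1", "line": "First heading"},
            {"index": "2", "line": "Second heading"},
        ],
    }
})

if __name__ == "__main__":
    print(build_sections_url("License"))
    for sec_id, heading in section_ids(SAMPLE):
        print(sec_id, heading)
```

To run this against the live API, the URL from `build_sections_url` could be fetched with any HTTP client and the response body passed to `section_ids`.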