The Wayback Machine - https://web.archive.org/web/20220725172608/https://github.com/topics/wikipedia-scraper

wikipedia-scraper

Here are 63 public repositories matching this topic...

martin-majlis / Wikipedia-API

Open

Section ids missing

1

kittenswolf commented Apr 23, 2018

The Wikipedia API assigns each section an id (from 1 upwards); it would be nice if that were also included.

Example: https://en.wikipedia.org/w/api.php?format=xml&action=parse&prop=sections&page=License
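
As a rough illustration of the request above, the sketch below builds the same kind of `action=parse&prop=sections` query (using `format=json` instead of `format=xml`) and pulls the section ids out of a response. The helper names `sections_url` and `section_ids` and the sample response are hypothetical and hand-written for this example; they are not part of Wikipedia-API, and no network request is made.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def sections_url(page: str) -> str:
    """Build a MediaWiki parse query that lists the sections of a page."""
    params = {
        "format": "json",
        "action": "parse",
        "prop": "sections",
        "page": page,
    }
    return f"{API_URL}?{urlencode(params)}"


def section_ids(response: dict) -> list[tuple[str, str]]:
    """Return (index, heading) pairs from a decoded JSON parse response."""
    return [(s["index"], s["line"]) for s in response["parse"]["sections"]]


# Hand-written sample shaped like a prop=sections response (headings invented).
sample_response = {
    "parse": {
        "title": "License",
        "sections": [
            {"toclevel": 1, "level": "2", "line": "First heading",
             "number": "1", "index": "1"},
            {"toclevel": 1, "level": "2", "line": "Second heading",
             "number": "2", "index": "2"},
        ],
    }
}

print(sections_url("License"))
print(section_ids(sample_response))
```

In real use, the URL would be fetched with any HTTP client and the decoded JSON passed to `section_ids`.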

good first issue

Open

Way to get plain Wikitext of page?

2
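
One possible answer to the question in this issue title, independent of the Wikipedia-API library itself, is to ask the MediaWiki parse endpoint for `prop=wikitext`. The sketch below only builds the query URL and extracts the text from a decoded response; `wikitext_url`, `extract_wikitext`, and the sample response are hypothetical names and data made up for illustration, and `formatversion=2` is assumed so that the wikitext arrives as a plain string.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def wikitext_url(page: str) -> str:
    """Build a MediaWiki parse query that returns a page's raw wikitext."""
    params = {
        "format": "json",
        "formatversion": "2",  # assumed: wikitext is returned as a plain string
        "action": "parse",
        "prop": "wikitext",
        "page": page,
    }
    return f"{API_URL}?{urlencode(params)}"


def extract_wikitext(response: dict) -> str:
    """Pull the wikitext string out of a decoded JSON parse response."""
    return response["parse"]["wikitext"]


# Hand-written sample shaped like a formatversion=2 response (content invented).
sample_response = {
    "parse": {"title": "License", "wikitext": "'''License''' sample text."}
}

print(wikitext_url("License"))
print(extract_wikitext(sample_response))
```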

themagicalmammal / wikibot

A 🤖 that provides Wikipedia features such as summaries, title searches, and a location API.

python heroku flask firebase telegram telegram-bot wikipedia webhook chatbot mit-license telegram-bot-api pytelegrambotapi wikipedia-scraper wikibot telegram-userbot rtdb bot-commands wikipedia-library wiki-library

Updated Jun 8, 2022
Python

viralvaghela / Jwiki

Java tool to get Wikipedia data

java wikipedia wikipedia-api data-gathering wikipedia-scraper javatool javawikipeda

Updated Feb 5, 2022
Java

moesalih / spacex.moesalih.com

SpaceX Launches 🚀 and Starlink Satellites 🛰

firebase serverless wikipedia nextjs mustache google-cloud-platform spacex wikipedia-scraper spacex-launches starlink

Updated Jul 7, 2022
JavaScript

attogram / justrefs

Just Refs - extract just the references and related topics from any page on the English Wikipedia

wikipedia information-extraction wikipedia-api data-extraction wikipedia-viewer wikipedia-scraper

Updated May 18, 2020
PHP

ThiagoNelsi / wikipedia-to-document

This project collects Wikipedia articles matching a search term entered by the user and formats the data into a .docx (Word) document, with images related to each section of the collected article.

api open-source automation robot google-custom-search wikipedia microsoft-word scraping wikipedia-api docx ibm google-cloud-platform ibm-watson docx-generator algorithmia wikipedia-scraper video-maker microsoft-word-automation filipe-deschamps

Updated Jun 22, 2022
JavaScript

kohjiaxuan / Wikipedia-Article-Scraper

A complete Python text analytics package that lets users search for a Wikipedia article, scrape it, run basic text analytics, and integrate it into a data pipeline without writing excessive code.

wikipedia wikipedia-api text-analytics wikipedia-article wikipedia-search wikipedia-corpus wikipedia-scraper

Updated Jun 22, 2022
Python

kohjiaxuan / NLP-Model-for-Corpus-Similarity

An NLP model I developed to determine the similarity or relation between two documents/Wikipedia articles. Inspired by the cosine similarity algorithm and built from WordNet.

nlp wikipedia cosine-similarity text-analytics nlp-machine-learning similarity-score wikipedia-scraper nlp-model nltk-similarity corpus-similarity

Updated May 19, 2020
Python

OlehOnyshchak / pyWikiMM

Collects a multimodal dataset of Wikipedia articles and their images

database wikipedia wikipedia-api wikipedia-bot data-collection wikipedia-page wikipedia-viewer data-processing multimodality wikipedia-dump data-cleaning wikipedia-search wikipedia-corpus wikipedia-scraper wikipedia-entries multimodal-learning multimodal multimodal-datasets multimodal-data multimodal-representation

Updated May 26, 2022
Python

shanedrabing / taxopedia

Taxonomic trees (cladograms) from Wikipedia-scraped data.

taxonomy wikipedia phylogenetic-trees phylogenetics wikipedia-scraper cladogram taxonomic-trees

Updated Dec 22, 2021
Python

lorenzoranucci / sentimantic

Linked Data Knowledge Base Population (KBP) framework built on top of Snorkel. The default configuration uses Wikipedia as text corpus and DBpedia as target.

nlp docker natural-language-processing linked-data information-extraction weak-supervision linked-data-quality-assessment relation-extraction weakly-supervised-learning distant-supervision wikipedia-scraper knowledge-base-population knowledge-base-construction

Updated Nov 23, 2019
Python

mynlp / wikilex

Wikipedia Entities Lexicon Extractor

disambiguation lexicon entity-extraction wikipedia-database wikipedia-scraper

Updated Oct 26, 2017
Python

ammarfaizi2 / wikipedia_scraper

Wikipedia Scraper written in PHP

curl wikipedia grabber wikipedia-bot php-curl wikipedia-scraper scraperwiki scarpe grabbing-content

Updated Feb 3, 2018
PHP

marian-code / wikipedia-music-tags

Music tagger with a GUI that parses Wikipedia for information. Can also download album art and lyrics.

gui pyqt5 music-information-retrieval python-3 pyside2 console-application pyinstaller album-art music-tagger wikipedia-scraper music-tagging lyrics-fetcher lyrics-search

Updated Jun 8, 2020
Python

donomii / wikipedia2geojson

Extracts geodata from a Wikipedia dump

converter json geojson mapping wikipedia conversion geodata geotagged-wikipedia-articles wikipedia-dump geotagging wikipedia-scraper

Updated Feb 12, 2020
Go

orange-soda / scrapy-wikipedia

A Python implementation for scraping historical events from Chinese Wikipedia, with export to PDF via LaTeX

python wikipedia-scraper

Updated Sep 23, 2018
TeX

Goutam1511 / WikiFinder

Given a topic name, this project finds the most suitable parent topics by searching the Wikipedia Category Network (WCN) related to that topic using a statistical approach. It helps you form an autogenerated topic tree for a given topic.

javascript css python html jquery php wikipedia rest-api ajax wikipedia-api html-css statistical wikipedia-scraper wcn

Updated Feb 10, 2018
JavaScript

janus-tg / Araneae

Python 3 spider bot that scrapes a given Wikipedia URL and stores the information from the article in a text file.

python3 webcrawler wikipedia-scraper beautifulsoup4 requests-module

Updated Jul 29, 2020
Python

milosmladenovic5 / football_clubs_logo_scraper

Scraping logos of world football clubs from Wikipedia

web-scraping beautifulsoup python-web-crawler wikipedia-scraper

Updated Apr 3, 2018
Python

Harsh-2909 / Wikipedia-Web-Scraper

A Wikipedia web scraper that downloads all the text information to a .txt file.

python wikipedia webscraper python3 beautifulsoup webscraping wikipedia-scraper beautifulsoup4

Updated Jun 3, 2019
Python

doersino / wikipediastats

A Haskell-powered Twitter bot that posts milestones and statistics of various Wikipedias.

twitter-bot haskell mediawiki wikipedia wikipedia-scraper

Updated Oct 27, 2020
Haskell

emreYbs / Wikipedia-Article-Summarizer

Wikipedia Article Summarizer, a simple Python project based on NLP techniques

python nlp machine-learning natural-language-processing jupyter jupyter-notebook python3 nltk summarization nlp-machine-learning wikipedia-scraper nltk-python article-summarization

Updated May 29, 2022
Jupyter Notebook

soulxhacker / Wikipedia-Scraper-Bot

A Wikipedia scraper bot made in Python.

scraper wikipedia-scraper

Updated May 25, 2017
Python

fredysomy / Python-Personal-Assistant

Python code for a personal assistant that performs various tasks: it searches Google and Wikipedia and reports stock prices and oil prices.

time python3 requests bs4 stock-prices wikipedia-scraper oil-price

Updated Sep 16, 2020
Python

g3ner1c / wikimedia-cli

A minimally dependent Wikimedia CLI

python cli wikipedia wikimedia wikimedia-api wikipedia-scraper wikipedia-cli

Updated Jul 1, 2022
Python

hellmrf / Tweemol

A Node.js bot that automatically publishes, every day, a molecule of pharmacological interest, along with its structure, information, and a link to Wikipedia.

nodejs javascript twitter-bot wikipedia wikipedia-scraper algorithmia-api custom-search-api

Updated Jun 25, 2022
JavaScript

lucasribolli / searchengine

flask elasticsearch vuejs etl seo scrapy wikipedia-scraper

Updated Jul 25, 2021
Python

zaataylor / wikiref

A web extension that makes extracting, editing, and exporting Wikipedia references easy!

json wikipedia extensions firefox-webextension wikipedia-scraper

Updated Dec 1, 2021
JavaScript

GeorgeDavila / WikipediaScrapingWikiAPI

Scraping Wikipedia using the Python wrapper for Wikipedia's MediaWiki API

nlp scraper wikipedia wikipedia-api nlp-machine-learning wikipedia-scraper

Updated Sep 15, 2020
Jupyter Notebook

zz-xx / soccerdata-scraper

Scrape soccer data from Wikipedia across various European football leagues and perform interactive data visualizations on it.

python wikipedia json-data numpy league python3 scraped-data european-football-leagues seasons interactive-visualizations wikipedia-scraper beautifulsoup4 plotly-python soccer-data cefpython3 data-visualizations soccerdata-scraper

Updated Nov 7, 2019
Python
