The Wayback Machine - https://web.archive.org/web/20200803112243/https://github.com/topics/document-similarity
Here are 32 public repositories matching this topic...
Topic Modelling for Humans
Updated
Aug 3, 2020
Python
Compute Sentence Embeddings Fast!
Updated
Mar 2, 2020
Python
Telegram Data Clustering contest solution by Mindful Squirrel
A Clojure library for querying large data-sets on similarity
Updated
Feb 17, 2019
Clojure
Document Search Engine Tool
Updated
Jul 4, 2020
Python
Web application for checking the similarity between a query and a document using cosine similarity.
Updated
Jul 29, 2020
Python
Document Search Engine project with TF-IDF and Google's Universal Sentence Encoder model
Updated
Apr 21, 2020
Jupyter Notebook
Updated
Jun 12, 2020
Jupyter Notebook
Using Jaccard-Similarity and Minhashing to determine similarity between two text documents
Updated
Mar 3, 2018
Jupyter Notebook
Compilation of Natural Language Processing (NLP) code. BONUS: Link to a compilation of Information Retrieval (IR) code. (Check out the README.)
Updated
Jul 15, 2020
Python
Updated
Jun 7, 2019
Python
Web application for similarity between professors and keywords based on word embeddings
Updated
Jan 31, 2019
Jupyter Notebook
Rust-based text search engine from scratch supporting multiple document similarity metrics (TF-IDF, BM25, BM25VA)
Updated
Apr 22, 2020
Rust
A tool that can find any of your documents using semantic search
Updated
May 14, 2020
Python
The Bitnation Jurisdiction Public Notary DApp
Updated
Sep 28, 2018
JavaScript
Code to train an LSI model using PubMed OA medical documents and to use pre-trained PubMed models on your own corpus for document similarity.
Updated
Feb 17, 2019
Python
A program that checks document similarity using the Natural Language Toolkit (NLTK) and cosine similarity.
Updated
Aug 9, 2018
Python
Document ranking with word embeddings
Updated
Jan 2, 2020
Python
My Bachelor Thesis in Computer Science, FER, University of Zagreb
Document search from queries using an inverted index
Updated
Apr 30, 2018
Python
A system for automatic tagging of metadata of theses and dissertations from Bicol University
Updated
Sep 1, 2018
Python
Natural language processing scripts
Updated
May 29, 2018
Jupyter Notebook
Simple document similarity module implemented in Node.js
Updated
Jan 20, 2018
JavaScript
Document similarity using cosine distance, tf-idf, and latent semantic analysis.
Updated
Sep 17, 2017
Python
Index documents in Apache Solr and see similarities in the document's contents.
Updated
Sep 20, 2017
Java
Updated
Sep 14, 2017
Python
Document Similarity with Apache Spark using Locality Sensitive Hashing and Python
Updated
Mar 26, 2020
Jupyter Notebook
Classifying news articles with deep learning to build an automatic newsletter
Updated
Jul 19, 2018
Jupyter Notebook
Telegram Data Clustering Contest (Bossy Gnu)