The Wayback Machine - https://web.archive.org/web/20210730171638/https://github.com/topics/simhash
Here are 43 public repositories matching this topic...
Selected Machine Learning algorithms for natural language processing and semantic analysis in Golang
Interesting (non-cryptographic) hashes implemented in pure Python.
Updated Mar 8, 2018 · Python
A simple implementation of the simhash algorithm in Java.
Updated Oct 10, 2020 · Java
Removes the most frequent words (stop words) from text content. Based on a curated list of language statistics.
Simhash implementation in JavaScript
Updated Jun 29, 2017 · JavaScript
Text similarity via simhash
A simhasher for Chinese documents implemented in Golang, simply translated from yanyiwu/gosimhash
Elixir SimHash NIFs written in Rust
Updated Jan 20, 2021 · Elixir
semantic-sh is a SimHash implementation that detects and groups similar texts by leveraging word vectors and transformer-based language models (BERT).
Updated Sep 19, 2020 · Python
SuperMinHash: A New Minwise Hashing Algorithm for Jaccard Similarity Estimation, Simhash and SimhashIndex
Updated Nov 26, 2018 · Python
Open Source Implementation of Simhash in Python
Updated Sep 14, 2017 · Python
Updated Apr 25, 2020 · Python
A rewrite of Bookmate's simhash gem, which is an implementation of Moses Charikar's simhashes in Ruby.
Updated Oct 16, 2018 · Ruby
Remove duplicate documents/videos/images via popular algorithms such as SimHash, SpotSig, Shingling, etc.
Updated Aug 21, 2019 · Python
A fast Python implementation of the SimHash algorithm.
Updated Mar 9, 2021 · Python
A library for cosine similarity & simhash calculation
Updated Jul 21, 2021 · Elixir
This system evaluates a collection of mementos (archived web pages) to determine which are off topic. The collection can be part of an Archive-It collection, a single TimeMap, or stored in a WARC file.
Updated Jun 14, 2021 · Python
Find duplicate text files.
Updated Oct 14, 2020 · Python
Simhash implementation in JavaScript
Updated Jun 30, 2017 · JavaScript
Updated May 8, 2021 · JavaScript
⌨️ User Verification based on Keystroke Dynamics / Two-factor Authentication technology based on Key-Stroke
Updated Mar 10, 2021 · Python
A Golang implementation of the Simhash algorithm
Code plagiarism system based on Simhash and Nicad.
Updated Dec 22, 2018 · Python
Updated Sep 19, 2018 · Rust
A barebones implementation of the simhash data sketching algorithm.
Simhash algorithm using Jcseg for word segmentation and jenkins-hash for hashing. Written in Scala.
Updated Feb 10, 2017 · Scala
College project (Analysis of massive data sets) - C# implementation of big data algorithms (2017/2018)
A Research Project Thumbnail Visualization to summarize the webpage changes over time
Updated Dec 20, 2018 · JavaScript