information-retrieval
Here are 1,034 public repositories matching this topic...
-
Updated
Jun 10, 2020 - Java
I'm submitting a ... (check one with "x")
[x] bug report
[ ] new distro request
If I try to change the default colors with the -c option, bold is removed from labels:
I didn't find any option t
-
Updated
Jun 1, 2020 - Python
I would like to process a corpus of documents with the TFIDF model. My corpus is a single txt file in which each line is a document. That works as input for the other models in pke, but for TFIDF I need a document frequency matrix. The pke utilities can generate one, but they accept only an input_dir in which each file is a document. It would be convenient to have an option to supply the documents as one file, as for all the other models.
Th
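A minimal sketch of the one-document-per-line idea described above, written in plain Python rather than against pke's API. The function name `document_frequencies` and the whitespace tokenizer are assumptions made for illustration; a real pipeline would use the same preprocessing as the model that consumes the counts.

```python
from collections import Counter
import io


def document_frequencies(lines):
    """Count, for each token, how many documents (lines) contain it.

    `lines` is any iterable of strings, for example an open text file
    in which every line is one document. Hypothetical helper, not part
    of pke.
    """
    df = Counter()
    n_docs = 0
    for line in lines:
        # A set ensures each token is counted at most once per document.
        tokens = set(line.lower().split())
        if not tokens:
            continue  # skip blank lines
        n_docs += 1
        df.update(tokens)
    return n_docs, df


# An in-memory stand-in for a corpus file with one document per line.
corpus = io.StringIO("the cat sat\nthe dog barked\n\na cat and a dog\n")
n_docs, df = document_frequencies(corpus)
```

Because the function only iterates over lines, the same call works on `open("corpus.txt", encoding="utf-8")` without loading the whole file into memory.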
-
Updated
Aug 19, 2019 - Python
-
Updated
Jun 5, 2017 - Python
Hi.
This is not an issue, but maybe an enhancement request.
I'm trying to use anserini for a custom collection, and it's rather hard for me to figure out how to build a pipeline from scratch.
For example, although there are a lot of scripts showing how to reproduce particular results, I couldn't find any information about the document format that anserini takes as input to index a custom collect
prediction should include a hyperlink to the answer. Clicking the answer in the UI will open the PDF page where the answer/paragraph
-
Updated
Dec 24, 2018 - Python
Given recent feedback from HN, we should look at improving how we explain PISA, and perhaps offer some benchmarks against common systems like Lucene and Tantivy.
We also should document some things such as:
- Use cases
- Assumptions (in memory)
- Target audience and why you would want to use it
- Limitations
- Algorithms implemented (in terms of the basics, ie top-k search, Boolean matching
-
Updated
Jun 10, 2020
-
Updated
Feb 21, 2020 - Elixir
Issue Description
It would be cool to override the config file as a whole on the command line, so that lots of options could be updated in one place.
How to reproduce it
Environment and Version Information
All environments.
External links for reference
Contributing
I'll fix this.
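One way the request above could look, sketched with the standard library only. The `--config` flag, the `DEFAULTS` keys, and the `load_config` helper are all hypothetical names chosen for illustration; they are not taken from the project the issue belongs to.

```python
import argparse
import json

# Hypothetical built-in defaults; a real tool would define its own keys.
DEFAULTS = {"index_dir": "index", "top_k": 10, "verbose": False}


def load_config(argv, defaults=DEFAULTS):
    """Return the defaults, overridden by a JSON file given via --config."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--config",
        help="path to a JSON file whose keys override the defaults",
    )
    args = parser.parse_args(argv)

    config = dict(defaults)  # copy so the defaults stay untouched
    if args.config:
        with open(args.config, encoding="utf-8") as fh:
            config.update(json.load(fh))
    return config
```

Calling `load_config(["--config", "my.json"])` would then replace many options in one place, while `load_config([])` returns the defaults unchanged.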
-
Updated
Jan 12, 2020 - Python
-
Updated
May 25, 2020 - Python
"Trinity Search" seems to be saturated:
https://www.google.com/search?q=trinity+search&ie=utf-8&oe=utf-8
I saw Mark's comment in a Hacker News post and had to google "mark papadakis" to find this repo.
-
Updated
May 27, 2020 - C++
-
Updated
May 8, 2018 - Java
The current Docker image is far too large at 2.55 GB. Reduce it to under 1 GB. Apply the changes from this reference: https://hackernoon.com/tips-to-reduce-docker-image-sizes-876095da3b34
-
Updated
Jan 5, 2019 - Python
Example (from TfidfTransformer)
This method expects a list of tuples, instead of an iterable. This means that the entire corpus has to be stored as a lis