# multi-modal-learning
Here are 31 public repositories matching this topic...
A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.
-
Updated
Jun 23, 2022
A concise but complete implementation of CLIP with various experimental improvements from recent papers
-
Updated
Jun 23, 2022 - Python
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
nlp
computer-vision
pytorch
chinese
multi-modal-learning
image-text-retrieval
vision-and-language-pre-training
-
Updated
Jul 13, 2022 - Python
[CVPR2020] Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation
deep-learning
cnn
pytorch
multi-modal
image-registration
affine-transformation
stn
image-to-image-translation
multimodal
deformable-transformation
multi-modal-learning
cvpr2020
registartion
multimodal-image-registration
-
Updated
Aug 2, 2020 - Python
PyTorch version of the HyperDenseNet deep neural network for multi-modal image segmentation
deep-learning
pytorch
segmentation
image-segmentation
medical-image-processing
3d-convolutional-network
3d-cnn
pytorch-cnn
medical-image-segmentation
hyperdensenet
multi-modal-imaging
multi-modal-learning
-
Updated
Nov 20, 2019 - Python
This is the official PyTorch implementation of our ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" on the VQA task
visualization
pytorch
transformer
attention
official
multi-modal
clevr
visual-question-answering
vision-and-language
dynamic-network
multi-modality
multi-modal-learning
multi-scale-features
vqav2
iccv2021
local-and-global
-
Updated
Oct 11, 2021 - Python
MMEA: Entity Alignment for Multi-Modal Knowledge Graphs
-
Updated
Jun 4, 2022 - Python
Yi-Min Chou, Yi-Ming Chan, Jia-Hong Lee, Chih-Yi Chiu, Chu-Song Chen, "Unifying and Merging Well-trained Deep Neural Networks for Inference Stage," International Joint Conference on Artificial Intelligence (IJCAI), 2018
deep-neural-networks
tensorflow
multi-task-learning
efficient-inference
cnn-compression
unifying-and-merging-cnn
multi-modal-learning
-
Updated
Jan 30, 2021 - Python
A Python tool to perform deep learning experiments on multimodal remote sensing data.
-
Updated
Jan 23, 2022 - Python
Code repository for Rakuten Data Challenge: Multimodal Product Classification and Retrieval.
-
Updated
May 10, 2021 - Jupyter Notebook
Multi-model analysis of sentiment and emotion in multi-speaker conversations.
deep-learning
sentiment-classification
emotion-recognition
graph-neural-networks
multi-modal-learning
-
Updated
Dec 18, 2019 - Jupyter Notebook
Code for the paper "Weakly supervised segmentation with cross-modality equivariant constraints", available at https://arxiv.org/pdf/2104.02488.pdf
deep-learning
grad-cam
medical-imaging
self-learning
weakly-supervised-learning
class-activation-maps
brats
class-activation-map
multi-modal-imaging
multi-modal-learning
weakly-supervised-segmentation
brats2019
grad-cam-visualization
transformation-equivariance
-
Updated
Feb 4, 2022 - Python
PyTorch code for the paper "Complementarity is the King: Multi-modal and Multi-grained Hierarchical Semantic Enhancement Network for Cross-modal Retrieval".
-
Updated
Apr 23, 2022 - Python
-
Updated
Dec 18, 2021 - Jupyter Notebook
Multi-modal action recognition from skeleton sequences, inertial measurements, motion capture data, and Wi-Fi CSI fingerprints.
computer-vision
sensor-fusion
action-recognition
skeleton-based-action-recognition
multi-modal-learning
-
Updated
Apr 1, 2021 - Python
This repository shows how to implement a basic model for multimodal entailment.
-
Updated
Aug 17, 2021 - Jupyter Notebook
DramaQA Starter Code (2021)
-
Updated
Aug 3, 2021 - Python
SAM-SLR-v2 is an improved version of SAM-SLR for sign language recognition.
graph-convolutional-networks
sign-language-recognition-system
sign-language-recognition
multi-modal-learning
-
Updated
Oct 20, 2021 - Python
M3TR: Multi-modal Multi-label Recognition with Transformer. ACM MM 2021
-
Updated
Oct 27, 2021 - Python
A curated list of vision-and-language pre-training (VLP). :-)
-
Updated
Jul 6, 2022
A curated list of papers and experiments in the field of Natural Language Processing (NLP)
nlp
text-classification
natural-language
word-embeddings
multi-lingual
text-generation
transfer-learning
bert
cross-lingual
sentence-embeddings
cross-lingual-embeddings
multi-modal-learning
bert-embedding
language-representations
contextual-representations
-
Updated
Dec 1, 2021
PyTorch implementation of the paper: All For One: Multi-modal Multi-Task Learning
deep-learning
sentiment-classification
multi-task-learning
visual-question-answering
vision-and-language
multi-modal-learning
-
Updated
Jul 17, 2020 - Python
PyTorch implementation of "Multi-domain translation between single-cell imaging and sequencing data using autoencoders" (https://www.nature.com/articles/s41467-020-20249-2) with custom models.
multi-domain
single-cell
multi-modal
single-cell-rna-seq
shared-embedding
multi-view-learning
single-cell-omics
multi-view
data-alignment
multi-modal-learning
multi-domain-adaptation
-
Updated
Oct 13, 2021 - Python
This is the code for our ICCV'19 paper on cross-modal learning and retrieval.
retrieval
tensorflow
semantic-similarity
iccv
scene-understanding
retrieval-systems
caption-retreival
multi-modal-learning
cross-modal-learning
-
Updated
Jun 24, 2020
Code for the paper: M. Karami, D. Schuurmans, "Deep Probabilistic Canonical Correlation Analysis," AAAI 2021
deep-learning
deep
dnn
generative-model
vae
canonical-correlation-analysis
multi-view-learning
multi-modal-learning
-
Updated
Mar 29, 2022 - Python
Official Repo for "To Find Waldo You Need Contextual Cues: Debiasing Who’s Waldo", ACL 2022 (Short)
-
Updated
May 23, 2022
A blog for Project Aligned.
sentiment-analysis
emotion-analytics
emotion-analysis
affective-computing
emotion-recognition
multi-modal-learning
-
Updated
Mar 2, 2021 - HTML
Japanese CLIP by rinna Co., Ltd.
-
Updated
May 11, 2022 - Python
Compute contextual lexical embeddings for textual summaries, using the CLIP v1.0 transformer model.
python
nlp
neural-network
pytorch
nlp-machine-learning
pyenv
virtual-environment
embedding-vectors
multi-modal-learning
embedding-evaluation
-
Updated
Jun 30, 2022 - Jupyter Notebook


I've been chatting with some others who are interested in training CLIP for domain-specific tasks. They expressed interest in a simple way to use a pre-trained text transformer.
Some basic support for Hugging Face models, or for generic classes of transformers, shouldn't be too big an extension of what is already fleshed out.
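
To make the request concrete, here is a minimal sketch of what "generic" support could look like. It is not this repository's API. It assumes a hypothetical `TextEncoder` interface (any callable that maps a batch of strings to embedding vectors) and feeds its output into the symmetric contrastive loss that CLIP-style training uses. `toy_text_encoder` and `clip_style_loss` are illustrative names, and the toy encoder is only a stand-in for a real pre-trained transformer. The sketch uses only the Python standard library so the idea is visible without any framework.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
# Hypothetical interface: anything that turns a batch of strings into embeddings.
TextEncoder = Callable[[Sequence[str]], List[Vector]]


def l2_normalize(v: Vector) -> Vector:
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def cross_entropy_diagonal(logits: List[List[float]]) -> float:
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def clip_style_loss(text_emb: List[Vector], image_emb: List[Vector],
                    temperature: float = 0.07) -> float:
    """Symmetric contrastive loss over paired text and image embeddings."""
    t = [l2_normalize(v) for v in text_emb]
    im = [l2_normalize(v) for v in image_emb]
    # Cosine similarity matrix (rows: texts, columns: images), scaled by temperature.
    logits = [[sum(a * b for a, b in zip(tv, iv)) / temperature for iv in im]
              for tv in t]
    logits_t = [list(col) for col in zip(*logits)]  # image-to-text direction
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))


def toy_text_encoder(texts: Sequence[str]) -> List[Vector]:
    """Stand-in for a pre-trained transformer: crude character-count features."""
    return [[float(len(s)), float(s.count("a")), 1.0] for s in texts]


# Any object satisfying TextEncoder can be swapped in here.
encoder: TextEncoder = toy_text_encoder
text_embeddings = encoder(["a cat on a mat", "dog"])
image_embeddings = [[14.0, 4.0, 1.0], [3.0, 0.0, 1.0]]  # made-up image features
loss = clip_style_loss(text_embeddings, image_embeddings)
```

The design point is that the loss only sees embeddings, so the text side can be any encoder that produces vectors of the same dimension as the image side. In practice a projection layer is usually needed to match those dimensions.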