nlp
bot
machine-learning
natural-language-processing
bots
botkit
chatbot
bot-framework
nlu
spacy
mitie
chatbots
machine-learning-library
wit
rasa
conversational-agents
conversational-bots
chatbots-framework
conversational-ai
conversation-driven-development
Updated Jul 26, 2021 - Python


Description
Calling tokenizers.create with the model and vocab file built from a custom corpus raises an error, so the BERT vocab file cannot be generated.
Error Message
ValueError: Mismatch vocabulary! All special tokens specified must be control tokens in the sentencepiece vocabulary.
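The message suggests that every special token named in the vocab has to be registered as a control token in the SentencePiece model. Below is a minimal sketch of that kind of check in plain Python, with no gluonnlp or sentencepiece dependency. The function name `find_non_control_specials` and the toy token lists are hypothetical and only illustrate the idea; this is not the library's actual implementation.

```python
def find_non_control_specials(special_tokens, control_pieces):
    """Return the special tokens that are not control pieces, in input order."""
    control = set(control_pieces)
    return [tok for tok in special_tokens if tok not in control]


# Toy data (hypothetical): pieces declared as control symbols when the
# SentencePiece model was trained, and the special tokens the vocab asks for.
control_pieces = ["<s>", "</s>", "[CLS]", "[SEP]"]
special_tokens = ["[CLS]", "[SEP]", "[MASK]"]

# Any token listed here would trigger a mismatch of the kind reported above.
missing = find_non_control_specials(special_tokens, control_pieces)
print(missing)
```

If the resulting list is non-empty, one possible direction is to retrain the SentencePiece model with those tokens declared through its `control_symbols` training option. Whether that applies here depends on how the custom model was trained.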
To Reproduce
from gluonnlp.data import tokenizers
tokenizers.create('spm', model_p