rasa: Capitalization throwing off the tensorflow_embedding classifier
Rasa NLU version: 0.12.2
Operating system: Windows 10
Content of model configuration file:
language: "en"
pipeline: "tensorflow_embedding"
Issue:
Capitalization is seriously messing up the intent classification for a model I trained using the new pipeline tensorflow_embedding.
Example (I’m just posting the relevant output from the parser):
'text': 'no'
'intent': {'confidence': 0.9569746255874634, 'name': 'disagree'}
'text': 'No'
'intent': {'confidence': 0.6564008593559265, 'name': 'disagree'}
# See the lower confidence
#----
'text': 'yes'
'intent': {'confidence': 0.9270809888839722, 'name': 'agree'}
'text':'Yes'
'intent': {'confidence': 0.6564008593559265, 'name': 'disagree'}
# It's classifying it completely wrongly.
# (variations like 'yEs', 'yES', and 'YES' also give the exact same confidences as 'Yes')
#----
'text':'hi'
'intent': {'confidence': 0.8774316310882568, 'name': 'greet'}
'text':'Hi'
'intent': {'confidence': 0.6564008593559265, 'name': 'disagree'}
# Again completely wrong!
I have no capital letters in any of my training data utterances.
I have trained another model on the same data with the spacy_sklearn pipeline, and it gives me exactly the same intent confidences, down to the last digit, however I capitalize my input.
About this issue
- State: closed
- Created 6 years ago
- Comments: 16 (15 by maintainers)
This is an interesting usability issue. There are a number of ways you can remedy this, and we should document this better.
The simplest is to pass a preprocessor to the CountVectorFeaturizer which just lowercases everything. Then “Hi” and “hi” get mapped to the same feature.
Another approach is to add nlp_spacy and tokenizer_spacy to the pipeline, because if spaCy is present, we will actually replace each token with its lemma. We didn’t do that by default because then you would still have to load a spaCy model.
Here is my pipeline, and indeed I load spaCy’s language model for tokenization; that could be the reason why I am getting better results with tensorflow. Wasn’t aware of that 👍
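For anyone who wants to see why the lowercasing suggestion above helps, here is a minimal sketch in plain Python. It does not use Rasa's API: `bag_of_words` is a toy stand-in for a count-vector featurizer, and `lowercase_preprocessor` and `parse_case_insensitive` are hypothetical helper names made up for this illustration. The assumption is that the featurizer treats differently cased tokens as different features unless the text is case-folded first.

```python
from collections import Counter


def lowercase_preprocessor(text):
    """Fold case so that 'Hi', 'HI' and 'hi' become the same string."""
    return text.lower()


def bag_of_words(text, preprocessor=None):
    """Toy count-vector featurizer (not Rasa's): whitespace tokens -> counts."""
    if preprocessor is not None:
        text = preprocessor(text)
    return Counter(text.split())


def parse_case_insensitive(parse, text):
    """Lowercase the message before handing it to any parse callable.

    `parse` is assumed to be whatever function turns a message into an
    intent prediction, for example a trained model's parse method.
    """
    return parse(lowercase_preprocessor(text))


# Without a preprocessor, 'Hi' and 'hi' are distinct features, so a model
# trained only on lowercase data has never seen 'Hi'.
print(bag_of_words("Hi") == bag_of_words("hi"))

# With the lowercasing preprocessor, both map to the same feature.
print(
    bag_of_words("Hi", lowercase_preprocessor)
    == bag_of_words("hi", lowercase_preprocessor)
)
```

If changing the pipeline is not an option, wrapping the call as in `parse_case_insensitive` gets a similar effect from the outside, at the cost of also lowercasing text that entity extraction might want in its original case.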
For our work, I switched to tensorflow for the moment because it is giving better results than spaCy’s default model. But it is a narrow-domain chatbot focussed on a fixed set of questions. There are some niche edge cases, but those are usually handled by the art of interrogation and asking the proper question to your user.