rasa: Any intent without an extracted entity fails to be classified correctly after upgrading to 1.8.1
Rasa version: 1.8.1
Rasa SDK version (if used & relevant): 1.8.1
Python version: 3.7
Operating system (windows, osx, …): Ubuntu
Issue:
I am trying to upgrade Rasa from 1.6 to 1.8 while keeping the configuration as similar as possible. I’m trying to use the EmbeddingIntentClassifier.
All intents that do not have any entity extracted have extremely low confidence scores and most fail to be classified correctly. It even happens when the text is exactly the same as some of the training examples. In many cases, these intents don’t even appear in the intent_ranking.
With the previous version, accuracy for these intents was 80% or higher.
I do not know whether this is a bug in the spaCy featurization or whether I need to change something else in the configuration.
Test metrics are all 0.0:
"interested": {
  "precision": 0.0,
  "recall": 0.0,
  "f1-score": 0.0,
  "support": 18,
  "confused_with": {
    "faq_apartment_available": 6,
    "ask_viewing": 3,
    "answer_phone_number": 3
  }
}
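To find every affected intent at once, the per-intent report can be scanned for zero F1 scores. The following is only a minimal sketch: it assumes the report has been loaded into a dict shaped like the excerpt above, the function name `zero_f1_intents` is hypothetical, and only the `interested` entry comes from this issue.

```python
from typing import Dict, List, Tuple

# Report excerpt copied from this issue; a full report would hold one entry per intent.
report = {
    "interested": {
        "precision": 0.0,
        "recall": 0.0,
        "f1-score": 0.0,
        "support": 18,
        "confused_with": {
            "faq_apartment_available": 6,
            "ask_viewing": 3,
            "answer_phone_number": 3,
        },
    }
}


def zero_f1_intents(report: Dict[str, dict]) -> List[Tuple[str, int, str]]:
    """Return (intent, support, most frequent confusion) for intents with an F1 of 0."""
    rows = []
    for intent, metrics in report.items():
        # Skip entries that are not metric dicts (assumption: a report may contain such values).
        if not isinstance(metrics, dict) or "f1-score" not in metrics:
            continue
        if metrics["f1-score"] == 0.0:
            confused = metrics.get("confused_with", {})
            # Pick the intent this one was most often mistaken for, if any were recorded.
            top = max(confused, key=confused.get) if confused else ""
            rows.append((intent, metrics["support"], top))
    return rows


for intent, support, top in zero_f1_intents(report):
    print(f"{intent}: support={support}, most confused with {top}")
```

Sorting the result by support would put the intents with the most test examples first.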
Sample intent error
{
  "text": "I'm super interested",
  "intent": "interested",
  "intent_prediction": {
    "name": "answer_schedule",
    "confidence": 0.13512466847896576
  }
}
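To see how widespread the low-confidence misclassifications are, error records like the sample above can be grouped by their true intent. This is a rough sketch under stated assumptions: the records are assumed to be available as a list of dicts in the shape shown above, `summarize_errors` is a hypothetical helper, and the only record used here is the one from this issue.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# Single error record copied from this issue; a real run would produce many of these.
errors = [
    {
        "text": "I'm super interested",
        "intent": "interested",
        "intent_prediction": {
            "name": "answer_schedule",
            "confidence": 0.13512466847896576,
        },
    }
]


def summarize_errors(errors: List[dict]) -> Dict[str, Tuple[int, float]]:
    """Map each true intent to (number of errors, mean confidence of the wrong prediction)."""
    confidences = defaultdict(list)
    for record in errors:
        confidences[record["intent"]].append(record["intent_prediction"]["confidence"])
    return {intent: (len(values), mean(values)) for intent, values in confidences.items()}


for intent, (count, avg_conf) in summarize_errors(errors).items():
    print(f"{intent}: {count} error(s), mean confidence {avg_conf:.3f}")
```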
Error (including full traceback):
rasa shell nlu
NLU model loaded. Type a message and press enter to parse it.
Next message:
interested
{
  "intent": {
    "name": "faq_apartment_location",
    "confidence": 0.2963590621948242
  },
  "entities": [],
  "intent_ranking": [
    {
      "name": "faq_apartment_location",
      "confidence": 0.2963590621948242
    },
    {
      "name": "faq_apartment_available",
      "confidence": 0.19111353158950806
    },
    {
      "name": "ask_viewing",
      "confidence": 0.09869010746479034
    },
    {
      "name": "faq_apartment_move_in_date",
      "confidence": 0.0869259387254715
    },
    {
      "name": "faq_generic_identity",
      "confidence": 0.06372860819101334
    },
    {
      "name": "no_preferences",
      "confidence": 0.0610114187002182
    },
    {
      "name": "ask_search_apartments",
      "confidence": 0.05694292113184929
    },
    {
      "name": "answer_phone_number",
      "confidence": 0.05568307265639305
    },
    {
      "name": "interested",
      "confidence": 0.04589391499757767
    },
    {
      "name": "ask_talk_to",
      "confidence": 0.04365147277712822
    }
  ],
  "text": "interested"
}
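To quantify how far down the expected intent lands, its position in `intent_ranking` can be looked up directly. The sketch below assumes the parse result above is available as a Python dict. `rank_of` is a hypothetical helper, not part of Rasa, and it returns `None` when the intent is missing from the ranking, which the report says happens in many cases.

```python
from typing import Optional

# Names and confidences copied from the parse result above.
parse_result = {
    "intent": {"name": "faq_apartment_location", "confidence": 0.2963590621948242},
    "entities": [],
    "intent_ranking": [
        {"name": "faq_apartment_location", "confidence": 0.2963590621948242},
        {"name": "faq_apartment_available", "confidence": 0.19111353158950806},
        {"name": "ask_viewing", "confidence": 0.09869010746479034},
        {"name": "faq_apartment_move_in_date", "confidence": 0.0869259387254715},
        {"name": "faq_generic_identity", "confidence": 0.06372860819101334},
        {"name": "no_preferences", "confidence": 0.0610114187002182},
        {"name": "ask_search_apartments", "confidence": 0.05694292113184929},
        {"name": "answer_phone_number", "confidence": 0.05568307265639305},
        {"name": "interested", "confidence": 0.04589391499757767},
        {"name": "ask_talk_to", "confidence": 0.04365147277712822},
    ],
    "text": "interested",
}


def rank_of(parse_result: dict, expected: str) -> Optional[int]:
    """Return the 1-based position of `expected` in the ranking, or None if it is absent."""
    for position, candidate in enumerate(parse_result["intent_ranking"], start=1):
        if candidate["name"] == expected:
            return position
    return None


print(rank_of(parse_result, "interested"))  # prints 9
```

Running the same check over every entity-free training example would show whether the expected intent is consistently ranked low or dropped from the ranking altogether.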
Content of configuration file (config.yml) (if relevant): I tried several different ones. This is the most basic one that still fails, for demonstration purposes.
language: "en"
pipeline:
  - name: "DucklingHTTPExtractor"
    url: "http://*************"
    dimensions: ["time", "duration", "amount-of-money", "number", "email", "phone-number", "ordinal", "url"]
    timezone: "America/New_York"
  - name: "SpacyNLP"
    case_sensitive: true
  - name: "SpacyTokenizer"
  - name: "SpacyEntityExtractor"
    dimensions: ["PERSON", "MONEY"]
  - name: "RegexFeaturizer"
  - name: "SpacyFeaturizer"
  - name: "EntitySynonymMapper"
  - name: "EmbeddingIntentClassifier"
Content of domain file (domain.yml) (if relevant): All of the intents are in there
intents:
  - affirm
  - deny
  - reset:
      triggers: action_reset_full
  ...
About this issue
- State: closed
- Created 4 years ago
- Comments: 22 (22 by maintainers)
I think I found what we changed that made it worse. I'll prepare a PR soon to fix it, and we'll make it part of the 1.9 release. Meanwhile, I would recommend trying:
instead of EmbeddingIntentClassifier.
It is a sequential model, therefore it takes longer to train.
That's right, not the best practice ever but well ^^
Could you please send it to support@rasa.com with the link to this issue? Thank you.