rasa: Any intent without an extracted entity fails to be classified correctly after upgrading to 1.8.1

Rasa version: 1.8.1

Rasa SDK version (if used & relevant): 1.8.1

Python version: 3.7

Operating system (windows, osx, …): Ubuntu

Issue: I am trying to upgrade Rasa from 1.6 to 1.8 while keeping the configuration as similar as possible. I’m trying to use the EmbeddingIntentClassifier.

All intents that do not have any entity extracted have extremely low confidence scores and most fail to be classified correctly. It even happens when the text is exactly the same as some of the training examples. In many cases, these intents don’t even appear in the intent_ranking.

With the previous version, I would get accuracies for these intents of 80% or more.

I do not know whether this is a bug in the spaCy featurization or whether I need to change something else in the configuration.

Test metrics are all 0.0:

"interested": {
  "precision": 0.0,
  "recall": 0.0,
  "f1-score": 0.0,
  "support": 18,
  "confused_with": {
    "faq_apartment_available": 6,
    "ask_viewing": 3,
    "answer_phone_number": 3
  }
}
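
To see how many intents are affected in the same way, the test report can be scanned for entries with an F1 score of 0.0. This is only a rough sketch: it assumes the report is a JSON object that maps each intent name to a metrics dictionary shaped like the excerpt above, and `zero_f1_intents` is a hypothetical helper name, not part of Rasa.

```python
import json


def zero_f1_intents(report):
    """Return (intent, support) pairs whose f1-score is 0.0, largest support first."""
    rows = [
        (name, metrics.get("support", 0))
        for name, metrics in report.items()
        # Assumption: anything that is not a dict is an aggregate value, so skip it.
        if isinstance(metrics, dict) and metrics.get("f1-score") == 0.0
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# The single entry quoted above, wrapped in an object so it parses as JSON.
report = json.loads("""
{
  "interested": {
    "precision": 0.0,
    "recall": 0.0,
    "f1-score": 0.0,
    "support": 18,
    "confused_with": {
      "faq_apartment_available": 6,
      "ask_viewing": 3,
      "answer_phone_number": 3
    }
  }
}
""")

print(zero_f1_intents(report))  # [('interested', 18)]
```

Running this over the full report instead of the one-entry excerpt would show whether the zero scores line up with the intents that have no entities.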

Sample intent error

{
  "text": "I'm super interested",
  "intent": "interested",
  "intent_prediction": {
    "name": "answer_schedule",
    "confidence": 0.13512466847896576
  }
}

Error (including full traceback):

rasa shell nlu
NLU model loaded. Type a message and press enter to parse it.
Next message:
interested
{
  "intent": {
    "name": "faq_apartment_location",
    "confidence": 0.2963590621948242
  },
  "entities": [],
  "intent_ranking": [
    {
      "name": "faq_apartment_location",
      "confidence": 0.2963590621948242
    },
    {
      "name": "faq_apartment_available",
      "confidence": 0.19111353158950806
    },
    {
      "name": "ask_viewing",
      "confidence": 0.09869010746479034
    },
    {
      "name": "faq_apartment_move_in_date",
      "confidence": 0.0869259387254715
    },
    {
      "name": "faq_generic_identity",
      "confidence": 0.06372860819101334
    },
    {
      "name": "no_preferences",
      "confidence": 0.0610114187002182
    },
    {
      "name": "ask_search_apartments",
      "confidence": 0.05694292113184929
    },
    {
      "name": "answer_phone_number",
      "confidence": 0.05568307265639305
    },
    {
      "name": "interested",
      "confidence": 0.04589391499757767
    },
    {
      "name": "ask_talk_to",
      "confidence": 0.04365147277712822
    }
  ],
  "text": "interested"
}
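
For reference, here is where the expected intent ends up in the output above. A small sketch, assuming the parse result is available as a Python dict with the same `intent_ranking` structure; `intent_rank` is a hypothetical helper, not a Rasa function.

```python
def intent_rank(parse_result, expected):
    """Return (1-based rank, confidence) of `expected` in intent_ranking, or None if absent."""
    for position, entry in enumerate(parse_result.get("intent_ranking", []), start=1):
        if entry["name"] == expected:
            return position, entry["confidence"]
    return None


# Ranking copied from the `rasa shell nlu` output above.
parse_result = {
    "text": "interested",
    "intent_ranking": [
        {"name": "faq_apartment_location", "confidence": 0.2963590621948242},
        {"name": "faq_apartment_available", "confidence": 0.19111353158950806},
        {"name": "ask_viewing", "confidence": 0.09869010746479034},
        {"name": "faq_apartment_move_in_date", "confidence": 0.0869259387254715},
        {"name": "faq_generic_identity", "confidence": 0.06372860819101334},
        {"name": "no_preferences", "confidence": 0.0610114187002182},
        {"name": "ask_search_apartments", "confidence": 0.05694292113184929},
        {"name": "answer_phone_number", "confidence": 0.05568307265639305},
        {"name": "interested", "confidence": 0.04589391499757767},
        {"name": "ask_talk_to", "confidence": 0.04365147277712822},
    ],
}

print(intent_rank(parse_result, "interested"))  # (9, 0.04589391499757767)
```

So the message "interested" ranks its own intent ninth out of the ten listed, with a confidence below 0.05. The helper returns `None` when an intent is missing from the ranking altogether, which is the other failure case described earlier.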

Content of configuration file (config.yml) (if relevant): I tried several different ones. This is the most basic one that still fails, for demonstration purposes.

language: "en"
pipeline:
  - name: "DucklingHTTPExtractor"
    url: "http://*************"
    dimensions: ["time", "duration", "amount-of-money", "number", "email", "phone-number", "ordinal", "url"]
    timezone: "America/New_York"
  - name: "SpacyNLP"
    case_sensitive: true
  - name: "SpacyTokenizer"
  - name: "SpacyEntityExtractor"
    dimensions: ["PERSON", "MONEY"]
  - name: "RegexFeaturizer"
  - name: "SpacyFeaturizer"
  - name: "EntitySynonymMapper"
  - name: "EmbeddingIntentClassifier"

Content of domain file (domain.yml) (if relevant): All of the intents are in there

intents:
- affirm
- deny
- reset:
    triggers: action_reset_full
...

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 22 (22 by maintainers)

Most upvoted comments

I think I found the change that made it worse. I’ll prepare a PR to fix it soon, and we’ll make it part of the 1.9 release. Meanwhile, I would recommend trying:

  - name: "DIETClassifier"
    epochs: 100
    entity_recognition: False

instead of EmbeddingIntentClassifier. It is a sequential model, so it takes longer to train.

That’s right, not the best practice ever but well ^^

Could you please send it to support@rasa.com with a link to this issue? Thank you.