rasa: Including `RulePolicy` leads to infinite max_history being used to deduplicate training trackers
Rasa version: 2.4.1
Issue:
When rules/RulePolicy are included in dialogue management, in order to check for conflicts with stories, effectively infinite max_history is used. This greatly increases the time it takes to load and process dialogue data.
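To make the cost concrete, here is a toy sketch of history-bounded deduplication. It is not Rasa's actual implementation: trackers are modelled as plain lists of state strings, and `dedup_key` and `deduplicate` are hypothetical helpers written only for this illustration. With a bounded `max_history` only the last few states form the comparison key; with `None` (standing in for the "effectively infinite" case) the whole history has to be compared.

```python
from typing import List, Optional, Tuple


def dedup_key(states: List[str], max_history: Optional[int]) -> Tuple[str, ...]:
    """Return the slice of the history used to compare two trackers.

    max_history=None mimics the "effectively infinite" setting: the key
    is the complete history, so it grows with the tracker length.
    """
    if max_history is None:
        return tuple(states)
    if max_history < 1:
        raise ValueError("max_history must be at least 1 or None")
    return tuple(states[-max_history:])


def deduplicate(
    trackers: List[List[str]], max_history: Optional[int]
) -> List[List[str]]:
    """Keep the first tracker seen for every distinct key."""
    seen = set()
    unique = []
    for states in trackers:
        key = dedup_key(states, max_history)
        if key not in seen:
            seen.add(key)
            unique.append(states)
    return unique


# Made-up example data, not taken from any real training set.
trackers = [
    ["greet", "utter_greet", "ask_price", "utter_price"],
    ["goodbye", "utter_goodbye", "ask_price", "utter_price"],
    ["greet", "utter_greet", "ask_price", "utter_price"],
]

unique_bounded = deduplicate(trackers, max_history=2)
unique_unbounded = deduplicate(trackers, max_history=None)
```

In this toy data the bounded variant collapses all three trackers into one, while the unbounded variant keeps two and has to hash every state of every tracker to do so.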
Command or request that led to error (edit: train, not validate):
rasa train core
Definition of Done
- set `max_history` to lowest possible during `rasa train core` based on the training data
About this issue
- State: closed
- Created 3 years ago
- Comments: 21 (21 by maintainers)
Sorry, it was the wrong link 🙈 Updated it with the correct one 👍🏻
We can set a `max_size` for the cache 👍🏻 It’s just a couple of JSONs and their dict representations, so I’d expect it to be tiny.
So next steps:
Posting the details for the memory usage here as I haven’t created the PR yet and need to put the results somewhere 😄
Investigation results on `rasa-demo`:
- `RulePolicy` doesn’t change this
- `max_history` for the rule generation leads to contradictions 🤔
- `RulePolicy` training (see results here - viewable with snakeviz). The results stress that the contradiction check takes the longest amount of time. In my opinion this clearly shows that it’s not related to `max_history` but rather to the number of predictions. The biggest part seems to be caused by the `json.loads` calls. If we actually wanna do something then we need to revisit the mechanics of the policy itself (e.g. how data is serialized) and how we can speed up predictions. This should in my opinion be done in a separate issue.

A quick fix for this issue would be to use `lru_cache` (with `max_size` > 1000) on `_rule_key_to_state`. By doing this I lowered the contradiction check time from 13 seconds to ~4 seconds. The issue is that it addresses symptoms rather than causes.

Yes, you did! Rule trackers are already deduplicated separately. So your 2nd approach would be completely feasible.
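The `lru_cache` quick fix mentioned above could look roughly like the sketch below. This is an assumption-laden illustration, not the code from the policy: `rule_key_to_state` here is a hypothetical standalone function that only stands in for `_rule_key_to_state`, and the example key is made up. Note that `functools.lru_cache` spells its parameter `maxsize`.

```python
import json
from functools import lru_cache
from typing import Any, Dict, List


@lru_cache(maxsize=1024)  # bounded cache; 1024 picked only to be > 1000
def rule_key_to_state(rule_key: str) -> List[Dict[str, Any]]:
    """Deserialize a JSON-encoded rule key (hypothetical stand-in).

    Repeated calls with the same string skip json.loads and return the
    cached object instead.
    """
    return json.loads(rule_key)


# Made-up key, used only to exercise the cache.
key = json.dumps([{"prev_action": {"action_name": "action_listen"}}])

first = rule_key_to_state(key)   # cache miss: json.loads runs
second = rule_key_to_state(key)  # cache hit: no parsing
info = rule_key_to_state.cache_info()
```

One caveat of this approach: the cache hands back the same mutable object on every hit, so callers must not modify the returned states in place.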
I’ll do some debugging later to find out where we spend so much time exactly (during deduplication or featurization) and then maybe a PR so we have something more tangible to discuss.
That makes more sense I think 🤔 I think what you’re suggesting then is to set `max_history` here in an automated fashion? What do you think of that @samsucik?