rasa: Changing duckling url shouldn't require a model retrain

Rasa version: Rasa Core 0.14.0, Rasa NLU 0.14.4

Python version: 3.6.8

Operating system (windows, osx, …): osx

Issue: Rasa NLU model creation takes the Duckling URL from the config.yml file and writes it into the metadata.json file of the trained model. We use docker-compose for local testing and k8s for cloud test/prod. Docker and k8s network between containers differently: docker-compose uses named containers (e.g. duckling) and k8s uses localhost. So we need a different Duckling URL for local vs. cloud testing. We’ve separated the URLs into environment files, but Rasa training still puts the URL into the metadata.json file of the model. This means the model has to be retrained between local (docker-compose) and cloud (k8s) testing. It makes more sense to keep the URL outside the model, in a config file that can be controlled by environment and build processes, so that the trained model can be copied rather than retrained for no reason other than an environment-specific URL change. E.g. for docker-compose “url”: “http://duckling:8000”, for k8s “url”: “http://localhost:8000”.

Content of configuration file (config.yml):

for docker-compose:

pipeline:
# other stuff
  - name: ner_duckling_http
    url: http://duckling:8000

for cloud k8s:

pipeline:
# other stuff
  - name: ner_duckling_http
    url: http://localhost:8000
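As a stopgap for copying a trained model between environments, the URL stored in the model could be rewritten after training instead of retraining. The following is a minimal sketch, not an official Rasa tool. It assumes metadata.json holds a top-level "pipeline" list of component dicts, each with a "name" key; the helper name set_duckling_url is hypothetical.

```python
import json
from pathlib import Path


def set_duckling_url(metadata_path, new_url, component_name="ner_duckling_http"):
    """Rewrite the Duckling URL stored in a trained model's metadata file.

    Assumption: the file is JSON with a top-level "pipeline" list of
    component dicts, each carrying a "name" key. This layout is assumed
    for illustration, not taken from a documented contract.
    Returns the number of pipeline components that were updated.
    """
    path = Path(metadata_path)
    metadata = json.loads(path.read_text(encoding="utf-8"))

    updated = 0
    for component in metadata.get("pipeline", []):
        # Only touch the Duckling component; leave all others unchanged.
        if component.get("name") == component_name:
            component["url"] = new_url
            updated += 1

    path.write_text(json.dumps(metadata, indent=4), encoding="utf-8")
    return updated
```

A build step could then call, for example, set_duckling_url("path/to/model/metadata.json", "http://localhost:8000") on the copied model before deploying to k8s (the path here is a placeholder).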

Content of domain file (domain.yml) (if used & relevant):

not relevant

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 22 (11 by maintainers)

Most upvoted comments

We have to change the Duckling URL regularly because the dev and prod environments are different, so we need to change it daily. For docker-compose: “url”: “http://duckling:8000”; for k8s: “url”: “http://localhost:8000”.

Because

  • it’s messy in the code
  • it’s only a tiny improvement to skip retraining when the duckling url changes (how often are you changing your duckling url?)

So basically the ratio of benefit to effort is very bad.

We use an env variable for the Duckling URL. However, changing its value while leaving config.yml unchanged is still detected as a change that requires retraining. Is the Duckling component used during training? If not, maybe it would make sense to only substitute the env variable at inference time.
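The inference-time idea can be sketched roughly as follows: the component looks up the URL when it is about to call Duckling, letting an environment variable override whatever value was stored with the model. This is an illustrative sketch under stated assumptions, not Rasa's actual implementation; the variable name DUCKLING_URL and the function resolve_duckling_url are hypothetical.

```python
import os

# Hypothetical environment variable name, chosen for this sketch only.
DUCKLING_URL_ENV = "DUCKLING_URL"


def resolve_duckling_url(component_config, env=None):
    """Return the Duckling URL to use at inference time.

    The environment variable, when set and non-empty, takes precedence
    over the "url" value persisted in the component configuration.
    Returns None if neither source provides a URL.
    """
    env = os.environ if env is None else env
    override = env.get(DUCKLING_URL_ENV)
    if override:
        return override
    return component_config.get("url")


# The same trained config resolves differently per environment.
trained_config = {"name": "ner_duckling_http", "url": "http://duckling:8000"}
local_url = resolve_duckling_url(trained_config, env={})
cloud_url = resolve_duckling_url(
    trained_config, env={DUCKLING_URL_ENV: "http://localhost:8000"}
)
```

Because the lookup happens at call time rather than at training time, the persisted model stays identical across docker-compose and k8s, and only the environment differs.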

I do understand that this may be low in the priority list, though 😃 For us, it meant budgeting the Duckling component in a different way. It could also be explained in the documentation for people deploying Rasa using separate microservices instead of one big Kubernetes.

This problem still exists and is very unintuitive. Every endpoint can be configured in the endpoints.yml except the Duckling one. It makes automated deployment, e.g. via Helm, very messy.

Thanks, fixed it 😃

This might be a workaround, but I think the URL should not be put into the model in the first place. The URL should be read from a YML file (config, .env or endpoints). We’ll give the environment variable a go. Thanks.