rasa: PolyAI Models - model.tar.gz - No Longer Available?

Rasa version: 2.0.0rc2

Python version: 3.6.12

Operating system (windows, osx, …): OSX

Issue: rasa init fails. The model it tries to download is no longer available at its source; it was taken down about 2 hours before this issue was posted.

Edited to include the message from https://github.com/PolyAI-LDN/polyai-models:

After much consideration, the PolyAI team has decided to take down the ConveRT models from the public domain. Over the course of last year, we have been very excited to see ConveRT gaining a huge amount of traction in various communities - that was something we didn’t expect when we first released it. However, with the amount of business growing and the shift of our team’s priorities, we no longer have the resources to responsibly maintain or provide support for these models. PolyAI is working to create end-to-end voice assistants. If you’re interested in helping us, check out our careers page at polyai.com/careers. On the other hand, if you are interested in knowing how our solutions can help you transform your contact center, please get in touch at contact@polyai.com.

Error (including full traceback):

Warning: Output is not to a terminal (fd=1).
Warning: Input is not to a terminal (fd=0).
2020-09-28 04:00:48 INFO     absl  - Using /tmp/tfhub_modules to cache modules.
2020-09-28 04:00:48 INFO     absl  - Downloading TF-Hub Module 'https://github.com/PolyAI-LDN/polyai-models/releases/download/v1.0/model.tar.gz'.
Welcome to Rasa! 🤖

To get started quickly, an initial project will be created.
If you need some help, check out the documentation at https://rasa.com/docs/rasa.

Created project directory at '/usr/src/app'.
Finished creating project structure.
Training an initial model...
The configuration for policies and pipeline was chosen automatically. It was written into the config file at './config.yml'.
Training NLU model...
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/rasa/utils/train_utils.py", line 142, in load_tf_hub_model
    return tfhub.load(model_url)
  File "/usr/local/lib/python3.6/site-packages/tensorflow_hub/module_v2.py", line 97, in load
    module_path = resolve(handle)
  File "/usr/local/lib/python3.6/site-packages/tensorflow_hub/module_v2.py", line 53, in resolve
    return registry.resolver(handle)
  File "/usr/local/lib/python3.6/site-packages/tensorflow_hub/registry.py", line 42, in __call__
    return impl(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/tensorflow_hub/compressed_module_resolver.py", line 88, in __call__
    self._lock_file_timeout_sec())
  File "/usr/local/lib/python3.6/site-packages/tensorflow_hub/resolver.py", line 415, in atomic_download
    download_fn(handle, tmp_dir)
  File "/usr/local/lib/python3.6/site-packages/tensorflow_hub/compressed_module_resolver.py", line 83, in download
    response = self._call_urlopen(request)
  File "/usr/local/lib/python3.6/site-packages/tensorflow_hub/compressed_module_resolver.py", line 96, in _call_urlopen
    return url.urlopen(request)
  File "/usr/local/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/usr/local/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/local/lib/python3.6/urllib/request.py", line 570, in error
    return self._call_chain(*args)
  File "/usr/local/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/usr/local/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/rasa", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/site-packages/rasa/__main__.py", line 113, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/usr/local/lib/python3.6/site-packages/rasa/cli/scaffold.py", line 218, in run
    init_project(args, path)
  File "/usr/local/lib/python3.6/site-packages/rasa/cli/scaffold.py", line 128, in init_project
    print_train_or_instructions(args, path)
  File "/usr/local/lib/python3.6/site-packages/rasa/cli/scaffold.py", line 68, in print_train_or_instructions
    args.model = rasa.train(domain, config, training_files, output)
  File "/usr/local/lib/python3.6/site-packages/rasa/train.py", line 55, in train
    loop,
  File "/usr/local/lib/python3.6/site-packages/rasa/utils/common.py", line 300, in run_in_loop
    result = loop.run_until_complete(f)
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/usr/local/lib/python3.6/site-packages/rasa/train.py", line 110, in train_async
    nlu_additional_arguments=nlu_additional_arguments,
  File "/usr/local/lib/python3.6/site-packages/rasa/train.py", line 207, in _train_async_internal
    old_model_zip_path=old_model,
  File "/usr/local/lib/python3.6/site-packages/rasa/train.py", line 246, in _do_training
    additional_arguments=nlu_additional_arguments,
  File "/usr/local/lib/python3.6/site-packages/rasa/train.py", line 543, in _train_nlu_with_validated_data
    **additional_arguments,
  File "/usr/local/lib/python3.6/site-packages/rasa/nlu/train.py", line 97, in train
    trainer = Trainer(nlu_config, component_builder)
  File "/usr/local/lib/python3.6/site-packages/rasa/nlu/model.py", line 159, in __init__
    self.pipeline = self._build_pipeline(cfg, component_builder)
  File "/usr/local/lib/python3.6/site-packages/rasa/nlu/model.py", line 171, in _build_pipeline
    component = component_builder.create_component(component_cfg, cfg)
  File "/usr/local/lib/python3.6/site-packages/rasa/nlu/components.py", line 760, in create_component
    component = registry.create_component_by_config(component_config, cfg)
  File "/usr/local/lib/python3.6/site-packages/rasa/nlu/registry.py", line 163, in create_component_by_config
    return component_class.create(component_config, config)
  File "/usr/local/lib/python3.6/site-packages/rasa/nlu/components.py", line 464, in create
    return cls(component_config)
  File "/usr/local/lib/python3.6/site-packages/rasa/nlu/tokenizers/convert_tokenizer.py", line 44, in __init__
    self.module = train_utils.load_tf_hub_model(self.model_url)
  File "/usr/local/lib/python3.6/site-packages/rasa/utils/train_utils.py", line 146, in load_tf_hub_model
    return tfhub.load(model_url)
  File "/usr/local/lib/python3.6/site-packages/tensorflow_hub/module_v2.py", line 97, in load
    module_path = resolve(handle)
  File "/usr/local/lib/python3.6/site-packages/tensorflow_hub/module_v2.py", line 53, in resolve
    return registry.resolver(handle)
  File "/usr/local/lib/python3.6/site-packages/tensorflow_hub/registry.py", line 42, in __call__
    return impl(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/tensorflow_hub/compressed_module_resolver.py", line 88, in __call__
    self._lock_file_timeout_sec())
  File "/usr/local/lib/python3.6/site-packages/tensorflow_hub/resolver.py", line 415, in atomic_download
    download_fn(handle, tmp_dir)
  File "/usr/local/lib/python3.6/site-packages/tensorflow_hub/compressed_module_resolver.py", line 83, in download
    response = self._call_urlopen(request)
  File "/usr/local/lib/python3.6/site-packages/tensorflow_hub/compressed_module_resolver.py", line 96, in _call_urlopen
    return url.urlopen(request)
  File "/usr/local/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/usr/local/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/local/lib/python3.6/urllib/request.py", line 570, in error
    return self._call_chain(*args)
  File "/usr/local/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/usr/local/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
ERROR: Service 'rasa' failed to build: The command '/bin/sh -c rasa init --no-prompt' returned a non-zero code: 1

Command or request that led to error:

rasa init

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 2
  • Comments: 26 (6 by maintainers)

Most upvoted comments

The previously publicly available ConveRT models appear to have been licensed under the Apache 2.0 license, making redistribution permissible. If anyone has the official files for the ConveRT models, it would be great to have them redistributed under the same license here.

I’ve repackaged the loaded model I have running in production, and released it here (under the Apache 2.0 license): https://github.com/connorbrinton/polyai-models/releases/tag/v1.0

To use this model, you’ll need to either:

  • Retrain your models, with the new model_url configuration parameter for ConveRTTokenizer set to the URL of the new model location, or
  • Monkey-patch the loader to redirect requests for the original model locations to the new model location
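
The second option could look roughly like the sketch below. It is not part of Rasa or TF-Hub: it assumes the ConveRT components end up calling tensorflow_hub.load, as the traceback in the issue suggests, and with_redirects and install_patch are hypothetical helper names chosen for illustration.

```python
import functools

# Original (now removed) location and the community mirror named in this thread.
OLD_URL = "https://github.com/PolyAI-LDN/polyai-models/releases/download/v1.0/model.tar.gz"
NEW_URL = "https://github.com/connorbrinton/polyai-models/releases/download/v1.0/model.tar.gz"


def with_redirects(loader, redirects):
    """Wrap a loader so that known-dead handles are swapped for their mirrors."""

    @functools.wraps(loader)
    def wrapper(handle, *args, **kwargs):
        # Handles without a redirect entry are passed through unchanged.
        return loader(redirects.get(handle, handle), *args, **kwargs)

    return wrapper


def install_patch():
    """Hypothetical helper: call once, before any Rasa model is trained or loaded."""
    import tensorflow_hub  # imported lazily so the wrapper can be tested without it

    tensorflow_hub.load = with_redirects(tensorflow_hub.load, {OLD_URL: NEW_URL})
```

The patch has to be installed in the same Python process that trains or loads the model, so it only helps when Rasa is started from your own entry point rather than from the stock rasa command.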

Unfortunately, the ConveRT model was taken offline. We are working on a long-term solution and will keep you updated. In the meantime, we recommend removing ConveRT from your pipeline and using only supervised embeddings, such as CountVectorsFeaturizer. For example, you could change the default config.yml created by rasa init to the following to train your model.

language: en

pipeline:
- name: WhitespaceTokenizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
  epochs: 100
- name: EntitySynonymMapper
- name: ResponseSelector
  epochs: 100

policies:
# # No configuration for policies was provided. The following default policies were used to train your model.
# # If you'd like to customize them, uncomment and adjust the policies.
# # See https://rasa.com/docs/rasa/policies for more information.
#   - name: MemoizationPolicy
#   - name: TEDPolicy
#     max_history: 5
#     epochs: 100
#   - name: RulePolicy

We are sorry for any inconvenience!

@lohithpro It’s been a while, so I don’t remember exactly how I pulled the model off of my production servers, but I think I directly copied the downloaded model .tar.gz from the TFHUB_CACHE_DIR directory. I believe that would give the original PolyAI model pre-fine-tuning, but I could be wrong 😅

I would definitely recommend that people only rely on my uploaded ConveRT models as a stop-gap solution. NLP moves fast, so there are probably much better models out there already. I now use HuggingFace (LanguageModelFeaturizer) models instead of ConveRT models, and we’re investigating Google’s new conditional masked language model (which hasn’t gotten a lot of press yet). It’s an exciting time to be working with dialog systems! 😄

@magdalini-anastasiadou I had the same problem.
I ended up hacking rasa/nlu/featurizers/dense_featurizer/convert_featurizer.py:27 and rasa/nlu/tokenizers/convert_tokenizer.py:10 just to get up and running: I commented out those lines and replaced them with the model URL https://github.com/connorbrinton/polyai-models/releases/download/v1.0/model.tar.gz, and that worked. FWIW, I also removed the model_url param from config.yml to avoid downloading the model twice, since it is fairly large.

With PolyAI removing the ConveRT models, does Rasa have another recommended configuration? I was a bit worried about continuing to use ConveRT given the current licensing. For now I’ve gone back to using the default configuration for Rasa. Any clarification on this issue would be appreciated. Thanks!

@magdalini-anastasiadou I also had this same issue. I fixed it by also setting the model_url property under ConveRTFeaturizer in the pipeline.

@aizest Have you upgraded to Rasa 1.10.14? The changes making the model URL configurable are very recent, so you may need to upgrade to a version of Rasa that includes those changes in order for the model_url property to work 🙂

I have Rasa 1.10.14 and started the config file with:

pipeline:
  - name: ConveRTTokenizer
    model_url: https://github.com/connorbrinton/polyai-models/releases/download/v1.0/model.tar.gz

and it still tries to download the previous URL, ending with an HTTP Error 404: Not Found. Any help on how to solve this? I tried deleting everything from /tmp/tfhub_modules and also altering the code in convert_tokenizer.py, but the result is the same.

Training NLU model...
2020-10-07 15:38:15 INFO     absl  - Using /tmp/tfhub_modules to cache modules.
2020-10-07 15:38:15 INFO     absl  - Downloading TF-Hub Module 'https://github.com/connorbrinton/polyai-models/releases/download/v1.0/model.tar.gz'.
2020-10-07 15:39:20 INFO     absl  - Downloading https://github.com/connorbrinton/polyai-models/releases/download/v1.0/model.tar.gz: 52.35MB
2020-10-07 15:39:37 INFO     absl  - Downloading https://github.com/connorbrinton/polyai-models/releases/download/v1.0/model.tar.gz: 148.59MB
2020-10-07 15:39:38 INFO     absl  - Downloaded https://github.com/connorbrinton/polyai-models/releases/download/v1.0/model.tar.gz, Total size: 152.02MB
2020-10-07 15:39:38 INFO     absl  - Downloaded TF-Hub Module 'https://github.com/connorbrinton/polyai-models/releases/download/v1.0/model.tar.gz'.
2020-10-07 15:39:43 INFO     absl  - Downloading TF-Hub Module 'https://github.com/PolyAI-LDN/polyai-models/releases/download/v1.0/model.tar.gz'.
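
The log above fetches the new URL once and then requests the original PolyAI URL, which would be consistent with model_url being set on ConveRTTokenizer but not on ConveRTFeaturizer (the fix another commenter describes). A small check along these lines can flag that case. It is only a sketch: convert_components_missing_model_url is a hypothetical helper, the pipeline is assumed to be already parsed into a list of dicts, and the ConveRTFeaturizer entry in the example is an assumption about the poster's config rather than something shown in the thread.

```python
NEW_URL = "https://github.com/connorbrinton/polyai-models/releases/download/v1.0/model.tar.gz"


def convert_components_missing_model_url(pipeline):
    """Return the names of ConveRT components that would use the default URL."""
    missing = []
    for component in pipeline:
        name = str(component.get("name", ""))
        # Every ConveRT component loads the model itself, so each needs the override.
        if name.startswith("ConveRT") and not component.get("model_url"):
            missing.append(name)
    return missing


# Assumed shape of the config from the comment above, after YAML parsing.
pipeline = [
    {"name": "ConveRTTokenizer", "model_url": NEW_URL},
    {"name": "ConveRTFeaturizer"},  # assumption: no model_url set here
    {"name": "DIETClassifier", "epochs": 100},
]

print(convert_components_missing_model_url(pipeline))  # ['ConveRTFeaturizer']
```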

@connorbrinton Got it. Have to make some changes in the code, as we’re using a lower version. Really appreciate the quick feedback!