rasa: Training Rasa NLU docker image on Amazon Linux fails with BrokenProcessPool error
Rasa NLU version: Latest
Operating system (windows, osx, …): Amazon Linux on AWS
Content of model configuration file: Attached
Issue: I can run and train Rasa NLU on Ubuntu successfully, but when I do the same on Amazon Linux, the training step fails with an error.
Could any of the dev team help me resolve this issue?
Following are the details
Step 1: Running the Docker image with volume mapping
docker run -d -p 5000:5000 -v "$(pwd)/data:/app/data" -v "$(pwd)/logs:/app/logs" -v "$(pwd)/proj:/app/projects" rasa/rasa_nlu:latest-full
Step 2: Training with demo data
cat demo-rasa.json | curl --request POST --header 'content-type: application/json' -d@- --url 'localhost:5000/train?project=test_model'
This fails with an error. The container logs show the following:
docker logs 7b831341f489
2018-07-12 09:02:12+0000 [-] Log opened.
2018-07-12 09:02:12+0000 [-] Site starting on 5000
2018-07-12 09:02:12+0000 [-] Starting factory <twisted.web.server.Site object at 0x7f4990e12438>
2018-07-12 09:02:16+0000 [-] "172.17.0.1" - - [12/Jul/2018:09:02:15 +0000] "GET /status HTTP/1.1" 200 294 "-" "curl/7.53.1"
2018-07-12 09:02:26+0000 [-] 2018-07-12 09:02:26 WARNING rasa_nlu.data_router - [Failure instance: Traceback (failure with no frames): <class 'concurrent.futures.process.BrokenProcessPool'>: A process in the process pool was terminated abruptly while the future was running or pending.
2018-07-12 09:02:26+0000 [-] ]
2018-07-12 09:02:26+0000 [-] Unhandled Error
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/site-packages/twisted/internet/defer.py", line 500, in errback
        self._startRunCallbacks(fail)
      File "/usr/local/lib/python3.6/site-packages/twisted/internet/defer.py", line 567, in _startRunCallbacks
        self._runCallbacks()
      File "/usr/local/lib/python3.6/site-packages/twisted/internet/defer.py", line 653, in _runCallbacks
        current.result = callback(current.result, *args, **kw)
      File "/usr/local/lib/python3.6/site-packages/twisted/internet/defer.py", line 1442, in gotResult
        _inlineCallbacks(r, g, deferred)
    --- <exception caught here> ---
      File "/usr/local/lib/python3.6/site-packages/twisted/internet/defer.py", line 1384, in _inlineCallbacks
        result = result.throwExceptionIntoGenerator(g)
      File "/usr/local/lib/python3.6/site-packages/twisted/python/failure.py", line 422, in throwExceptionIntoGenerator
        return g.throw(self.type, self.value, self.tb)
      File "/app/rasa_nlu/server.py", line 351, in train
        RasaNLUModelConfig(model_config), model_name)
      File "/usr/local/lib/python3.6/site-packages/twisted/internet/defer.py", line 653, in _runCallbacks
        current.result = callback(current.result, *args, **kw)
      File "/app/rasa_nlu/data_router.py", line 325, in training_errback
        failure.value.failed_target_project)
    builtins.AttributeError: 'BrokenProcessPool' object has no attribute 'failed_target_project'
2018-07-12 09:02:26+0000 [-] "172.17.0.1" - - [12/Jul/2018:09:02:24 +0000] "POST /train?project=test_model HTTP/1.1" 500 5711 "-" "curl/7.53.1"
demo-rasa.zip
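A note on reading the traceback: the final AttributeError is a secondary failure. The error callback reads `failed_target_project` from the exception it receives, but a BrokenProcessPool does not carry that attribute, so the original cause (a worker process dying) gets masked. The following is only a minimal sketch of a more defensive handler; `describe_training_failure` is a hypothetical helper written for illustration, not Rasa's code or the fix that was eventually applied.

```python
from concurrent.futures.process import BrokenProcessPool


def describe_training_failure(exc: BaseException) -> str:
    """Build an error message without assuming custom attributes exist.

    Hypothetical helper for illustration only (not part of rasa_nlu).
    """
    # `failed_target_project` is the attribute named in the traceback;
    # exceptions such as BrokenProcessPool do not define it, so fall
    # back to None instead of raising AttributeError.
    project = getattr(exc, "failed_target_project", None)
    if project is None:
        return f"training failed: {type(exc).__name__}: {exc}"
    return f"training failed for project {project!r}: {exc}"


if __name__ == "__main__":
    pool_error = BrokenProcessPool(
        "A process in the process pool was terminated abruptly"
    )
    print(describe_training_failure(pool_error))
```

With a handler along these lines, the log would surface the BrokenProcessPool message directly, which points more clearly at a killed worker process.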
About this issue
- State: closed
- Created 6 years ago
- Comments: 20 (15 by maintainers)
@ricwo Yes, I hit the issue again with a large dataset. After further investigation, I found that it is also related to memory: I cannot reproduce it locally, while the server has only ~700 MB available. After adding 1 GB of swap, I did not encounter the error again.
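For anyone who wants to check the memory theory on their own host before training, here is a rough sketch that reads `/proc/meminfo` and reports available RAM plus free swap. The helper names `parse_meminfo` and `headroom_mib` are made up for this example, and how much headroom training actually needs depends on the dataset and pipeline, so no threshold is assumed here.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def headroom_mib(meminfo: dict) -> float:
    """Return MemAvailable plus SwapFree, converted from kB to MiB."""
    kib = meminfo.get("MemAvailable", 0) + meminfo.get("SwapFree", 0)
    return kib / 1024


if __name__ == "__main__":
    # Linux only: /proc/meminfo does not exist on other platforms.
    info = parse_meminfo(Path("/proc/meminfo").read_text())
    print(f"{headroom_mib(info):.0f} MiB available (RAM + swap)")
```

Running it on the server and on the local machine gives a quick comparison of the two environments.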
@marrouchi @znat @shashijeevan are you still experiencing this issue?
@ricwo you worked on fixing this – any ideas?