simpletransformers: ClassificationModel: predict() hangs forever in uwsgi worker
Describe the bug
When `model.predict()` is invoked in a uwsgi worker, it never returns (it hangs on the line `outputs = model(**inputs)`).
To Reproduce
Steps to reproduce the behavior:
- Train a roberta-base model with simpletransformers 0.48.9.
- Run a uwsgi + flask server that loads the model with `{"use_multiprocessing": False}` before spawning workers, and then runs `model.predict()` when it receives a request (I used the docker image tiangolo/uwsgi-nginx-flask as a base and installed transformers, pytorch, and simpletransformers). A sketch of this setup follows the list.
- Send a request; it hangs on the line `outputs = model(**inputs)`.
- However, if `model.predict()` is called on the same server before the uwsgi workers are spawned (when the server loads, as opposed to when responding to a request), it returns normally with the expected result.
- Another way for `predict()` to return normally is to load the model inside each worker, meaning the first request handled by each worker is delayed by the model loading.
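To make the failing setup concrete, here is a minimal sketch; the module layout, model path, route, and request schema are my assumptions, loosely following the tiangolo/uwsgi-nginx-flask image's conventions:

```python
# main.py -- sketch of the failing setup: the model is created at import
# time, i.e. in the uwsgi master process *before* the workers are forked.
# predict() then hangs inside the workers on outputs = model(**inputs).
from flask import Flask, jsonify, request
from simpletransformers.classification import ClassificationModel

app = Flask(__name__)

# Loaded once, before uwsgi spawns its workers.
model = ClassificationModel(
    "roberta",
    "outputs/",  # hypothetical path to the trained roberta-base model
    use_cuda=False,  # assumption: CPU-only container
    args={"use_multiprocessing": False},
)

@app.route("/predict", methods=["POST"])
def predict():
    texts = request.get_json()["texts"]  # hypothetical request schema
    predictions, _raw = model.predict(texts)  # never returns in a worker
    return jsonify(predictions=[int(p) for p in predictions])
```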
Desktop (please complete the following information):
- Docker image with Debian Buster + python 3.8 + flask + nginx + uwsgi
- transformers version 3.3.1
- simpletransformers version 0.48.9
- torch version 1.6.0
- uwsgi: tested with versions 2.0.17, 2.0.18, 2.0.19, 2.0.19.1
About this issue
- State: closed
- Created 4 years ago
- Comments: 20 (4 by maintainers)
I had the same problem and now I have solved it. My args dict is like below:

`args = {"use_multiprocessing": False, "use_multiprocessing_for_evaluation": False, "process_count": 1}`

Setting `use_multiprocessing=False` should fix it.

@jmeisele I use uwsgi (wsgi).
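For context, these flags are passed when the model is constructed; a minimal sketch, assuming a model saved at a hypothetical `outputs/` path:

```python
# Sketch of the suggested fix applied at load time (path is hypothetical).
from simpletransformers.classification import ClassificationModel

model = ClassificationModel(
    "roberta",
    "outputs/",
    use_cuda=False,  # assumption: CPU-only container
    args={
        "use_multiprocessing": False,
        "use_multiprocessing_for_evaluation": False,
        "process_count": 1,
    },
)
```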
To delay the model loading into the worker you can use a singleton in `classifier.py` (with a very basic lazy singleton), used from the `main.py` file referenced in `uwsgi.ini`. But I am still unsure if this is the proper way to load and use the model.
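The original snippets were not preserved above; here is a minimal sketch of what such a lazy singleton could look like (the model path, route, and request schema are assumptions, not the original code):

```python
# classifier.py -- a very basic lazy singleton (reconstruction): the model
# is created on first use, i.e. inside the worker that handles the first
# request, never in the uwsgi master process.
from simpletransformers.classification import ClassificationModel

_model = None

def get_model():
    global _model
    if _model is None:
        _model = ClassificationModel(
            "roberta",
            "outputs/",  # hypothetical path to the trained model
            use_cuda=False,  # assumption: CPU-only container
            args={"use_multiprocessing": False},
        )
    return _model
```

```python
# main.py -- referenced in uwsgi.ini; nothing touches the model at import
# time, so the master process never initializes torch before forking.
from flask import Flask, jsonify, request
from classifier import get_model

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    texts = request.get_json()["texts"]  # hypothetical request schema
    predictions, _raw = get_model().predict(texts)
    return jsonify(predictions=[int(p) for p in predictions])
```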
@ThilinaRajapakse There is an issue in your snippet: if I run it I get `TypeError: 'ClassificationModel' object is not callable`. I looked at the code of `ClassificationModel.predict` and it calls `self.model(**inputs)`, so I instead ran `outputs = model.model(**inputs)`. It still hangs the same way on `model.model(**inputs)` when the model is loaded before the workers are spawned and `prediction_test` is called from a worker.

For now, we've updated the server so it loads the model in each worker (last point of my initial message), which means that the first request of a worker after it's spawned is always slower. Is that the recommended approach?
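If per-worker loading is the accepted workaround, the start-up cost can be paid at fork time rather than on the first request. This approach is not mentioned in the thread, but a sketch using uwsgi's `postfork` hook (`uwsgidecorators` is only importable when running under uwsgi; the model path is hypothetical):

```python
# Load the model once per worker, right after uwsgi forks it, so the first
# request served by each worker is not delayed by model loading.
from uwsgidecorators import postfork
from simpletransformers.classification import ClassificationModel

model = None

@postfork
def load_model():
    """Runs in every worker process immediately after the fork."""
    global model
    model = ClassificationModel(
        "roberta",
        "outputs/",  # hypothetical model path
        use_cuda=False,  # assumption: CPU-only container
        args={"use_multiprocessing": False},
    )
```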