jina: [Advise: The API call failed because the CUDA driver and runtime could not be initialized. ]
Describe your proposal/problem
Dear Jina Team,
Recently, I tried the nlp-simple example and replaced TransformerTorchEncoder with TextPaddlehubEncoder. However, I got the following error:
Traceback (most recent call last):
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/jina/peapods/runtimes/zmq/zed.py", line 73, in _load_executor
self._executor = BaseExecutor.load_config(
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/jina/jaml/__init__.py", line 531, in load_config
return JAML.load(tag_yml, substitute=False)
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/jina/jaml/__init__.py", line 89, in load
r = yaml.load(stream, Loader=JinaLoader)
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/yaml/__init__.py", line 114, in load
return loader.get_single_data()
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/yaml/constructor.py", line 51, in get_single_data
return self.construct_document(node)
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/yaml/constructor.py", line 55, in construct_document
data = self.construct_object(node)
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/yaml/constructor.py", line 100, in construct_object
data = constructor(self, node)
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/jina/jaml/__init__.py", line 422, in _from_yaml
return get_parser(cls, version=data.get('version', None)).parse(cls, data)
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/jina/jaml/parsers/executor/legacy.py", line 130, in parse
obj = cls(
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/jina/executors/__init__.py", line 82, in __call__
getattr(obj, '_post_init_wrapper', lambda *x: None)(m, r)
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/jina/executors/__init__.py", line 174, in _post_init_wrapper
self.post_init()
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/jina/hub/encoders/nlp/TextPaddlehubEncoder/__init__.py", line 38, in post_init
self.model = hub.Module(name=self.model_name)
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/paddlehub/module/module.py", line 171, in __new__
module = cls.init_with_name(
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/paddlehub/module/module.py", line 263, in init_with_name
user_module_cls = manager.install(name=name, version=version, source=source, update=update, branch=branch)
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/paddlehub/module/manager.py", line 188, in install
return self._install_from_name(name, version)
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/paddlehub/module/manager.py", line 263, in _install_from_name
return self._install_from_url(item['url'])
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/paddlehub/module/manager.py", line 256, in _install_from_url
return self._install_from_archive(file)
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/paddlehub/module/manager.py", line 361, in _install_from_archive
return self._install_from_directory(directory)
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/paddlehub/module/manager.py", line 345, in _install_from_directory
hub_module_cls = HubModule.load(self._get_normalized_path(module_info.name))
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/paddlehub/module/module.py", line 219, in load
paddle.set_device(place)
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/paddle/device.py", line 166, in set_device
framework._set_expected_place(place)
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/paddle/fluid/framework.py", line 317, in _set_expected_place
_set_dygraph_tracer_expected_place(place)
File "/home/work/renyuanhang/my_anaconda3/envs/jina_rocket2/lib/python3.8/site-packages/paddle/fluid/framework.py", line 311, in _set_dygraph_tracer_expected_place
_dygraph_tracer_._expected_place = place
OSError: (External) Cuda error(3), initialization error.
[Advise: The API call failed because the CUDA driver and runtime could not be initialized. ] (at /paddle/paddle/fluid/platform/gpu_info.cc:229)
It seems to be related to https://github.com/PaddlePaddle/Paddle/issues/25185.
However, I have no idea how to fix this. Could you help me figure it out? Thank you very much!
Environment
paddlehub==2.0.0 paddlenlp==2.0.1 paddlepaddle-gpu==2.0.2.post90 jina==1.3.0
About this issue
- State: closed
- Created 3 years ago
- Comments: 47 (28 by maintainers)
Hi @ryh95, I looked into the issue. It seems to be related not only to Paddle but also to PyTorch; see this thread. Basically, Jina uses multiprocessing with `fork` as the start method, which CUDA does not support well, and Jina has to stick with `fork` as the default start method. As far as I know, serving your solution without enabling the GPU is the easier way to make it work.
Hi, @alexcg1. I tried your suggestions: I started from the `wikipedia-sentences` example and replaced the `TransformerTorchEncoder` with the `TextPaddlehubEncoder`. However, I still encountered the aforementioned `Cuda error(3)` with Jina 1.3. Maybe this problem is related to multiprocessing, as this issue points out?
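For readers who land here with the same error, the two workarounds hinted at in the comments above (running without the GPU, or avoiding `fork` in your own code) can be sketched roughly as follows. This is a minimal illustration, not Jina's or Paddle's API: the helper names `cpu_only_env` and `cuda_safe_context` are hypothetical, and whether `TextPaddlehubEncoder` falls back to CPU once the GPUs are hidden is an assumption this thread does not confirm.

```python
import multiprocessing as mp
import os


def cpu_only_env(base_env=None):
    """Return a copy of the environment with every CUDA device hidden.

    Hypothetical helper. An empty CUDA_VISIBLE_DEVICES makes the CUDA
    runtime report no devices to processes started with this environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ""
    return env


def cuda_safe_context():
    """Return a multiprocessing context that starts fresh interpreters.

    Hypothetical helper. "spawn" avoids inheriting CUDA state from the
    parent, which is the problem with "fork". This applies to standalone
    scripts only; per the maintainer's comment, Jina itself stays on "fork".
    """
    return mp.get_context("spawn")
```

To try the CPU-only route, apply the variable before the deep learning framework is imported, for example `os.environ.update(cpu_only_env())` at the very top of the script, or export `CUDA_VISIBLE_DEVICES=""` in the shell that launches the Flow. In a standalone script, create workers with `cuda_safe_context().Process(...)` under an `if __name__ == "__main__":` guard.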