server: Python InferenceServerClient (http) should not call close() from __del__

Description The Python HTTP InferenceServerClient implements the __del__ finalizer, which calls self.close(). The problem is that close() goes through gevent, which must always be called from the same thread, while Python can call __del__ from any thread. So when the client is finalized on a different thread, you get errors like this:

Exception ignored in: <function InferenceServerClient.__del__ at 0x7fdf0d0c7a60>
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/tritonclient/http/__init__.py", line 226, in __del__
    self.close()
  File "/opt/conda/lib/python3.8/site-packages/tritonclient/http/__init__.py", line 233, in close
    self._pool.join()
  File "/opt/conda/lib/python3.8/site-packages/gevent/pool.py", line 430, in join
    result = self._empty_event.wait(timeout=timeout)
  File "src/gevent/event.py", line 163, in gevent._gevent_cevent.Event.wait
  File "src/gevent/_abstract_linkable.py", line 509, in gevent._gevent_c_abstract_linkable.AbstractLinkable._wait
  File "src/gevent/_abstract_linkable.py", line 206, in gevent._gevent_c_abstract_linkable.AbstractLinkable._capture_hub
gevent.exceptions.InvalidThreadUseError: (<Hub '' at 0x7fde6e5f6580 epoll default pending=0 ref=0 fileno=5 resolver=<gevent.resolver.thread.Resolver at 0x7fdf44ba2190 pool=<ThreadPool at 0x7fde6e552be0 tasks=0 size=1 maxsize=10 hub=<Hub at 0x7fde6e5f6580 thread_ident=0x7fdf4688c740>>> threadpool=<ThreadPool at 0x7fde6e552be0 tasks=0 size=1 maxsize=10 hub=<Hub at 0x7fde6e5f6580 thread_ident=0x7fdf4688c740>> thread_ident=0x7fdf4688c740>, <Hub '' at 0x7fde6e4fc7c0 epoll pending=0 ref=0 fileno=19 thread_ident=0x7fddae7ed700>, <greenlet.greenlet object at 0x7fdf3f1f8b40 (otid=0x7fdf3f20f680) current active started main>)

I am already explicitly closing the client in the correct thread, but InferenceServerClient does not realize it's already closed and so attempts to close it again.
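The double close described above could be avoided with a guard along these lines. This is only a minimal sketch, not tritonclient code: GuardedClient is a hypothetical wrapper, and the only assumption about the wrapped object is that it exposes a close() method, as the traceback shows the real client does.

```python
import threading


class GuardedClient:
    """Hypothetical wrapper: closes the wrapped client at most once,
    and never from a finalizer running on a foreign thread."""

    def __init__(self, client):
        self._client = client
        self._owner = threading.get_ident()  # thread that created the wrapper
        self._closed = False                 # set last, so __del__ can rely on it

    def close(self):
        # Idempotent: a second call is a no-op.
        if self._closed:
            return
        self._closed = True
        self._client.close()

    def __del__(self):
        # If __init__ never finished, _closed is missing; treat that as closed.
        if getattr(self, "_closed", True):
            return
        # Skip cleanup on a foreign thread rather than touch thread-bound state.
        if threading.get_ident() == self._owner:
            self.close()
```

With a guard like this, an explicit close() on the owning thread means the later finalizer does nothing, whichever thread it runs on.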

Triton Information tritonclient[http] 2.19.0

To Reproduce Construct an InferenceServerClient in a multithreaded Python app.

Note that the error only appears when Python happens to call __del__ from a different thread, which can't be controlled.
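In a real application the timing of finalization is up to the garbage collector, but the underlying effect can be forced in isolation. The sketch below uses only the standard library and relies on CPython's cyclic collector; Tracked and demo are hypothetical names, and no Triton or gevent code is involved.

```python
import gc
import threading


class Tracked:
    """Stand-in for a client object; records which thread runs its finalizer."""

    finalized_on = None  # ident of the thread that ran __del__

    def __init__(self):
        self.created_on = threading.get_ident()
        self.cycle = self  # reference cycle: only the cyclic GC can free it

    def __del__(self):
        Tracked.finalized_on = threading.get_ident()


def demo():
    """Create an object on the calling thread, collect it on a worker thread."""
    gc.disable()  # keep automatic collection from running on the creating thread
    try:
        obj = Tracked()
        created_on = obj.created_on
        del obj  # unreachable now, but kept alive by the cycle

        # The collector runs on the worker thread, so __del__ runs there too.
        worker = threading.Thread(target=gc.collect)
        worker.start()
        worker.join()
    finally:
        gc.enable()
    return created_on, Tracked.finalized_on
```

Calling demo() returns two thread identifiers, one for the creating thread and one for the thread that ran the finalizer. The same mechanism is what lets a client created on one thread be finalized on another.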

Expected behavior No spurious warnings like these in our logs.

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 7
  • Comments: 20 (8 by maintainers)

Most upvoted comments

Just for reference, I see the same message (Exception ignored in: <function InferenceServerClient.__del__ at 0x7fdf0d0c7a60>), although with a slightly different traceback:

Exception ignored in: <function InferenceServerClient.__del__ at 0x7f7a9ff9edd0>
Traceback (most recent call last):
  File "/home/ivergara/mambaforge/envs/demo/lib/python3.10/site-packages/tritonclient/grpc/__init__.py", line 273, in __del__
  File "/home/ivergara/mambaforge/envs/demo/lib/python3.10/site-packages/tritonclient/grpc/__init__.py", line 281, in close
  File "/home/ivergara/mambaforge/envs/demo/lib/python3.10/site-packages/grpc/_channel.py", line 1564, in close
  File "/home/ivergara/mambaforge/envs/demo/lib/python3.10/site-packages/grpc/_channel.py", line 1548, in _close
AttributeError: 'NoneType' object has no attribute 'StatusCode'

The context where I get this is during test execution via pytest of a FastAPI application where a grpcclient.InferenceServerClient is attached to the application. I haven’t seen this message when using the web application by itself, only when executing the test suite.

I solved this problem by creating CLIENT inside the function, e.g.

def predict_batchsize(inputs, model_name='building', batchsize=64,
                      inp_desc=("INPUT__0", "FP32"), otp_desc=("OUTPUT__0", "FP32")):
    CLIENT = grpc_client.InferenceServerClient(url="192.168.128.29:8001")
    …
    preds = CLIENT.infer(model_name=model_name, inputs=[inp], outputs=[otp]).as_numpy(otp_desc[0])
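A function-scoped client can also be closed deterministically instead of being left to the finalizer. The following is a sketch under stated assumptions, not the commenter's code: StubClient is a hypothetical stand-in with infer() and close() methods, the URL is a placeholder, and contextlib.closing from the standard library calls close() when the block exits.

```python
from contextlib import closing


class StubClient:
    """Hypothetical stand-in for an inference client with infer() and close()."""

    def __init__(self, url):
        self.url = url
        self.closed = False

    def infer(self, model_name, inputs):
        if self.closed:
            raise RuntimeError("client is closed")
        return [x * 2 for x in inputs]  # dummy "prediction"

    def close(self):
        self.closed = True


def predict(inputs, model_name="building", client_factory=StubClient):
    # The client lives only for the duration of the call and is closed
    # on the calling thread, even if infer() raises.
    with closing(client_factory("localhost:8001")) as client:
        return client.infer(model_name=model_name, inputs=inputs)
```

The trade-off is a new connection per call, which may matter for latency-sensitive code.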