server: [Solved] Bug: socket.timeout: timed out. Server failed to respond to requests
Hello,
I have been experimenting with Triton Inference Server and found that the server sometimes fails to respond to requests. The client keeps raising socket.timeout: timed out, even when I catch InferenceServerException and resend the request up to 10 times.
What can I do to fix this?
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/cwman/work/repos/jumper/services/triton/pc_det/dispatcher/1/model.py", line 61, in infer_mp
    processor.run_mp_triton(thid, numworkers)
  File "/home/cwman/work/repos/jumper/jumper/processor/pc_det_processor/core.py", line 558, in run_mp_triton
    err_code = self.process_single(scan_stamp)
  File "/home/cwman/work/repos/jumper/jumper/processor/pc_det_processor/core.py", line 518, in process_single
    scan_data, boxes_lidar, label_boxes, predicted_scores = self.process_single_core(scan_stamp)
  File "/home/cwman/work/repos/jumper/jumper/processor/pc_det_processor/core.py", line 410, in process_single_core
    boxes_lidar,label_boxes, predicted_scores = self._model.model_infer(scan_data)
  File "/home/cwman/work/repos/jumper/services/triton/pc_det/dispatcher/1/model.py", line 47, in model_infer
    outs = super().request([pts[None], tta])['OUTPUT']
  File "/home/cwman/work/repos/jumper/services/triton/client_base.py", line 60, in request
    response = self.infer(self.model_name,
  File "/usr/local/lib/python3.8/dist-packages/tritonclient/http/__init__.py", line 1414, in infer
    response = self._post(request_uri=request_uri,
  File "/usr/local/lib/python3.8/dist-packages/tritonclient/http/__init__.py", line 309, in _post
    response = self._client_stub.post(request_uri=request_uri,
  File "/usr/local/lib/python3.8/dist-packages/geventhttpclient/client.py", line 272, in post
    return self.request(METHOD_POST, request_uri, body=body, headers=headers)
  File "/usr/local/lib/python3.8/dist-packages/geventhttpclient/client.py", line 253, in request
    response = HTTPSocketPoolResponse(sock, self._connection_pool,
  File "/usr/local/lib/python3.8/dist-packages/geventhttpclient/response.py", line 298, in __init__
    super(HTTPSocketPoolResponse, self).__init__(sock, **kw)
  File "/usr/local/lib/python3.8/dist-packages/geventhttpclient/response.py", line 170, in __init__
    self._read_headers()
  File "/usr/local/lib/python3.8/dist-packages/geventhttpclient/response.py", line 190, in _read_headers
    data = self._sock.recv(self.block_size)
  File "/usr/local/lib/python3.8/dist-packages/gevent/_socketcommon.py", line 663, in recv
    self._wait(self._read_event)
  File "src/gevent/_hub_primitives.py", line 317, in gevent._gevent_c_hub_primitives.wait_on_socket
  File "src/gevent/_hub_primitives.py", line 322, in gevent._gevent_c_hub_primitives.wait_on_socket
  File "src/gevent/_hub_primitives.py", line 313, in gevent._gevent_c_hub_primitives._primitive_wait
  File "src/gevent/_hub_primitives.py", line 314, in gevent._gevent_c_hub_primitives._primitive_wait
  File "src/gevent/_hub_primitives.py", line 46, in gevent._gevent_c_hub_primitives.WaitOperationsGreenlet.wait
  File "src/gevent/_hub_primitives.py", line 46, in gevent._gevent_c_hub_primitives.WaitOperationsGreenlet.wait
  File "src/gevent/_hub_primitives.py", line 55, in gevent._gevent_c_hub_primitives.WaitOperationsGreenlet.wait
  File "src/gevent/_waiter.py", line 154, in gevent._gevent_c_waiter.Waiter.get
  File "src/gevent/_greenlet_primitives.py", line 61, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_greenlet_primitives.py", line 61, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_greenlet_primitives.py", line 65, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_gevent_c_greenlet_primitives.pxd", line 35, in gevent._gevent_c_greenlet_primitives._greenlet_switch
socket.timeout: timed out
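A note on the retry attempt described above: judging from the traceback, the exception that reaches the caller is a plain socket.timeout (a subclass of OSError), not an InferenceServerException, so an except clause for InferenceServerException alone would not catch it. Below is a minimal sketch of a retry wrapper that does catch it. The helper name call_with_retries and its parameters are hypothetical and are not part of tritonclient.

```python
import socket
import time


def call_with_retries(fn, retries=10, delay_s=0.0, retry_on=(socket.timeout,)):
    """Call fn() and retry when it raises one of the `retry_on` exceptions.

    Hypothetical helper for illustration; not part of tritonclient.
    `retries` is the total number of attempts, `delay_s` the pause between them.
    """
    last_exc = None
    for attempt in range(1, retries + 1):
        try:
            return fn()
        except retry_on as exc:
            last_exc = exc
            if attempt < retries:
                time.sleep(delay_s)
    # All attempts failed: surface the last timeout to the caller.
    raise last_exc
```

In the client code from the traceback, this could wrap the inference call, for example call_with_retries(lambda: client.infer(model_name, inputs)), where client, model_name, and inputs stand in for your own objects. Retrying only helps with transient stalls; if the model regularly needs longer than the client's timeout, every attempt will time out the same way.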
About this issue
- State: closed
- Created 2 years ago
- Comments: 17 (5 by maintainers)
After reviewing the previous discussion, it seemed that no solution had been provided. However, upon further investigation, I discovered that the network_timeout parameter was what I was looking for. Adjusting this parameter solved my issue.
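The original example from this comment did not survive, so the following is only a sketch of how the network_timeout parameter can be set, assuming the HTTP client from tritonclient (tritonclient.http.InferenceServerClient), which accepts network_timeout and connection_timeout in its constructor. The URL, the 300-second value, and the helper names are assumptions chosen for illustration.

```python
def triton_http_client_kwargs(url="localhost:8000",
                              network_timeout=300.0,
                              connection_timeout=60.0):
    """Build constructor arguments for tritonclient.http.InferenceServerClient.

    The URL and timeout values are illustrative assumptions; choose a
    network_timeout (in seconds) longer than your slowest expected inference.
    """
    return {
        "url": url,
        "network_timeout": network_timeout,        # max wait for a response
        "connection_timeout": connection_timeout,  # max wait to connect
    }


def make_client(**overrides):
    """Create the HTTP client with a raised network_timeout (hypothetical helper)."""
    # Imported lazily so the kwargs helper can be used without tritonclient installed.
    import tritonclient.http as httpclient

    return httpclient.InferenceServerClient(**triton_http_client_kwargs(**overrides))
```

For example, make_client(network_timeout=600.0) would give the server up to ten minutes to answer a request. To my knowledge both timeouts default to 60 seconds in recent tritonclient releases, but check the version you have installed.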
Hi @jbkyang-nvi, ok I will try and let you know as soon as possible. Thanks for the support!