cvat: Detection error occurred Error: Request failed with status code 503. "HTTPConnectionPool(host='host.docker.internal', port=32938): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fe2d8806c50>: Failed to establish a new connection: [Errno 111] Connection refused'))".

I tried to deploy an automatic annotation function with the YOLOv5 model that I trained myself. I made modifications to the function-gpu.yaml and main.py files.

Here are the changes.

In main.py, I changed

model = torch.hub.load('ultralytics/yolov5', 'yolov5s')

to

model = torch.hub.load('/home/being/Disk_2T/kyle_work/yolov5-v6.0-haice', 'custom', '/home/being/Disk_2T/kyle_work/yolov5-v6.0-haice/runs/train/exp78-20230518best/weights/best.pt', source='local')

In function-gpu.yaml, I changed the spec annotation to

spec: |
  [
    { "id": 0, "name": "40_covered" },
    { "id": 1, "name": "40_uncovered" },
    { "id": 2, "name": "20_covered" },
    { "id": 3, "name": "20_uncovered" }
  ]
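For context, here is a minimal sketch of how the modified init_context in main.py might look. The torch.hub.load call and paths are taken from the report above; the logging lines and the context.user_data wiring follow the stock CVAT YOLOv5 serverless example and are assumptions, not the author's exact file:

```python
import torch


def init_context(context):
    context.logger.info("Init context...  0%")
    # source='local' makes torch.hub import hubconf.py from the given directory,
    # so this path must exist inside the Nuclio function container, not only on the host.
    model = torch.hub.load(
        '/home/being/Disk_2T/kyle_work/yolov5-v6.0-haice',  # local YOLOv5 repo (must contain hubconf.py)
        'custom',
        '/home/being/Disk_2T/kyle_work/yolov5-v6.0-haice/runs/train/exp78-20230518best/weights/best.pt',
        source='local',
    )
    context.user_data.model = model
    context.logger.info("Init context...100%")
```

Note that the handler runs inside the function's Docker container, so it can only see paths that are baked into the image or mounted into the container at deploy time.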

Then, I ran the command: ./deploy_gpu.sh serverless/pytorch/haice/yolov5/nuclio to activate the model.

However, when I tried to use the automatic annotation in CVAT, I encountered the following error: Detection error occurred Error: Request failed with status code 503. "HTTPConnectionPool(host='host.docker.internal', port=32938): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fe2d8806c50>: Failed to establish a new connection: [Errno 111] Connection refused'))".

docker logs:

23.06.02 07:32:20.520 cessor.healthcheck.server (I) Listening {"listenAddress": ":8082"}
23.06.02 07:32:20.520 processor.http (D) Creating worker pool {"num": 1}
23.06.02 07:32:20.520 sor.http.w0.python.logger (D) Creating listener socket {"path": "/tmp/nuclio-rpc-chspm12r86jtblth86c0.sock"}
23.06.02 07:32:20.521 sor.http.w0.python.logger (W) Python 3.6 runtime is deprecated and will soon not be supported. Please migrate your code and use Python 3.7 runtime (python:3.7) or higher
23.06.02 07:32:20.521 sor.http.w0.python.logger (D) Using Python wrapper script path {"path": "/opt/nuclio/_nuclio_wrapper.py"}
23.06.02 07:32:20.521 sor.http.w0.python.logger (D) Using Python handler {"handler": "main:handler"}
23.06.02 07:32:20.521 sor.http.w0.python.logger (D) Using Python executable {"path": "/opt/conda/bin/python3"}
23.06.02 07:32:20.521 sor.http.w0.python.logger (D) Setting PYTHONPATH {"value": "PYTHONPATH=/opt/nuclio"}
23.06.02 07:32:20.521 sor.http.w0.python.logger (D) Running wrapper {"command": "/opt/conda/bin/python3 -u /opt/nuclio/_nuclio_wrapper.py --handler main:handler --socket-path /tmp/nuclio-rpc-chspm12r86jtblth86c0.sock --platform-kind local --namespace nuclio --worker-id 0 --trigger-kind http --trigger-name myHttpTrigger --decode-event-strings"}
/opt/nuclio/_nuclio_wrapper.py:395: DeprecationWarning: There is no current event loop
  loop = asyncio.get_event_loop()
23.06.02 07:32:22.428 sor.http.w0.python.logger (I) Wrapper connected {"wid": 0, "pid": 20}
23.06.02 07:32:22.429 sor.http.w0.python.logger (D) Waiting for start {"datetime": "2023-06-02 07:32:22,429", "level": "info", "message": "Replacing logger output", "with": {"handler_name": "default", "worker_id": "0"}}
23.06.02 07:32:22.429 sor.http.w0.python.logger (I) Init context... 0% {"worker_id": "0"}
23.06.02 07:32:22.430 sor.http.w0.python.logger (E) Exception raised while running init_context {"worker_id": "0"}
Traceback (most recent call last):
  File "/opt/nuclio/_nuclio_wrapper.py", line 432, in <module>
    run_wrapper()
  File "/opt/nuclio/_nuclio_wrapper.py", line 420, in run_wrapper
    loop.run_until_complete(wrapper_instance.initialize())
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/opt/nuclio/_nuclio_wrapper.py", line 147, in initialize
    await self._initialize_context()
  File "/opt/nuclio/_nuclio_wrapper.py", line 158, in _initialize_context
    init_context_result = getattr(self._entrypoint_module, 'init_context')(self._context)
  File "/opt/nuclio/main.py", line 12, in init_context
    model = torch.hub.load('/home/being/Disk_2T/kyle_work/yolov5-v6.0-haice', 'custom', '/home/being/Disk_4T/cvat/serverless/pytorch/haice/yolov5/nuclio/best.pt', source='local', force_reload=True)  # or yolov5m, yolov5l, yolov5x, custom
  File "/opt/conda/lib/python3.10/site-packages/torch/hub.py", line 558, in load
    model = _load_local(repo_or_dir, model, *args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/hub.py", line 584, in _load_local
    hub_module = _import_module(MODULE_HUBCONF, hubconf_path)
  File "/opt/conda/lib/python3.10/site-packages/torch/hub.py", line 98, in _import_module
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 879, in exec_module
  File "<frozen importlib._bootstrap_external>", line 1016, in get_code
  File "<frozen importlib._bootstrap_external>", line 1073, in get_data
FileNotFoundError: [Errno 2] No such file or directory: '/home/being/Disk_2T/kyle_work/yolov5-v6.0-haice/hubconf.py'
23.06.02 07:32:22.972 sor.http.w0.python.logger (W) Failed to read from connection {"err": "EOF"}
23.06.02 07:32:22.972 sor.http.w0.python.logger (E) Unexpected termination of child process {"error": null, "status": "exit status 1"}
panic: Wrapper process for worker 0 exited unexpectedly with: exit status 1

goroutine 59 [running]:
github.com/nuclio/nuclio/pkg/processor/runtime/rpc.(*AbstractRuntime).watchWrapperProcess(0xc000378cc0)
        /nuclio/pkg/processor/runtime/rpc/abstract.go:471 +0x445
created by github.com/nuclio/nuclio/pkg/processor/runtime/rpc.(*AbstractRuntime).startWrapper
        /nuclio/pkg/processor/runtime/rpc/abstract.go:244 +0x1c5

Could anyone help me with this?

About this issue

  • State: closed
  • Created a year ago
  • Comments: 22

Most upvoted comments

Which guide are you following? I guess you missed some steps needed for a successful deployment. I followed this tutorial together with the official guide and was able to deploy the models successfully.

https://youtu.be/-rDlAJWiJlE?si=W9F-SQgkllqt6Z_z

Do you have this video somewhere? Could you send me a copy?

Which guide are you following? I guess you missed some steps for a successful deployment.

https://opencv.github.io/cvat/docs/administration/advanced/installation_automatic_annotation/ I deployed by following this document. I didn't use the .sh script to deploy the function; instead I used nuctl deploy --project-name cvat --path serverless/pytorch/ultralytics/yolov5/nuclio/ --volume `pwd`/serverless/common:/opt/nuclio/common --platform local. Once I removed the function for my own model, everything worked normally, so the problem is with the function deployed for my own model, but I don't know where the problem is.
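For what it's worth, the FileNotFoundError for hubconf.py in the log above suggests that /home/being/Disk_2T/kyle_work/yolov5-v6.0-haice exists only on the host and is not visible inside the function container. One possible way to make it visible (an untested sketch reusing the --volume flag from the command above; paths would need to be adapted to the actual setup) is to mount the local YOLOv5 repo into the container at deploy time:

```sh
# Hypothetical deploy command: mount the local YOLOv5 repo (with hubconf.py and the
# trained weights) into the function container at the same path that main.py
# passes to torch.hub.load(..., source='local').
nuctl deploy --project-name cvat \
  --path serverless/pytorch/haice/yolov5/nuclio \
  --volume /home/being/Disk_2T/kyle_work/yolov5-v6.0-haice:/home/being/Disk_2T/kyle_work/yolov5-v6.0-haice \
  --platform local
```

Baking the repo and weights into the image through the function's build section would be another option; the mount just keeps the deploy command close to what the CVAT guide already uses.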

I am having a similar problem with the Segment Anything interactor model. The port number in your error message is 32938. If you run docker ps and look at the ports column for your nuclio function, is that the port that appears there?

For me, the two ports do not match. For some reason CVAT is looking for the function at port 32769 while the SAM container is actually at 32768. If I run docker restart nuclio-nuclio-pth-facebookresearch-sam-vit-h, then when it restarts the port number increments by 1, and then I can use the model in CVAT without that error. But this doesn't make sense to me, as I haven't hard-coded any port number anywhere, as far as I am aware. So I don't know why it isn't looking for the function in the right place.
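As an aside, a quick way to compare the two numbers (a hypothetical command, not taken from this thread) is to list only the nuclio containers together with the ports Docker actually published for them:

```sh
# Show each nuclio function container and its published host port;
# compare this against the port CVAT reports in the 503 error message.
docker ps --filter name=nuclio --format "table {{.Names}}\t{{.Ports}}"
```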

If I grep for the port number in the git repo, the only file that has 32768 is

cvat$ find . -name "*.*" -type f -exec grep -l 32768 {} \+
./cvat/apps/opencv/static/opencv/js/opencv_4.5.4_dev.js

Not sure if this is any help, but I do notice that when it finally does work, a message briefly pops up saying something about OpenCV initialization being successful. There is another recent issue regarding OpenCV initialization, so maybe something fishy is going on there; I don't really know.

I checked the port numbers; they match. I think my issue is probably caused by the model itself; maybe the docker image could not load it correctly. I tried the yolov5 demo provided by CVAT, and it works.