cvat: Gateway Timeout (504) error when running with SAM
My actions before raising this issue
- Read/searched the docs
- Searched past issues
Steps to Reproduce (for bugs)
- Downloaded cvat, am on commit ad534b2ac32f57.
- Installed NVIDIA container toolkit.
- Followed Serverless Setup steps.
- Installed nuctl by following guide here. Verified that nuclio is version 1.8.14.
- Ran command to launch SAM nuctl function as described here: cd serverless && ./deploy_gpu.sh pytorch/facebookresearch/sam/nuclio/
- Checked that the nuclio function is running properly: nuctl get function returns that the SAM function is in STATE ready.
- Launched CVAT in serverless mode using docker-compose -f docker-compose.yml -f components/serverless/docker-compose.serverless.yml up -d.
- Opened a CVAT task, selected “Segment Anything” from AI tools, and clicked on the image. I get “Waiting a response from Segment Anything.” After a while I get a 504 timeout error: Failed to load resource: the server responded with a status of 504 (Gateway Timeout). Clicking on the link in the browser console shows me the REST API call (image below).

Current Behaviour
It seems that CVAT instance is unable to communicate with the nuclio SAM function. I have verified that SAM function is running in nuclio dashboard.
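One way to narrow this down is to check whether the port published by the function container accepts TCP connections at all. The sketch below is not part of CVAT or nuclio; `port_open` is a hypothetical helper name, and the port in the usage comment is a placeholder that should be replaced with the port shown for the function by nuctl get function or docker ps.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds within 3 seconds,
# non-zero otherwise. Uses bash's built-in /dev/tcp pseudo-device.
port_open() {
    # Inside the inner script, "$0" is the host and "$1" is the port passed below.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Hypothetical usage (the port number is a placeholder, not a known value):
# if port_open 127.0.0.1 32772; then echo "reachable"; else echo "NOT reachable"; fi
```

If the port is reachable from the host but requests from CVAT still time out, the problem is more likely between the CVAT containers and the host than in the function itself.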
Your Environment
- Git hash commit: ad534b2a
- Docker version: 23.0.4
- Are you using Docker Swarm or Kubernetes? Regular docker
- Operating System and version (e.g. Linux, Windows, MacOS): Linux 22.04
About this issue
- State: closed
- Created a year ago
- Reactions: 4
- Comments: 20 (5 by maintainers)
Commits related to this issue
- Running SAM backbone on frontend (#6019), committed to cvat-ai/cvat by bsekachev a year ago
- Running SAM backbone on frontend (#6019), committed to retailnext/cvat by bsekachev a year ago
- Setup nuclio timeout (#6840), committed to cvat-ai/cvat by PMazarovich 10 months ago. See these several issues. They are connected to each other in some way. The thing is that nuclio has a default timeout of 1 minute. With this change we can force nucli...
- Setup nuclio timeout (#6840), committed to retailnext/cvat by PMazarovich 10 months ago
@whom-da dawg, thanks for this comment. I finally got things working after struggling for weeks! Because I have openvpn and ssh servers installed on my machine, I enabled the firewall, which is disabled in Ubuntu by default. And so I had to issue
sudo ufw allow 32772/tcp
A little more info is at the issue that I posted, #6087.
I’m having the exact same issue on cvat 2.4.3, both with SAM and YOLOv5.
Logs seem to indicate that the cvat server is not able to communicate with the serverless container (example for YOLOv5):
Logs of the YOLO container (SAM is similar to the previous comment):
Same issue with same logs… Not working with GPU or CPU.
For those still having “ERROR django.request: Service Unavailable” issues, I found yaochenglouis’s answer at #2641 worked for me. Was just a firewall issue - do “ufw allow” on the port the Nuclio function is listening on.
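The firewall fix described in these comments can be sketched as two small helpers: one that pulls the host-side port out of docker ps output, and one that prints the matching ufw command. Both function names are hypothetical, and the usage line assumes the function containers have "nuclio" in their names, which may also match the nuclio dashboard. The commands are only printed, so they can be reviewed before running.

```shell
# Print the unique host-side ports found in `docker ps`-style port mappings
# read from stdin, e.g. "0.0.0.0:32772->8080/tcp" yields 32772.
host_ports() {
    grep -oE ':[0-9]+->' | tr -d ':>-' | sort -u
}

# Print (do not run) the ufw command for a port, after a basic sanity check.
ufw_allow_cmd() {
    case "$1" in
        ''|*[!0-9]*) echo "not a port number: $1" >&2; return 1 ;;
    esac
    if [ "$1" -lt 1 ] || [ "$1" -gt 65535 ]; then
        echo "port out of range: $1" >&2
        return 1
    fi
    echo "sudo ufw allow $1/tcp"
}

# Hypothetical usage (assumes container names contain "nuclio"):
# docker ps --format '{{.Names}} {{.Ports}}' | grep nuclio | host_ports |
#     while read -r p; do ufw_allow_cmd "$p"; done
```

Printing instead of executing is a deliberate choice here: opening a port is a security decision, and the name filter is only a guess at which containers matter.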
I’m also having the same issue. None of the “AI Tools”, neither the SAM interactor nor the YOLO detectors, work on GPU or CPU on my desktop. However, they work fine on my laptop, where I used identical installation steps. The laptop has lesser hardware in every respect than my desktop.
My desktop has openvpn and ssh servers installed. They aren’t running in docker. I don’t think this should affect anything, but I can’t think of why else my computer would differ significantly from any other.
The operation simply times out, with the error message shown in the following image
Here are logs from the nuclio-nuclio-pth.facebookresearch.sam.vit_h container, on both my laptop and desktop. This one is the only one that looked significantly different between the desktop and laptop, at least to my eye. I can see two calls to the call handler on my laptop which correspond to the two times I used SAM. On my desktop, there are no lines corresponding to the call handler. So it seems the container built from nuclio-nuclio-pth.facebookresearch.sam.vit_h is not communicating with the others properly?
Desktop:
Laptop:
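The comparison described above (handler calls present on one machine, absent on the other) can be made less eyeball-dependent by counting matching log lines. This is a minimal sketch: `count_handler_calls` is a hypothetical helper, the search phrase "call handler" is taken from the comment rather than from any documented nuclio log format, and the container name in the usage comment is the one quoted above, so check docker ps for the exact name on your machine.

```shell
# Count log lines on stdin that mention the handler being called.
# The phrase "call handler" is an assumption taken from the comment above;
# adjust it if your logs word this differently.
count_handler_calls() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the status clean.
    grep -ci 'call handler' || true
}

# Hypothetical usage:
# docker logs nuclio-nuclio-pth.facebookresearch.sam.vit_h 2>&1 | count_handler_calls
```

A count that stays at 0 after clicking in the UI suggests the request never reaches the function container.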
Supplementary Info
To install, I followed these steps.
- git clone.
- Install nuctl as described here, 2nd bullet. I created a project for cvat as described there, but I didn’t deploy any of the models.
- cd serverless && ./deploy_cpu.sh pytorch/facebookresearch/sam/nuclio/
- Open localhost:8080 in Chrome, create a project and task, upload any image you’d like (I used this).
- Watching htop while it is running, I see nothing happening after I click.
- docker version (e.g. Docker 17.0.05): 23.0.5
I’ve got exactly the same issue. The weird thing is that I got it to work on CPU on one computer and then followed the exact same steps on another, and there it doesn’t work.
In the docker logs I’ve found this:
I found no further errors in the nuclio container and the sam container.
I’ve tested with another serverless function and there I got the same issue. Currently I run with the env CVAT_HOST, but I got the same behaviour without.
So I suspect that there might be some communication issues between the cvat containers, but I don’t really see a clear way to debug this.
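One rough way to debug this is to send a plain HTTP request to the function's published port and look at how curl fails. The helper below is a hypothetical sketch that maps a few well-known curl exit statuses to hints; the hints are rules of thumb, not a diagnosis, and the URL in the usage comment is a placeholder.

```shell
# Translate a curl exit status into a rough hint about what went wrong.
explain_curl_exit() {
    case "$1" in
        0)  echo "connected: an HTTP response came back" ;;
        6)  echo "could not resolve host: check the hostname" ;;
        7)  echo "failed to connect: nothing listening or connection refused" ;;
        28) echo "timed out: packets may be dropped, for example by a firewall" ;;
        *)  echo "curl exit $1: see the EXIT CODES section of the curl man page" ;;
    esac
}

# Hypothetical usage (replace the port with the one your function publishes):
# curl -s -o /dev/null --max-time 5 http://localhost:32772/; explain_curl_exit $?
```

Running the same request from the host and from inside a container on the CVAT network, then comparing the two results, can show whether the block sits between the containers and the host.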