cvat: Could not get models from the server
Actions before raising this issue
- I searched the existing issues and did not find anything similar.
- I read/searched the docs
Steps to Reproduce
Hello, I'm trying to deploy YOLOv5 with Nuclio to use it with CVAT for automatic annotation. I followed the same approach as the YOLOv7 ONNX function that already exists in the CVAT documentation. I managed to deploy the YOLOv5 ONNX model successfully with the original weight file (yolov5l.onnx). However, when I deploy it with my custom weight file, CVAT/Nuclio cannot find the weight file:
Could not get models from the server
I have checked inside the container and my weight file is there, and I'm sure there is nothing wrong with my ONNX file, since I tested it outside of Docker and it works fine. I have also checked the Docker logs, and there is no error.
This is my function.yaml
metadata:
  name: pth-ultralytics-yolov5-anfp
  namespace: cvat
  annotations:
    name: YOLO v5
    type: detector
    framework: onnx
    spec: |
      [
        { "id": 0, "name": "Adidas", "type": "rectangle" },
      ]

spec:
  description: YOLO v5
  runtime: 'python:3.8'
  handler: main:handler
  eventTimeout: 30s

  build:
    image: cvat.onnx.ultralytics.yolov5-anfp
    baseImage: ubuntu:20.04

    directives:
      preCopy:
        - kind: USER
          value: root
        - kind: RUN
          value: apt update && apt install --no-install-recommends -y wget python3-pip
        - kind: RUN
          value: apt update && apt install --no-install-recommends -y libglib2.0-0
        - kind: RUN
          value: apt-get update && apt-get install ffmpeg libsm6 libxext6 -y
        - kind: RUN
          value: pip install onnxruntime opencv-python-headless pillow pyyaml torch torchvision numpy onnx onnx-simplifier tqdm scipy gitpython matplotlib
        - kind: WORKDIR
          value: /opt/nuclio
        - kind: RUN
          value: wget xx.onnx
        - kind: RUN
          value: ln -s /usr/bin/python3 /usr/bin/python

  triggers:
    myHttpTrigger:
      maxWorkers: 2
      kind: 'http'
      workerAvailabilityTimeoutMilliseconds: 10000
      attributes:
        maxRequestBodySize: 33554432 # 32MB

  platform:
    attributes:
      restartPolicy:
        name: always
        maximumRetryCount: 3
      mountMode: volume
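A note on the weight path (just a sketch of an alternative, the "yolov5l.onnx" name is a placeholder): since model_handler.py below loads the weights by a bare file name, the lookup depends on the worker's current working directory. Building an absolute path next to the handler avoids that dependency:

import os

# Resolve the weight file that wget placed in /opt/nuclio (the WORKDIR above)
# relative to this module, regardless of the process working directory.
WEIGHTS_PATH = os.path.join(os.path.dirname(os.path.abspath(__file__)), "yolov5l.onnx")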
Here is my main file:
import base64
import io
import json

import yaml
from model_handler import ModelHandler
from PIL import Image


def init_context(context):
    context.logger.info("Init context... 0%")

    # Read the DL model
    model = ModelHandler()
    context.user_data.model = model

    context.logger.info("Init context...100%")


def handler(context, event):
    context.logger.info("Run YoloV5 ONNX model")
    data = event.body
    buf = io.BytesIO(base64.b64decode(data["image"]))
    image = Image.open(buf)

    results = context.user_data.model.infer(image, 0.5)

    return context.Response(body=json.dumps(results), headers={},
                            content_type='application/json', status_code=200)
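To rule out the function code itself, here is a rough local test (a sketch; it assumes the weight file and a test.jpg image are in the current directory) that feeds the handler logic the same base64 payload CVAT sends:

import base64
import io

from PIL import Image

from model_handler import ModelHandler

# Simulate the body CVAT posts to the function: a base64-encoded image.
with open("test.jpg", "rb") as f:  # any local test image
    payload = {"image": base64.b64encode(f.read()).decode()}

image = Image.open(io.BytesIO(base64.b64decode(payload["image"])))
print(ModelHandler().infer(image, 0.5))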
Here is part of my model handler:
import cv2
import numpy as np
import onnxruntime as ort
import torch
import torchvision


class ModelHandler:
    def __init__(self):
        self.is_initiated = None
        self.names = None
        self.stride = None
        self.meta = None
        self.output_names = None
        self.session = None
        self.load_network(model="xxyolov5l.onnx")

    def load_network(self, model):
        device = ort.get_device()
        cuda = True if device == 'GPU' else False
        try:
            providers = ['CUDAExecutionProvider', 'CPUExecutionProvider'] if cuda else ['CPUExecutionProvider']
            so = ort.SessionOptions()
            so.log_severity_level = 3
            self.session = ort.InferenceSession(model, providers=providers, sess_options=so)
            self.output_names = [x.name for x in self.session.get_outputs()]
            self.meta = self.session.get_modelmeta().custom_metadata_map
            # self.input_details = [i.name for i in self.session.get_inputs()]
            if "stride" in self.meta:
                self.stride, self.names = int(self.meta["stride"]), eval(self.meta["names"])
            self.is_initiated = True
        except Exception as e:
            raise Exception(f"HIYOOOO Cannot load model {model}: {e}")
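    # Note: load_network() relies on the "stride" and "names" entries in the ONNX
    # custom metadata (self.meta). Recent yolov5 exporters normally embed them, but
    # if a custom export lacks them, self.names stays None and _infer() later fails
    # at self.names[c]. They can be checked with, for example:
    #   print(onnx.load("<weights>.onnx").metadata_props)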
    def letterbox(self, im, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleup=True, stride=32):
        shape = im.shape[:2]  # current shape [height, width]
        if isinstance(new_shape, int):
            new_shape = (new_shape, new_shape)

        r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
        if not scaleup:  # only scale down, do not scale up (for better val mAP)
            r = min(r, 1.0)

        new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
        dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
        if auto:  # minimum rectangle
            dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding

        dw /= 2  # divide padding into 2 sides
        dh /= 2

        if shape[::-1] != new_unpad:  # resize
            im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
        top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
        left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
        im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
        return im, r, (dw, dh)
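    # Worked example of the letterbox arithmetic with auto=False (as used in _infer):
    # a 1280x720 frame -> 640x640 gives r = min(640/720, 640/1280) = 0.5,
    # new_unpad = (640, 360), dw = 0, dh = 280, so 0 px of padding left/right and
    # 140 px of gray (114, 114, 114) padding top and bottom.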
    def xywh2xyxy(self, x):
        # Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
        y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
        y[:, 0] = x[:, 0] - x[:, 2] / 2  # top left x
        y[:, 1] = x[:, 1] - x[:, 3] / 2  # top left y
        y[:, 2] = x[:, 0] + x[:, 2] / 2  # bottom right x
        y[:, 3] = x[:, 1] + x[:, 3] / 2  # bottom right y
        return y

    def box_area(self, box):
        # box = xyxy(4,n)
        return (box[2] - box[0]) * (box[3] - box[1])

    def box_iou(self, box1, box2, eps=1e-7):
        # inter(N,M) = (rb(N,M,2) - lt(N,M,2)).clamp(0).prod(2)
        (a1, a2), (b1, b2) = box1[:, None].chunk(2, 2), box2.chunk(2, 1)
        inter = (torch.min(a2, b2) - torch.max(a1, b1)).clamp(0).prod(2)

        # IoU = inter / (area1 + area2 - inter)
        return inter / (self.box_area(box1.T)[:, None] + self.box_area(box2.T) - inter + eps)
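    # Quick numeric check: boxes [0, 0, 10, 10] and [5, 5, 15, 15] overlap in a 5x5
    # region, so IoU = 25 / (100 + 100 - 25) ≈ 0.143; torchvision.ops.box_iou returns
    # the same value and can be used to cross-check this helper.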
    def scale_coords(self, img1_shape, coords, img0_shape, ratio_pad=None):
        # Rescale coords (xyxy) from img1_shape to img0_shape
        if ratio_pad is None:  # calculate from img0_shape
            gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1])  # gain = old / new
            pad = (img1_shape[1] - img0_shape[1] * gain) / 2, (img1_shape[0] - img0_shape[0] * gain) / 2  # wh padding
        else:
            gain = ratio_pad[0][0]
            pad = ratio_pad[1]

        coords[:, [0, 2]] -= pad[0]  # x padding
        coords[:, [1, 3]] -= pad[1]  # y padding
        coords[:, :4] /= gain
        self.clip_coords(coords, img0_shape)
        return coords

    def clip_coords(self, boxes, shape):
        # Clip xyxy bounding boxes to image shape (height, width)
        if isinstance(boxes, torch.Tensor):  # faster individually
            boxes[:, 0].clamp_(0, shape[1])  # x1
            boxes[:, 1].clamp_(0, shape[0])  # y1
            boxes[:, 2].clamp_(0, shape[1])  # x2
            boxes[:, 3].clamp_(0, shape[0])  # y2
        else:  # np.array (faster grouped)
            boxes[:, [0, 2]] = boxes[:, [0, 2]].clip(0, shape[1])  # x1, x2
            boxes[:, [1, 3]] = boxes[:, [1, 3]].clip(0, shape[0])  # y1, y2
    def non_max_suppression(self, prediction,
                            conf_thres=0.25,
                            iou_thres=0.45,
                            agnostic=False,
                            max_det=300):
        bs = prediction.shape[0]  # batch size
        xc = prediction[..., 4] > conf_thres  # candidates

        # Settings
        # min_wh = 2  # (pixels) minimum box width and height
        max_wh = 7680  # (pixels) maximum box width and height
        max_nms = 30000  # maximum number of boxes into torchvision.ops.nms()
        redundant = True  # require redundant detections
        merge = False  # use merge-NMS

        output = [torch.zeros((0, 6), device=prediction.device)] * bs
        for xi, x in enumerate(prediction):  # image index, image inference
            # Apply constraints
            # x[((x[..., 2:4] < min_wh) | (x[..., 2:4] > max_wh)).any(1), 4] = 0  # width-height
            x = x[xc[xi]]  # confidence

            # If none remain process next image
            if not x.shape[0]:
                continue

            # Compute conf
            x[:, 5:] *= x[:, 4:5]  # conf = obj_conf * cls_conf

            # Box (center x, center y, width, height) to (x1, y1, x2, y2)
            box = self.xywh2xyxy(x[:, :4])

            # Detections matrix nx6 (xyxy, conf, cls)
            conf, j = x[:, 5:].max(1, keepdim=True)
            x = torch.cat((box, conf, j.float()), 1)[conf.view(-1) > conf_thres]

            # Apply finite constraint
            # if not torch.isfinite(x).all():
            #     x = x[torch.isfinite(x).all(1)]

            # Check shape
            n = x.shape[0]  # number of boxes
            if not n:  # no boxes
                continue
            elif n > max_nms:  # excess boxes
                x = x[x[:, 4].argsort(descending=True)[:max_nms]]  # sort by confidence

            # Batched NMS
            c = x[:, 5:6] * (0 if agnostic else max_wh)  # classes
            boxes, scores = x[:, :4] + c, x[:, 4]  # boxes (offset by class), scores
            i = torchvision.ops.nms(boxes, scores, iou_thres)  # NMS
            if i.shape[0] > max_det:  # limit detections
                i = i[:max_det]
            if merge and (1 < n < 3E3):  # Merge NMS (boxes merged using weighted mean)
                # update boxes as boxes(i,4) = weights(i,n) * boxes(n,4)
                iou = self.box_iou(boxes[i], boxes) > iou_thres  # iou matrix
                weights = iou * scores[None]  # box weights
                x[i, :4] = torch.mm(weights, x[:, :4]).float() / weights.sum(1, keepdim=True)  # merged boxes
                if redundant:
                    i = i[iou.sum(1) > 1]  # require redundancy

            output[xi] = x[i]

        return output
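    # Shape note: for a 640x640 input, the raw YOLOv5 ONNX output fed into
    # non_max_suppression() is (1, 25200, 5 + nc), where 25200 = (80*80 + 40*40 + 20*20) * 3
    # anchors, so with the single "Adidas" class it is (1, 25200, 6). The result is a
    # list with one (n, 6) tensor of [x1, y1, x2, y2, conf, cls] rows per image.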
    def _infer(self, inputs: np.ndarray):
        try:
            img = cv2.cvtColor(inputs, cv2.COLOR_BGR2RGB)
            image = img.copy()
            image, ratio, dwdh = self.letterbox(image, auto=False)
            image = image.transpose((2, 0, 1))
            image = np.expand_dims(image, 0)
            image = np.ascontiguousarray(image)
            im = image.astype(np.float16)
            im /= 255
            # im = im.astype(np.float16)

            # ONNX inference
            output = list()
            bboxs, confs, clss = [], [], []
            detections = self.session.run([self.session.get_outputs()[0].name], {self.session.get_inputs()[0].name: im})[0]
            detections = torch.from_numpy(detections).to(torch.device('cpu'))
            pred = self.non_max_suppression(detections, conf_thres=0.25, iou_thres=0.45, agnostic=False, max_det=1000)
            print('pred: ', pred)
            for i, det in enumerate(pred):
                det[:, :4] = self.scale_coords(im.shape[2:], det[:, :4], img.shape).round()
                for *xyxy, conf, cls in reversed(det):
                    c = int(cls)
                    # labels = self.names[c]
                    x1, y1, x2, y2 = [int(x.cpu().numpy()) for x in xyxy]
                    bboxs.append([x1, y1, x2, y2])
                    clss.append(self.names[c])
                    confs.append(conf.cpu().numpy())

            output = [bboxs, clss, confs]
            return output
        except Exception as e:
            print(e)
    def infer(self, image, threshold=0.5):
        # 'threshold' is accepted to match the handler's infer(image, 0.5) call;
        # _infer() currently applies its own conf_thres internally.
        image = np.array(image)
        image = image[:, :, ::-1].copy()
        h, w, _ = image.shape
        detections = self._infer(image)

        results = []
        if detections:
            boxes = detections[0]
            labels = detections[1]
            scores = detections[2]
            for label, score, box in zip(labels, scores, boxes):
                results.append({
                    "confidence": str(score),
                    "label": label,
                    "points": box,
                    "type": "rectangle",
                })
        return results
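For reference, this is the shape of a single entry in the results list that handler() serializes back to CVAT; the values below are made up, just to illustrate the format CVAT's auto-annotation expects from a detector:

example_result = {
    "confidence": "0.87",            # stringified score, as in infer() above
    "label": "Adidas",               # class name from the spec in function.yaml
    "points": [134, 220, 398, 512],  # x1, y1, x2, y2 in original image coordinates
    "type": "rectangle",
}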
Expected Behavior
I expect the code to work with my custom weight model the same as with the original model, since I only changed the class labels and the weight file name; the rest of the code is the same. I cannot figure out why Nuclio cannot find my weight file, which is in the same location as the original weight file. Any help in understanding why my code works perfectly with the original weight file but not with the custom weight file would be appreciated.
Thanks
Possible Solution
No response
Context
No response
Environment
No response
About this issue
- Original URL
- State: closed
- Created 4 months ago
- Comments: 15 (2 by maintainers)
You said that:
So, if you are 100% sure about this, it only means that the file is missing at the specified path. Maybe it is in another directory or has another name, but it was not found. I do not expect any magic here. Look at the many other models in the serverless directory; the same logic applies to all of them.
Hi @Auth0rM0rgan, does the issue persist?
I got it solved in my case. Maybe something went wrong when deploying a new function through nuctl. To solve it, remove the directory that was created at /etc/nuclio/store/functions/nuclio in the nuclio-local-storage-reader container.
@PrashantDixit0 please read the issue first! I can already deploy YOLOv5 with the ONNX model successfully; the issue is that CVAT cannot find my own custom pretrained model and only works with the original ONNX model.
My Issue Resolved 👍
In my local deployment, I can't see the Models option in the navbar.