PINTO_model_zoo: Movenet: error on loading model with Openvino

1. OS Ubuntu 18.04

2. OS Architecture x86_64

3. Version of OpenVINO 2021.3.394

9. Movenet from your model zoo

Ha ha it’s me again 😉 I saw you have already converted Movenet ! Naturally I wanted to give it a try. I get this error message when loading the ‘lightning’ (or ‘thunder’) model:

openvino@ubuntu:/workdir$ python3 MovenetOpenvino.py -m lightning
Video FPS: 30
Loading Inference Engine
Device info:
        CPU
        MKLDNNPlugin version ......... 2.1
        Build ........... 2021.3.0-2787-60059f2c755-releases/2021/3
Pose Detection model - Reading network files:
        /workdir/models/movenet_lightning_FP32.xml
        /workdir/models/movenet_lightning_FP32.bin
Traceback (most recent call last):
  File "MovenetOpenvino.py", line 569, in <module>
    output=args.output)
  File "MovenetOpenvino.py", line 99, in __init__
    self.load_model(xml, device)
  File "MovenetOpenvino.py", line 131, in load_model
    self.pd_net = self.ie.read_network(model=xml_path, weights=bin_path)
  File "ie_api.pyx", line 293, in openvino.inference_engine.ie_api.IECore.read_network
  File "ie_api.pyx", line 315, in openvino.inference_engine.ie_api.IECore.read_network
RuntimeError: Check 'element::Type::merge(inputs_et, inputs_et, get_input_element_type(i))' failed at core/src/op/concat.cpp:62:
While validating node 'v0::Concat Concat_1866 (stack_2_StatefulPartitionedCall/stack_2_1/Unsqueeze/Output_0/Data__const[0]:i32{1,1}, stack_2_StatefulPartitionedCall/stack_2_1/Unsqueeze503[0]:i64{1,1}, stack_2_StatefulPartitionedCall/stack_2_1/Unsqueeze505[0]:i64{1,1}) -> ()' with friendly_name 'Concat_1866':
Argument element types are inconsistent.
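
The message above says that one Concat node receives both i32 and i64 inputs. A minimal sketch of how such nodes could be located by parsing the IR .xml directly follows. It assumes the usual IR v10 layout (layer elements with typed output ports, plus an edges list); the embedded XML is a hand-written stand-in with invented layer names, not the real MoveNet IR, and `find_mixed_concats` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for an IR v10 file (NOT the real MoveNet IR).
SAMPLE_IR = """<net name="demo" version="10">
  <layers>
    <layer id="0" name="const_a" type="Const" version="opset1">
      <output><port id="0" precision="I32"><dim>1</dim><dim>1</dim></port></output>
    </layer>
    <layer id="1" name="unsqueeze_b" type="Unsqueeze" version="opset1">
      <output><port id="0" precision="I64"><dim>1</dim><dim>1</dim></port></output>
    </layer>
    <layer id="2" name="Concat_demo" type="Concat" version="opset1">
      <input>
        <port id="0"><dim>1</dim><dim>1</dim></port>
        <port id="1"><dim>1</dim><dim>1</dim></port>
      </input>
      <output><port id="2" precision="I64"><dim>2</dim><dim>1</dim></port></output>
    </layer>
  </layers>
  <edges>
    <edge from-layer="0" from-port="0" to-layer="2" to-port="0"/>
    <edge from-layer="1" from-port="0" to-layer="2" to-port="1"/>
  </edges>
</net>"""


def find_mixed_concats(root):
    """Return {concat_name: sorted precisions} for Concat layers fed by mixed types."""
    layer_by_id = {}
    out_precision = {}  # (layer id, port id) -> precision declared on that output port
    for layer in root.iter("layer"):
        layer_by_id[layer.get("id")] = layer
        for port in layer.findall("./output/port"):
            out_precision[(layer.get("id"), port.get("id"))] = port.get("precision")

    feeding = {}  # concat name -> set of precisions arriving on its inputs
    for edge in root.iter("edge"):
        target = layer_by_id.get(edge.get("to-layer"))
        if target is None or target.get("type") != "Concat":
            continue
        source = (edge.get("from-layer"), edge.get("from-port"))
        precision = out_precision.get(source)
        if precision is not None:
            feeding.setdefault(target.get("name"), set()).add(precision)

    return {name: sorted(precs) for name, precs in feeding.items() if len(precs) > 1}


report = find_mixed_concats(ET.fromstring(SAMPLE_IR))
print(report)
```

For a file on disk, `ET.parse(path).getroot()` would replace `ET.fromstring(SAMPLE_IR)`.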

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 88 (37 by maintainers)

Most upvoted comments

I see. I understood everything.

Done. Screenshot 2021-05-19 00:22:17

Hi @gespona, @PINTO0309 has done the conversion of the models but there is still a small problem with the output values. PINTO is working on it. Just be patient.

Your trick is even smarter than what I initially thought. So we were lucky there was already a transpose layer with the same transpose order in the model. For thunder, I didn’t even need to change the layer ids (same as for lightning).

Ah it was not obvious 😃

Good night Katsuya !

Ah ah now I need to compare the 2 xml files to understand your modification. And I will try myself on thunder 😃

Sorry for the delay. It works ! You are a genius ! (I already know you are, it is just a confirmation)

I would choose Pattern1. From my understanding, the depthai framework is able to automatically translate [C, H, W] (= output of the ImageManip node) to [1, C, H, W] (= input of the NeuralNet node).

I forced UINT8 and now much better. Starting to get the landmarks. Will share some results later. Thanks.

@gespona If you don’t have the error anymore, you are probably close to making it work. But I had a second thought during lunch 😃 If my understanding is correct, the input type (UINT8 or FP16) that the neural net expects depends on how the blob was compiled. @PINTO0309 posted the command used in a post above:

${INTEL_OPENVINO_DIR}/deployment_tools/inference_engine/lib/intel64/myriad_compile \
-m openvino/FP16/movenet_singlepose_lightning_3.xml \
-VPU_NUMBER_OF_SHAVES 4 \
-VPU_NUMBER_OF_CMX_SLICES 4 \
-o openvino/myriad/movenet_singlepose_lightning_3.blob

Because we don’t explicitly specify the type, I guess the compile tool uses the type used in the IR (FP16). We should force UINT8 by adding -ip U8 to the command above. This will make our life easier (no need for setFp16()), and when we transmit images from the host to the OAK, we will send half as much data over USB. I cannot test it now because I have to leave for a few hours, but I will test it later on. Anyway, are you on the Luxonis Discord ? We should continue the discussion there so as not to pollute this thread.
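
To put a number on the USB saving mentioned above, here is a small sketch comparing the size of one frame as UINT8 and as FP16. The 256x256 size is an assumption taken from the resize used for thunder elsewhere in this thread; the ratio does not depend on it.

```python
import numpy as np

H = W = 256  # assumed input size (the 256x256 resize used for thunder in this thread)

frame_u8 = np.zeros((H, W, 3), dtype=np.uint8)   # what -ip U8 would let the host send
frame_fp16 = frame_u8.astype(np.float16)         # what an FP16 input requires

ratio = frame_fp16.nbytes / frame_u8.nbytes
print(frame_u8.nbytes, frame_fp16.nbytes, ratio)
```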

@PINTO0309 FYI, I have created a repo for the openvino version (WIP) : https://github.com/geaxgx/openvino_movenet Thank you.

The image is padded with black stripes on top and bottom to make it square.
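
A minimal sketch of that padding step, using NumPy only. `pad_to_square` is a hypothetical helper and the 640x360 frame is a dummy; the code actually used in the demo script may differ.

```python
import numpy as np


def pad_to_square(img):
    """Pad an HxWxC image with black stripes so it becomes square.

    Returns the padded image and the (top, left) offset of the original inside it.
    """
    h, w = img.shape[:2]
    side = max(h, w)
    top = (side - h) // 2
    left = (side - w) // 2
    out = np.zeros((side, side, img.shape[2]), dtype=img.dtype)  # black canvas
    out[top:top + h, left:left + w] = img
    return out, (top, left)


frame = np.full((360, 640, 3), 255, dtype=np.uint8)  # dummy white landscape frame
square, (top, left) = pad_to_square(frame)
print(square.shape, top, left)
```

Keypoints predicted on the padded square are relative to that square, so the `top` and `left` offsets have to be subtracted again to get coordinates in the original frame.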

This image is not a pose that a normal human being can take, so it is no wonder that it is wrong. haha.

Yes. That’s right. There are advantages and disadvantages, but it is less work. As long as the file name is the same, Google Drive will not change the URL.

Don’t worry about it. OpenVINO has always had a lot of bugs. 😄 The recommitment is finished.

Thank you very much ! And sorry again for using an old version of openvino.

😿 OK, I’ll re-commit the model with only the align_corner modified.

Lightning with OpenVINO on CPU: image

Thunder with OpenVINO on Myriad: image
The skeleton is shifted.

Thunder with OpenVINO on CPU: image

@gespona Earlier in this issue, I mentioned that I changed the input type from INT32 to FLOAT32 in order to quantize the model. Use Netron (https://netron.app/), a website that visualizes the structure of a model, to check what input the model expects. I am not very familiar with what the error message means, but the caveat is that I am converting the Float16 (FP16) model to the blob. This is only a simple guess on my part, based on the offset value in the error message being double the value. It is a Float16-precision model tagged with model_float32. I have customized the model to be optimized for Float32, Float16 and INT8, so please ignore the description on the model card. An INT32 input is unwieldy and unfriendly for many users of the model.

openvino/FP16/movenet_singlepose_lightning_3.xml

Screenshot 2021-05-19 08:51:54

openvino/FP16/movenet_singlepose_thunder_3.xml

Screenshot 2021-05-19 08:52:28

The conversion command I used is below. I specified FP16 for the conversion, but will the result be the same if I reconvert using FP32? If there is a problem, it is a problem beyond my control.

${INTEL_OPENVINO_DIR}/deployment_tools/inference_engine/lib/intel64/myriad_compile \
-m openvino/FP16/movenet_singlepose_lightning_3.xml \
-VPU_NUMBER_OF_SHAVES 4 \
-VPU_NUMBER_OF_CMX_SLICES 4 \
-o openvino/myriad/movenet_singlepose_lightning_3.blob

@geaxgx I downloaded the latest Google Drive file again this morning, which I thought I had uploaded last night, and tested it again in a separate working folder to make sure there were no mistakes.

#### lightning
$ sudo gdown --id 1Fkh3N5fhyvrkWBF-9X7FhsN6YveFFu_O
$ tar -zxvf resources.tar.gz
$ mv resources resources1

#### thunder
$ sudo gdown --id 1RDF35KcL7kWRb4dgRf0OudH6l0EtZ3qw
$ tar -zxvf resources.tar.gz
$ mv resources resources2

  • test.png: image

  • test_onnx.py

import onnx
import onnxruntime
import numpy as np
import cv2
import pprint

model_path = "model_float32.onnx"

def main():
    model = onnx.load(model_path)
    onnx.checker.check_model(model)

    sess = onnxruntime.InferenceSession(model_path)

    image = cv2.imread('test.png')
    frame = cv2.resize(image, (256, 256))
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    frame = np.expand_dims(frame, axis=0)
    frame = frame.astype('float32')

    inputs = {sess.get_inputs()[0].name: frame}
    outputs = sess.run(None, inputs)

    pprint.pprint(outputs)

if __name__ == "__main__":
    main()

Output:
[array([[[[0.32433048, 0.69931823, 0.8797674 ],
         [0.3375441 , 0.71945375, 0.7601149 ],
         [0.32876045, 0.7161203 , 0.5944811 ],
         [0.39609817, 0.7126299 , 0.82234097],
         [0.37236294, 0.70911103, 0.8054341 ],
         [0.48610994, 0.6517349 , 0.85130155],
         [0.35770214, 0.60963666, 0.8522695 ],
         [0.60724556, 0.6535827 , 0.4368128 ],
         [0.2557739 , 0.7068589 , 0.61313367],
         [0.6248696 , 0.6063015 , 0.55172634],
         [0.18469065, 0.78301877, 0.52691185],
         [0.61794114, 0.5086788 , 0.9210715 ],
         [0.58055407, 0.41669005, 0.8049598 ],
         [0.67506   , 0.7038542 , 0.91628826],
         [0.7518556 , 0.30507305, 0.834178  ],
         [0.88219225, 0.7037958 , 0.90368426],
         [0.88579935, 0.20647278, 0.9427309 ]]]], dtype=float32)]

  • test_openvino.py

from openvino.inference_engine import IECore
import numpy as np
import cv2
import pprint

XML_PATH = "openvino/FP16/movenet_singlepose_thunder_3.xml"
BIN_PATH = "openvino/FP16/movenet_singlepose_thunder_3.bin"

ie = IECore()
net = ie.read_network(model=XML_PATH, weights=BIN_PATH)
input_blob = next(iter(net.input_info))
exec_net = ie.load_network(net, device_name='CPU', num_requests=1)
inference_request = exec_net.requests[0]

img = cv2.imread('test.png')
img = cv2.resize(img, (256, 256))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = np.asarray(img)
img = img.astype(np.float32)
img = img[np.newaxis,:,:,:]

exec_net.infer(inputs={input_blob: img})
pprint.pprint(inference_request.output_blobs)
output = inference_request.output_blobs['Identity'].buffer

pprint.pprint(output)

Output:
{'7022.0': <openvino.inference_engine.ie_api.Blob object at 0x7f4f6bebce80>,
 '7026.0': <openvino.inference_engine.ie_api.Blob object at 0x7f4f737caa80>,
 'Identity': <openvino.inference_engine.ie_api.Blob object at 0x7f4f6bebce40>}
array([[[[0.32441464, 0.69939005, 0.879445  ],
         [0.33760712, 0.71948093, 0.7585466 ],
         [0.32884556, 0.7161406 , 0.59659934],
         [0.3961212 , 0.7126963 , 0.8225935 ],
         [0.37242952, 0.7091924 , 0.80385953],
         [0.48617375, 0.65181863, 0.8533465 ],
         [0.35786432, 0.6097264 , 0.85393965],
         [0.6126379 , 0.65634876, 0.38123205],
         [0.25588638, 0.7068643 , 0.6161008 ],
         [0.62492216, 0.6062361 , 0.54878634],
         [0.18464863, 0.7830283 , 0.52940834],
         [0.6179626 , 0.50863874, 0.9211578 ],
         [0.5806241 , 0.41676882, 0.80573714],
         [0.67509186, 0.7039062 , 0.91644865],
         [0.75190806, 0.30513576, 0.8353069 ],
         [0.8821908 , 0.7038179 , 0.90322673],
         [0.88581234, 0.20652623, 0.9427294 ]]]], dtype=float32)

  • test_tflite.py

import numpy as np
import time
import tensorflow.lite as tflite
import cv2
import sys

interpreter = tflite.Interpreter(model_path='model_float32.tflite', num_threads=4)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

image = cv2.cvtColor(cv2.imread('test.png'), cv2.COLOR_BGR2RGB)
image = cv2.resize(image, (256, 256))
image = np.expand_dims(image, axis=0)
image = image.astype('float32')

interpreter.set_tensor(input_details[0]['index'], image)
start_time = time.time()
interpreter.invoke()
stop_time = time.time()
print("time: ", stop_time - start_time)

scores = interpreter.get_tensor(output_details[0]['index'])

import pprint
pprint.pprint(scores)

Output:
time:  0.03680682182312012
array([[[[0.32433048, 0.69931823, 0.8797673 ],
         [0.3375441 , 0.71945375, 0.7601152 ],
         [0.32876042, 0.7161203 , 0.59448147],
         [0.3960982 , 0.7126299 , 0.8223407 ],
         [0.3723629 , 0.7091111 , 0.805434  ],
         [0.48610994, 0.6517349 , 0.8513019 ],
         [0.3577021 , 0.60963666, 0.8522699 ],
         [0.6072456 , 0.6535827 , 0.43681395],
         [0.25577393, 0.7068589 , 0.6131334 ],
         [0.62486964, 0.6063015 , 0.55172515],
         [0.18469064, 0.78301877, 0.5269122 ],
         [0.61794114, 0.50867885, 0.92107165],
         [0.58055407, 0.41669008, 0.8049605 ],
         [0.67506003, 0.7038542 , 0.91628814],
         [0.7518556 , 0.30507302, 0.8341775 ],
         [0.88219225, 0.70379585, 0.9036844 ],
         [0.8857994 , 0.20647278, 0.9427308 ]]]], dtype=float32)
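
Rather than comparing the printed arrays by eye, the gap between two backends can be measured. A short sketch follows, using the first three keypoint rows copied from the ONNX and OpenVINO outputs printed above; the tolerance is a hypothetical threshold chosen for illustration, not one agreed in this thread.

```python
import numpy as np

# First three rows copied from the ONNX and OpenVINO outputs in this comment.
onnx_out = np.array([[0.32433048, 0.69931823, 0.8797674],
                     [0.3375441, 0.71945375, 0.7601149],
                     [0.32876045, 0.7161203, 0.5944811]], dtype=np.float32)
openvino_out = np.array([[0.32441464, 0.69939005, 0.879445],
                         [0.33760712, 0.71948093, 0.7585466],
                         [0.32884556, 0.7161406, 0.59659934]], dtype=np.float32)

TOLERANCE = 5e-3  # hypothetical acceptance threshold

abs_diff = np.abs(onnx_out - openvino_out)
max_diff = float(abs_diff.max())
per_column_max = abs_diff.max(axis=0)  # largest gap in each of the three columns

print(f"max abs diff: {max_diff:.6f}")
print("per-column max:", per_column_max)
print("within tolerance:", np.allclose(onnx_out, openvino_out, atol=TOLERANCE))
```

In these three rows the first two columns agree to within about 1e-4, and the largest gap is in the third column.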

ONNX and OpenVINO IR and Myriad Inference Engine Blobs have been updated. I leave the rest of the verification to @geaxgx and @gespona.

When converting from tflite to ONNX, the values seem to shift slightly. I’m going to bed. Good night.

Ha ha enjoy the cigarette ! Thanks again. I don’t know how we would do if you did not exist 😃

Sorry if I sounded impatient … Ofc we’re not in hurry. Thanks a lot for all this amazing work 😃

OpenVINO FP16

array([[[[0.3242579 , 0.6867883 , 0.63874394],
         [0.334914  , 0.709219  , 0.55096024],
         [0.326399  , 0.7028601 , 0.4379415 ],
         [0.3907204 , 0.7018955 , 0.66515774],
         [0.3662796 , 0.70349765, 0.54623103],
         [0.48843265, 0.65106046, 0.577307  ],
         [0.3616504 , 0.61051846, 0.5368742 ],
         [0.62867224, 0.6713548 , 0.30035877],
         [0.23601574, 0.7138525 , 0.47763693],
         [0.6255478 , 0.6083473 , 0.4535727 ],
         [0.17515807, 0.7865279 , 0.8231634 ],
         [0.6029311 , 0.512741  , 0.67055583],
         [0.57002366, 0.41531637, 0.8211969 ],
         [0.6787125 , 0.69121534, 0.90212846],
         [0.75059325, 0.3013085 , 0.83856463],
         [0.8791874 , 0.70098466, 0.8479991 ],
         [0.88505656, 0.19993334, 0.74572027]]]], dtype=float32)

OpenVINO FP32

array([[[[0.32448542, 0.6869966 , 0.6420481 ],
         [0.33508152, 0.7094842 , 0.55379057],
         [0.32651508, 0.7030306 , 0.43798667],
         [0.39083382, 0.7020813 , 0.6667762 ],
         [0.36634612, 0.703608  , 0.5485983 ],
         [0.48858595, 0.651327  , 0.5774265 ],
         [0.3618676 , 0.6106034 , 0.53571504],
         [0.62881696, 0.6714853 , 0.29908296],
         [0.23611164, 0.7139294 , 0.47521782],
         [0.6256691 , 0.6084515 , 0.45314965],
         [0.17534205, 0.7867273 , 0.82263744],
         [0.60314196, 0.5127338 , 0.6705806 ],
         [0.57015604, 0.41531724, 0.8203208 ],
         [0.6790442 , 0.6913149 , 0.90121037],
         [0.7507467 , 0.30132818, 0.8377828 ],
         [0.87941265, 0.70109427, 0.8476199 ],
         [0.88524216, 0.19999859, 0.7457614 ]]]], dtype=float32)

This is the result of inference with a tflite model whose input type I just changed to Float32 before converting to ONNX.

array([[[[0.32448542, 0.68699664, 0.64204913],
         [0.33508152, 0.7094843 , 0.55379134],
         [0.32651505, 0.7030307 , 0.437988  ],
         [0.3908338 , 0.7020813 , 0.6667765 ],
         [0.36634612, 0.703608  , 0.5485987 ],
         [0.4885859 , 0.6513269 , 0.5774252 ],
         [0.36186764, 0.6106034 , 0.53571534],
         [0.62881696, 0.6714854 , 0.29908276],
         [0.23611161, 0.7139294 , 0.47521845],
         [0.62566906, 0.60845155, 0.45314837],
         [0.17534202, 0.78672737, 0.822638  ],
         [0.603142  , 0.5127338 , 0.67058015],
         [0.5701561 , 0.41531724, 0.82032025],
         [0.6790442 , 0.6913149 , 0.9012103 ],
         [0.7507467 , 0.30132815, 0.8377824 ],
         [0.87941265, 0.70109427, 0.84762025],
         [0.8852422 , 0.19999856, 0.74576056]]]], dtype=float32)

I need to do some research to see if the problem came up when I converted to ONNX. Screenshot 2021-05-18 22:09:43

I tried to reconvert lightning using a special trick.

You are the 100th issue contributor to be celebrated. 😅

I was aware of the problem you pointed out while I was converting the model while eating lunch. In fact, I have also identified a way to solve that part of the problem. However, after this problem is solved, another major problem arises that cannot be helped.

TensorFlow’s FloorDiv operation cannot be handled correctly by OpenVINO. Screenshot 2021-05-18 18:57:03

This is a known issue that only I see as a problem.
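
The thread does not say how OpenVINO mishandles FloorDiv. For reference only, here is a NumPy sketch of what floor division computes, how it differs from truncating division on negative operands, and how it can be rebuilt from a divide followed by a floor. Whether that decomposition is the workaround applied to these models is an assumption, not something stated here.

```python
import numpy as np

x = np.array([7, -7, 7, -7], dtype=np.int32)
y = np.array([2, 2, -2, -2], dtype=np.int32)

# FloorDiv semantics: the quotient is rounded toward negative infinity.
floor_div = np.floor_divide(x, y)

# Truncating division rounds toward zero, so it differs when the signs differ.
trunc_div = np.trunc(x / y).astype(np.int32)

# The same result as FloorDiv, rebuilt from a float divide followed by floor.
via_floor = np.floor(x.astype(np.float32) / y.astype(np.float32)).astype(np.int32)

print(floor_div, trunc_div, via_floor)
```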