vision: [JIT] Not supported for maskrcnn_resnet50_fpn

I am trying to accelerate the maskrcnn_resnet50_fpn pretrained model using the JIT tracing provided by PyTorch. It appears that some operations present in this model are not supported by the PyTorch JIT.

Are these models supposed to have JIT support officially? If not, would you be able to provide advice for a workaround?

To replicate, running:

import torch
import torchvision
model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model.eval()
traced_net = torch.jit.trace(model, torch.rand(1, 3, 800, 800))

produces

RuntimeError: log2_vml_cpu not implemented for 'Long'

Thank you.

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 7
  • Comments: 59 (27 by maintainers)

Most upvoted comments

@cted18 Yes, I’ll be working on adding OrderedDict support so that fcn_resnet101 can be supported. I think that, together with the op support added in https://github.com/pytorch/vision/pull/1267, it shouldn’t be too hard to support in script.

@stereomatchingkiss ONNX and JIT support for Mask R-CNN in torchvision has been merged into master, and is available if you compile from source.

@stereomatchingkiss Yes, it’s almost ready, just need to fix some unrelated ONNX issues and it will be merged this week

It’s a limitation of TorchScript that we can’t have different return types depending on self.training, so we always return both the losses and the detections, although only one of them will be populated.
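As a rough sketch of what that looks like on the caller side (assuming a torchvision version where Mask R-CNN scripting has landed; the exact return structure may differ across releases), the scripted model returns a (losses, detections) tuple even in eval mode, so the predictions are read from the second element:

import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model.eval()
scripted = torch.jit.script(model)  # scripting, rather than tracing, is the supported path
# Even in eval mode the scripted model returns (losses, detections)
losses, detections = scripted([torch.rand(3, 800, 800)])
print(detections[0]["boxes"].shape)  # one dict of boxes/labels/scores/masks per image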

@fmassa Thank you very much for your explanation. It gave me the intuition that I needed!

@cted18 I believe the solution you are looking for can be found in https://github.com/pytorch/vision/issues/1002#issuecomment-562915463 and https://github.com/pytorch/vision/pull/1407#issuecomment-563048240

If you are still facing issues, can you open a new issue with a full reproducible example of the problem?

@stereomatchingkiss

Any good news of using the models by c++ api?

We will be improving the experience of using the torchvision models with the C++ API over time. We have just enabled support for Mask R-CNN models to be torchscripted, and will continue refining the C++ export.
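For completeness, a minimal sketch of the export side (the file name is hypothetical, and availability depends on the torchvision build, so treat this as an assumption rather than the official workflow): the scripted model can be saved from Python and then loaded in C++ with torch::jit::load.

import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model.eval()
scripted = torch.jit.script(model)
scripted.save("maskrcnn_resnet50_fpn.pt")  # load in C++ via torch::jit::load("maskrcnn_resnet50_fpn.pt")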

@soumith, @fmassa I changed the code to torch.tensor(approx_scale, dtype=torch.float32) in torchvision/ops/poolers.py as @soumith suggested. That fixed the original error, but another one appeared. I think TorchScript does not support Mask R-CNN’s output format. Here is the log:

RuntimeError: Only tensors or tuples of tensors can be output from traced functions (getNestedOutputTrace at /opt/conda/conda-bld/pytorch_1556653099582/work/torch/csrc/jit/tracer.cpp:200)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7f7bb5b1adc5 in /home/lxs/anaconda3/envs/torchscript/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: torch::jit::tracer::getNestedOutputTrace(std::shared_ptr<torch::jit::tracer::TracingState> const&, c10::IValue const&) + 0x23e (0x7f7bb39d5cee in /home/lxs/anaconda3/envs/torchscript/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #2: torch::jit::tracer::exit(std::vector<c10::IValue, std::allocator<c10::IValue> > const&) + 0x2f (0x7f7bb39d5dbf in /home/lxs/anaconda3/envs/torchscript/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #3: <unknown function> + 0x447ab3 (0x7f7be4e3eab3 in /home/lxs/anaconda3/envs/torchscript/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x45a8b4 (0x7f7be4e518b4 in /home/lxs/anaconda3/envs/torchscript/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #5: <unknown function> + 0x12ce4a (0x7f7be4b23e4a in /home/lxs/anaconda3/envs/torchscript/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #20: __libc_start_main + 0xe7 (0x7f7bf41f0b97 in /lib/x86_64-linux-gnu/libc.so.6)

It seems too hard for me to work around. torchvision.models.detection is great work and makes my code a lot easier; I hope this problem can be fixed soon :)

This actually looks like a bug in scale = 2 ** torch.tensor(approx_scale).log2().round().item() in torchvision/ops/poolers.py.

If approx_scale here is an exact integer, the tensor will be a LongTensor, which is unexpected.

That should be changed to torch.tensor(approx_scale, dtype=torch.float32).
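A small sketch of why that helps (behavior observed on PyTorch builds from around that time; newer releases may promote integer tensors automatically): when approx_scale is a plain Python int, torch.tensor infers int64, and calling log2 on the resulting LongTensor raises the error reported above, while pinning the dtype to float32 avoids it.

import torch

approx_scale = 1  # an exact integer, as can happen in poolers.py
print(torch.tensor(approx_scale).dtype)  # torch.int64, i.e. a LongTensor
# scale = 2 ** torch.tensor(approx_scale).log2().round().item()
#   -> RuntimeError: log2_vml_cpu not implemented for 'Long' (on affected versions)
scale = 2 ** torch.tensor(approx_scale, dtype=torch.float32).log2().round().item()
print(scale)  # 1.0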