vision: ONNX CI workflow is broken

Since the 5th of May our CI workflow for ONNX has been broken (commit 970ba3555794d163daca0ab95240d21e3035c304). The failing tests emit the following warning and error:

WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
[...]
AssertionError: The values for attribute 'shape' do not match: torch.Size([1, 4]) != torch.Size([0, 4]).

Two models are affected: faster_rcnn and mask_rcnn. To reproduce, run:

pytest test/test_onnx.py -k "test_faster_rcnn" 
pytest test/test_onnx.py -k "test_mask_rcnn" 

I believe a recent patch to primtorch might be the offender here. cc @neginraoof @seemethere @mruberry

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 2
  • Comments: 17 (10 by maintainers)

Most upvoted comments

After some painful bisection, I finally found the real offender: pytorch/pytorch#73284. After seeing that the PR title contains the phrase ONNX and our ONNX tests are failing, I have no idea how I missed that when looking at the PRs 🤦

Can confirm @BowenBao’s assessment. You can verify it yourself by looking at the nightly branch. The cutoff for today (2022-05-13) was pytorch/pytorch@65f71c0cbeb080c13e927d37b0d23d39bac6f092. Taking that knowledge to the master branch, we can verify that pytorch/pytorch@a812c4cd96d94d51627d2af290ae87de34169ec0 landed three commits too late to be included.
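For anyone who wants to repeat that check locally, here is a rough sketch. It assumes a clone of pytorch/pytorch that already contains both commits; `fix_in_cutoff` is a hypothetical helper name, not something from the PyTorch tooling. It relies only on `git merge-base --is-ancestor` and `git rev-list --count`.

```shell
# Hypothetical helper: report whether the FIX commit is already contained in
# the CUTOFF commit, and if not, how many commits separate them.
# Assumption: run inside a clone that has both commits fetched.
fix_in_cutoff() {
    cutoff="$1"
    fix="$2"
    # Exit status 0 means the fix is an ancestor of (i.e. included in) the cutoff.
    if git merge-base --is-ancestor "$fix" "$cutoff"; then
        echo "included"
    else
        # Count commits reachable from the fix but not from the cutoff.
        echo "missing by $(git rev-list --count "$cutoff..$fix") commit(s)"
    fi
}

# Usage with the hashes quoted above:
# fix_in_cutoff 65f71c0cbeb080c13e927d37b0d23d39bac6f092 \
#               a812c4cd96d94d51627d2af290ae87de34169ec0
```

The count is only meaningful if the history between the two commits is linear; with merges in between it would report every commit on the extra branches as well.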

It will make its way into tomorrow’s nightly. I’ll retest and close this if the fix worked.

We’ll know when the fresh torch nightly drops, which is around 10:00 UTC. I’ll report back.

Fix has been merged in pytorch master. @datumbox please let us know if this fixes torchvision CI.

My bad, I linked the wrong PR 🤦 Sorry for the noise. ~It should have been pytorch/pytorch#76875~ See https://github.com/pytorch/vision/issues/5971#issuecomment-1124310367

It does seem like primTorch would be to blame because we also use the “prim” or “prims” prefix, but we don’t have a prim::Constant or any C++ code.