tvm: [Performance regression] Revamp IntSet #3272 causing GluonCV SSD performance issue

https://github.com/dmlc/tvm/pull/3272 is causing an issue similar to the one previously reported in https://github.com/dmlc/tvm/issues/3097

The operator fused_strided_slice_greater_cast_strided_slice_zeros_like_add_add_add_add_add_ad_11203150218747419416_ is much slower because the mod operation is no longer simplified:

placeholder[((((ax0.ax1.fused*4) + ax2) + -466036) % 16)]

Before this PR it was:

placeholder[((((ax0.ax1.fused*4) + ax2) + -4) % 16)]
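To make the relationship between the two index expressions concrete, here is a minimal Python sketch. It is not TVM code: the helper names and the loop ranges are assumptions chosen for illustration, and Python's `%` is used as a stand-in for floor-style modulo. It only shows that the offsets -466036 and -4 are congruent modulo 16, so under floor semantics both expressions select the same element.

```python
# Illustrative sketch only; helper names and loop ranges are hypothetical.
# Python's % on ints is a floor-style modulo when the divisor is positive.

def index_unsimplified(fused: int, ax2: int) -> int:
    # Index expression reported after the PR.
    return (fused * 4 + ax2 + -466036) % 16

def index_simplified(fused: int, ax2: int) -> int:
    # Index expression reported before the PR.
    return (fused * 4 + ax2 + -4) % 16

# The two constant offsets leave the same remainder modulo 16.
offsets_congruent = (-466036) % 16 == (-4) % 16

# Compare the two expressions over an assumed iteration range.
all_equal = all(
    index_unsimplified(f, a) == index_simplified(f, a)
    for f in range(64)   # assumed extent of ax0.ax1.fused
    for a in range(4)    # assumed extent of ax2
)
```

Under these assumptions both `offsets_congruent` and `all_equal` evaluate to `True`, which is why the large constant looks like a missed simplification rather than a change in the values read.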

@tqchen @wweic

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 19 (19 by maintainers)

Most upvoted comments

@kevinthesun please check again now that the integer simplification infra has landed

Yes, introducing floordiv/mod might improve the perf further, but it will take a few more PRs to change the division mode to benefit from that. I would encourage us to treat that as a separate issue. If you can still isolate things that can be improved, we can dig further here.
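As a rough illustration of why the division mode matters for this kind of simplification, the sketch below contrasts truncated (C-style) modulo with floor modulo in plain Python. The function names are hypothetical and this is not the TVM simplifier. It shows that reducing a constant offset modulo 16 preserves the result under floor modulo but can change it under truncated modulo when the dividend is negative.

```python
# Illustrative sketch only; truncmod/floormod are hypothetical helper names.

def truncmod(a: int, b: int) -> int:
    """C-style remainder: the result takes the sign of the dividend."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return a - q * b

def floormod(a: int, b: int) -> int:
    """Floor-style remainder: the result takes the sign of the divisor."""
    return a % b

x = 12  # an example value of ax0.ax1.fused*4 + ax2

# Floor modulo: replacing -466036 with -4 does not change the result.
floor_large = floormod(x + -466036, 16)
floor_small = floormod(x + -4, 16)

# Truncated modulo: the same replacement changes the result here,
# because x + -466036 is negative while x + -4 is positive.
trunc_large = truncmod(x + -466036, 16)
trunc_small = truncmod(x + -4, 16)
```

With `x = 12`, both floor results are 8, while the truncated results are -8 and 8. A rewrite that folds the constant is therefore unconditionally safe only in floor mode; in truncated mode it needs extra information about the sign of the dividend.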