iree: RV32 code size regression from #11576 LLVM integrate on 2022-12-16
Suspected LLVM integration: #11576
A direct local repro is nontrivial because the MLIR IR format has changed since that time; it would require an `iree-import-tflite` build from that timeframe. To make it easier, I'm attaching the already-imported file here:
person_detect.zip
```shell
cmake --build . --target iree-compile && \
tools/iree-compile \
  --iree-hal-target-backends=llvm-cpu \
  --iree-input-type=tosa \
  --iree-llvm-target-abi=ilp32 \
  --iree-llvm-target-cpu-features=+m,+a,+f,+zvl512b,+zve32x \
  --iree-llvm-target-cpu=generic-rv32 \
  --iree-llvm-target-triple=riscv32-pc-linux-elf \
  --riscv-v-fixed-length-vector-lmul-max=8 \
  --riscv-v-vector-bits-min=512 \
  benchmark_suites/TFLite/person_detect.tflite.mlir \
  -o /tmp/a.vmfb \
  --iree-llvm-keep-linker-artifacts 2>&1 \
  | grep -o '/.*\.so' | xargs size -A | grep '^\.text' | awk '{print $2}'
```
Prints:
| IREE commit (with `git submodule update`) | `.text` size (bytes) |
|---|---|
| 79b90d32d1723b0650b33bc5584dccb4828e5421 | 871912 |
| 7b4688272e40e939dc02053c7178b111a21eadd5 | 180752 |
About this issue
- State: closed
- Created a year ago
- Comments: 28 (25 by maintainers)
Commits related to this issue
- [rv32] Expand `arith.mulsi_extended` before going to LLVM This enables `tosa.apply_scale` to be vectorized, and thus fixes a code size regression. Fixes: https://github.com/iree-org/iree/issues/1223... — committed to kuhar/iree by kuhar a year ago
- [rv32] Expand `arith.mulsi_extended` before going to LLVM (#12241) This enables `tosa.apply_scale` to be vectorized, and thus fixes a code size regression. Fixes: https://github.com/iree-org/iree... — committed to iree-org/iree by kuhar a year ago
Bisect results coming soon (~ 5 bisection steps remaining)
I confirm that #12241 fixes the test case here, using the repro in this issue description (and using the current `iree-import-tflite` to generate `person_detect.tflite.mlir`) on `main`.
Thanks a lot @kuhar for the effective fix!
If possible, I would really appreciate this being included in `main`, as our project pulls IREE's release candidate for `iree-compile` instead of building from source. I am okay with hiding this behind a compile flag if we don't want the behavior to be the default.

The original IR generates two mul ops: from the first, only the high part is used; from the second, only the low part is used. The backend is able to match these patterns and generate the corresponding hi/lo muls included in zve32x. If we instead generate a single mul and extract the hi/low parts from it, the backend will try to generate a single wide mul and will end up scalarizing it.