iree: Fused transposed elementwise ops in dispatch region causing extra shared memory allocation
What happened?
On trying to pass the IR through iree-run-module, I get the following error:
C:\A\iree\runtime\src\iree\hal\drivers\vulkan\native_executable.cc:157: UNAVAILABLE; VK_ERROR_INITIALIZATION_FAILED; while invoking native function hal.executable.create; while calling import;
[ 1] native hal.executable.create:0 -
[ 0] bytecode module.__init:268 .\dispatch\module_forward_dispatch_28_vulkan_spirv_fb.mlir:2:3
This happens with --iree-vulkan-target-triple=rdna2-unknown-windows.
Steps to reproduce your issue
Download module_forward_dispatch_28_vulkan_spirv_fb.mlir.
Step 1.
.\iree-compile.exe module_forward_dispatch_28_vulkan_spirv_fb.mlir --iree-hal-target-backends=vulkan --iree-vulkan-target-triple=rdna2-unknown-windows -o test_28.vmfb
Step 2.
.\iree-run-module.exe --module=test_28.vmfb --device=vulkan --function=forward_dispatch_28_matmul_4096x512x512
What component(s) does this issue relate to?
Compiler, Runtime
Version information
No response
Additional context
No response
About this issue
- State: closed
- Created a year ago
- Comments: 50 (24 by maintainers)
Commits related to this issue
- [spirv] Support transposed elementwise ops in SIMT pipeline (#13823) This just needs to optimize vector transfer ops after vectorization and before folding memref aliases. At that time we still have... — committed to iree-org/iree by antiagainst a year ago
- [spirv] Support transposed elementwise ops in SIMT pipeline (#13823) This just needs to optimize vector transfer ops after vectorization and before folding memref aliases. At that time we still have... — committed to NatashaKnk/iree by antiagainst a year ago
- [spirv] Support transposed elementwise ops in SIMT pipeline (#13823) This just needs to optimize vector transfer ops after vectorization and before folding memref aliases. At that time we still have... — committed to plaidml/iree by antiagainst a year ago
It seems like the tile sizes chosen (which affect the shared memory usage) do not account for shared memory usage... So this is a backend issue.
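To make the comment above concrete, here is a rough sketch of how a tile-size choice could be checked against a shared memory budget. It is only a model, not IREE's actual heuristic: the `TileConfig` type, the helper names, the 128x128x32 tile, and the 64 KiB limit are all assumptions made for illustration. The real limit should be read from the device (Vulkan reports it as `maxComputeSharedMemorySize` in `VkPhysicalDeviceLimits`). The sketch also assumes that the fused transposed elementwise op stages one extra output tile in shared memory.

```python
from dataclasses import dataclass

# Assumed limit, for illustration only. Query the real value from
# VkPhysicalDeviceLimits::maxComputeSharedMemorySize on the target device.
ASSUMED_SHARED_MEMORY_LIMIT_BYTES = 64 * 1024


@dataclass(frozen=True)
class TileConfig:
    """Hypothetical workgroup tile sizes for C[M,N] = A[M,K] x B[K,N]."""
    tile_m: int
    tile_n: int
    tile_k: int
    element_bytes: int = 4  # f32 elements


def matmul_shared_bytes(cfg: TileConfig) -> int:
    """Bytes needed to promote one LHS tile and one RHS tile to shared memory."""
    lhs_elements = cfg.tile_m * cfg.tile_k
    rhs_elements = cfg.tile_k * cfg.tile_n
    return (lhs_elements + rhs_elements) * cfg.element_bytes


def fused_transpose_extra_bytes(cfg: TileConfig) -> int:
    """Extra bytes if the fused transposed op stages a full output tile.

    This staging buffer is an assumption of the sketch, not a statement
    about what the compiler actually allocates.
    """
    return cfg.tile_m * cfg.tile_n * cfg.element_bytes


def fits_in_shared_memory(
    cfg: TileConfig,
    fused_transpose: bool,
    limit_bytes: int = ASSUMED_SHARED_MEMORY_LIMIT_BYTES,
) -> bool:
    """Return True if the total shared memory request stays within the limit."""
    total = matmul_shared_bytes(cfg)
    if fused_transpose:
        total += fused_transpose_extra_bytes(cfg)
    return total <= limit_bytes


if __name__ == "__main__":
    cfg = TileConfig(tile_m=128, tile_n=128, tile_k=32)
    print("matmul only:", matmul_shared_bytes(cfg), "bytes")
    print("fits without fused transpose:", fits_in_shared_memory(cfg, False))
    print("fits with fused transpose:", fits_in_shared_memory(cfg, True))
```

Under these assumptions the matmul tiles alone fit in the budget, while adding the staged output tile pushes the request over the limit. That is the kind of mismatch that would surface only at executable creation time if tile sizes are picked without considering the extra allocation.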