iree: Transform preprocessing spec is unusably slow on real-world inputs
Simplest transform spec to illustrate the issue:
// spec.mlir
module attributes { transform.with_named_sequence } {
  transform.named_sequence @match_generic(%entry: !transform.any_op {transform.readonly}) -> !transform.any_op {
    transform.match.operation_name %entry ["linalg.generic"] : !transform.any_op
    transform.yield %entry : !transform.any_op
  }
  transform.named_sequence @__transform_main(%variant_op: !transform.any_op {transform.consumed}) {
    %generic = transform.collect_matching @match_generic in %variant_op
      : (!transform.any_op) -> !transform.any_op
    transform.print %generic {name = "Generic"} : !transform.any_op
    // Transform the matched generic.
    transform.yield
  }
  transform.named_sequence @__preprocessing_main(%variant_op: !transform.any_op {transform.readonly}) {
    transform.yield
  }
} // module
Repro:
wget https://sharkpublic.blob.core.windows.net/sharkpublic/ian/stable_diffusion_xl_base_1_0_64_1024x1024_fp16_unet_linalg_nithin.mlir
tools/iree-compile --iree-hal-target-backends=rocm --iree-rocm-target-chip=gfx942 stable_diffusion_xl_base_1_0_64_1024x1024_fp16_unet_linalg_nithin.mlir --iree-preprocessing-transform-spec-filename=spec.mlir
This input MLIR has <50 kLOC and contains 8583 linalg.generic ops. With the transform script above, I can only match ~560 linalg.generic ops in 30 seconds, which doesn’t make it a viable solution for real-world inputs.
Are there some knobs I could tweak to make it go faster? Alternatively, maybe we could do some pre-filtering at the level of C++, e.g., wrap the transform dialect interpreter in a pass pipeline (that could be specified as a CLI pipeline string).
cc: @Groverkss @harsh-nod @ftynse @matthias-springer @maerhart
About this issue
- State: closed
- Created 3 months ago
- Comments: 21 (14 by maintainers)
Commits related to this issue
- [TD][Preprocessing] Speed up `match.cast_compatible_dag_from_root` Do not attach the whole ops/regions to the diagnostics, as this causes extreme slowness on larger input files. Issue: https://github... — committed to kuhar/iree by kuhar 3 months ago
- [TD][Preprocessing] Speed up `match.cast_compatible_dag_from_root` (#16914) Do not attach the whole ops/regions to the diagnostics, as this causes extreme slowness on larger input files. This cha... — committed to iree-org/iree by kuhar 3 months ago
- [mlir][TD] Allow op printing flags as `transform.print` attrs Introduce 3 new optional attributes to the `transform.print` ops: * `assume_verified` * `use_local_scope` * `skip_regions` The primary m... — committed to kuhar/llvm-project by kuhar 3 months ago
- [mlir][TD] Allow op printing flags as `transform.print` attrs (#86846) Introduce 3 new optional attributes to the `transform.print` ops: * `assume_verified` * `use_local_scope` * `skip_regions` ... — committed to llvm/llvm-project by kuhar 3 months ago
Ah I didn’t see that you made them non-default options in your PR - then yeah, who cares #yolo 😛 (my only concern is defaults that lead to users hitting errors that we then have to triage/debug - if a user opts into something that crashes, that’s on them - like when I set -verify=false I’d expect someone to tell me to not do that if I filed a bug about a crash 😃)
gotcha - you’ll want to capture without the print then, as it’s likely skewing things too far to see what the actual cause is
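For anyone trying the same thing, here is a minimal sketch of the spec from the issue with the print slimmed down so that op printing does not dominate the measurement. It assumes an MLIR build that already includes the `transform.print` printing-flag attributes from the llvm-project commit listed above (#86846); the file name spec_noprint.mlir is only a placeholder. I have not timed this variant, so treat it as a starting point rather than a confirmed fix.

```mlir
// spec_noprint.mlir (hypothetical file name)
// Same matcher as spec.mlir; only the print op differs.
module attributes { transform.with_named_sequence } {
  transform.named_sequence @match_generic(%entry: !transform.any_op {transform.readonly}) -> !transform.any_op {
    transform.match.operation_name %entry ["linalg.generic"] : !transform.any_op
    transform.yield %entry : !transform.any_op
  }
  transform.named_sequence @__transform_main(%variant_op: !transform.any_op {transform.consumed}) {
    %generic = transform.collect_matching @match_generic in %variant_op
      : (!transform.any_op) -> !transform.any_op
    // `skip_regions` is one of the three optional attributes added by the
    // commit above; it elides region bodies when printing each matched op.
    // `assume_verified` and `use_local_scope` can be added to the same
    // attribute dictionary in the same way.
    transform.print %generic {name = "Generic", skip_regions} : !transform.any_op
    transform.yield
  }
  transform.named_sequence @__preprocessing_main(%variant_op: !transform.any_op {transform.readonly}) {
    transform.yield
  }
} // module
```

To take printing out of the picture entirely while profiling, delete the `transform.print` line and pass this file to the same `--iree-preprocessing-transform-spec-filename=` flag used in the repro.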