tt-metal: Untilize sweep test does not work for multicore (last shape)

test: tests/tt_eager/python_api_testing/sweep_tests/pytests/tt_dnn/test_untilize.py::test_untilize_test[untilize_args0-input_shapes2]

This needs to be fixed for multicore. Once it is fixed, the default untilize should be changed to use multicore.

About this issue

  • State: closed
  • Created 7 months ago
  • Comments: 15 (11 by maintainers)

Most upvoted comments

On Grayskull, this test can't run because it runs out of L1. A couple of things I have noticed with untilize:

  • Single-core and multi-core both take full width per core. Single-core version properly figures out how much of the width to buffer, while multi-core naively double buffers on the full width. That’s causing this test to run out of memory.
  • If each core gets the full width and at least one tile in height, I would expect multi-core to just work as well.
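To make the memory argument above concrete, here is a rough sketch in Python of the two buffering strategies. It is not the tt-metal implementation: the function names, the 512 KiB budget, and the one-input-plus-one-output buffer model are all assumptions chosen for illustration. It only shows why double buffering the full width can overflow a fixed per-core budget, while picking a narrower block that divides the width can fit.

```python
# Hypothetical model of per-core buffer usage for untilize.
# All names and numbers here are illustrative assumptions, not tt-metal values.

TILE_BYTES = 32 * 32 * 2  # assumed: a 32x32 tile of 2-byte elements


def buffered_bytes(block_width_tiles, tile_bytes=TILE_BYTES, num_buffers=2):
    """Bytes needed for one input and one output buffer,
    each holding `num_buffers` blocks of `block_width_tiles` tiles."""
    return 2 * num_buffers * block_width_tiles * tile_bytes


def largest_fitting_block(width_tiles, budget_bytes, tile_bytes=TILE_BYTES):
    """Largest block width (in tiles) that divides the full width evenly
    and fits in the budget when double buffered. None if nothing fits."""
    for block in range(width_tiles, 0, -1):
        if width_tiles % block == 0 and buffered_bytes(block, tile_bytes) <= budget_bytes:
            return block
    return None


budget = 512 * 1024   # hypothetical L1 budget left for buffers, in bytes
width_tiles = 256     # hypothetical tensor width, in tiles

# "Naive" strategy: double buffer the full width.
naive_bytes = buffered_bytes(width_tiles)
naive_fits = naive_bytes <= budget

# "Sized" strategy: shrink the block until it fits.
block = largest_fitting_block(width_tiles, budget)

print(f"full width needs {naive_bytes} bytes, fits: {naive_fits}")
print(f"largest fitting block width: {block} tiles")
```

Under these assumed numbers the full-width version does not fit, while a narrower block does. The real kernels would have other L1 consumers to account for, so the budget would need to come from the actual allocator rather than a constant.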

What’s needed to enable multi-core by default is:

  • defining some shapes to test before we call it working. So far, I haven’t seen a case fail on Grayskull multi-core.
  • cleaning up the multi_core and pack_untilize flags. We should just set them both to True and fix what’s not working. It seems like it’s just a matter of getting pack_untilize to work with longer widths.
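As a starting point for the "shapes to test" item above, here is a small golden-reference sketch in pure Python. It is an assumption-laden illustration, not the sweep test's actual reference: the `tilize` and `untilize` helpers and the shape list are hypothetical, and the tile layout is simplified to row-major tiles with row-major contents, ignoring any sub-tile layout the hardware uses.

```python
# Hypothetical golden reference for untilize, for checking candidate shapes.
# Simplified layout: tiles in row-major order, elements row-major inside a tile.


def tilize(rows, tile_h, tile_w):
    """Flatten a 2D list into tile order."""
    h, w = len(rows), len(rows[0])
    assert h % tile_h == 0 and w % tile_w == 0, "shape must be tile aligned"
    flat = []
    for tr in range(0, h, tile_h):
        for tc in range(0, w, tile_w):
            for r in range(tr, tr + tile_h):
                flat.extend(rows[r][tc:tc + tile_w])
    return flat


def untilize(flat, h, w, tile_h, tile_w):
    """Inverse of tilize: rebuild the row-major 2D list from tile order."""
    out = [[None] * w for _ in range(h)]
    it = iter(flat)
    for tr in range(0, h, tile_h):
        for tc in range(0, w, tile_w):
            for r in range(tr, tr + tile_h):
                for c in range(tc, tc + tile_w):
                    out[r][c] = next(it)
    return out


# Hypothetical (height, width) shapes: one tile, wide, tall, and very wide.
candidate_shapes = [(32, 32), (32, 64), (64, 32), (64, 256)]

round_trip_ok = []
for h, w in candidate_shapes:
    rows = [[r * w + c for c in range(w)] for r in range(h)]
    round_trip_ok.append(untilize(tilize(rows, 32, 32), h, w, 32, 32) == rows)

print(round_trip_ok)
```

A list like `candidate_shapes` could be extended with the widths that currently fail, so that enabling multi_core and pack_untilize by default is gated on a fixed, agreed set of cases.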