tt-metal: Unable to handle large tensors
Large tensors cannot be moved to the device. Likewise, when a large tensor is produced by an operation on the device, it cannot be moved back to the host.
```python
import torch
import ttnn
from tests.ttnn.utils_for_testing import assert_with_pcc  # helper from the tt-metal test suite


def test_large_slicing(device):
    torch_a = torch.rand((1, 1, 42, 250880), dtype=torch.bfloat16)
    torch_output = torch_a[:, :, -1, :]

    a = ttnn.from_torch(torch_a)
    a = ttnn.to_device(a, device)

    # Slicing on device and reading the result back triggers the failure
    tt_output = a[:, :, -1, :]
    tt_output = ttnn.from_device(tt_output)
    tt_output = ttnn.to_torch(tt_output)

    assert_with_pcc(torch_output, tt_output, 0.9999)
```
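For context, a back-of-the-envelope calculation shows why the assert fires. This is a sketch based only on the assert message, assuming a row-major page holds one row of the innermost dimension and that bfloat16 is 2 bytes; the actual page layout in tt-metal may differ.

```python
# Assumption: one row-major page = one row of the innermost dimension.
innermost_dim = 250880   # last dim of the tensor in the failing test
bfloat16_bytes = 2       # bfloat16 element size

page_size = innermost_dim * bfloat16_bytes
print(page_size)         # 501760 bytes, i.e. 490 KiB per page
```

A page of roughly half a megabyte plausibly exceeds the consumer command-queue buffer, which is what `padded_page_size <= consumer_cb_size` is checking.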
Moving a large tensor to the host with `ttl_tensor.cpu()` causes…
```
Exception has occurred: RuntimeError (note: full exception trace is shown but execution is paused at: _run_module_as_main)
TT_ASSERT @ tt_metal/impl/dispatch/command_queue.cpp:317: padded_page_size <= consumer_cb_size
info:
Page is too large to fit in consumer buffer
backtrace:
 --- void tt::assert::tt_assert<char [44]>(char const*, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, char const*, char const (&) [44])
 --- tt::tt_metal::EnqueueReadBufferCommand::assemble_device_command(unsigned int)
 --- tt::tt_metal::EnqueueReadBufferCommand::process()
 --- tt::tt_metal::CommandQueue::enqueue_command(tt::tt_metal::Command&, bool)
 --- tt::tt_metal::CommandQueue::enqueue_read_buffer(tt::tt_metal::Buffer&, std::vector<unsigned int, std::allocator<unsigned int> >&, bool)
 --- tt::tt_metal::EnqueueReadBuffer(tt::tt_metal::CommandQueue&, tt::tt_metal::Buffer&, std::vector<unsigned int, std::allocator<unsigned int> >&, bool)
 --- std::vector<bfloat16, std::allocator<bfloat16> > tt::tt_metal::tensor_impl::read_data_from_device<bfloat16>(tt::tt_metal::Tensor const&, unsigned int)
 --- /home/ubuntu/git/tt-metal/tt_eager/tt_lib/_C.so(+0x925f65) [0x7f1665e60f65]
 --- std::_Function_handler<tt::tt_metal::Tensor (tt::tt_metal::Tensor const&), tt::tt_metal::Tensor (*)(tt::tt_metal::Tensor const&)>::_M_invoke(std::_Any_data const&, tt::tt_metal::Tensor const&)
 --- std::function<tt::tt_metal::Tensor (tt::tt_metal::Tensor const&)>::operator()(tt::tt_metal::Tensor const&) const
 --- tt::tt_metal::tensor_impl::to_host_wrapper(tt::tt_metal::Tensor const&)
 --- tt::tt_metal::Tensor::cpu() const
```
About this issue
- State: open
- Created 7 months ago
- Comments: 17 (11 by maintainers)
@davorchap @abhullar-tt actually supports this in her completion queue PR.
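One way such support could work is to split an oversized page into pieces that each fit the consumer buffer and read them back in several passes. The sketch below is purely illustrative: `CONSUMER_CB_SIZE` and `split_into_chunks` are hypothetical names, not real tt-metal APIs, and the actual completion queue implementation may take a different approach.

```python
# Hypothetical chunking sketch; CONSUMER_CB_SIZE is an assumed capacity,
# not a value taken from tt-metal.
CONSUMER_CB_SIZE = 256 * 1024   # assumed consumer buffer capacity, bytes
page_size = 250880 * 2          # one bfloat16 row from the failing test

def split_into_chunks(total, limit):
    """Yield (offset, length) pieces, each no larger than `limit` bytes."""
    offset = 0
    while offset < total:
        length = min(limit, total - offset)
        yield offset, length
        offset += length

chunks = list(split_into_chunks(page_size, CONSUMER_CB_SIZE))
print(len(chunks), sum(length for _, length in chunks))
```

Under these assumed sizes the 490 KiB page becomes two reads, and the chunk lengths still sum to the full page, so no data is dropped.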