tt-metal: Rebase and Merge: FD pipelines hanging on N300 after python sweep tests
With all the tunneling and FD changes, we are seeing a deterministic CI hang that looks like it occurs while setting up the remote chip. It only happens on N300 machines in CI; I cannot reproduce it locally, at least on a t3000. If I comment out ./tests/scripts/run_python_sweep_tests.sh from fast-dispatch-build-and-unit-tests.yaml, the tests pass.
Example: https://github.com/tenstorrent-metal/tt-metal/actions/runs/8127882306/job/22213201896
About this issue
- State: closed
- Created 4 months ago
- Comments: 19 (13 by maintainers)
Commits related to this issue
- #5972: Temp workaround for pytest sweep -> L/R gtest hang by inserting reset between pytests and c++ tests — committed to tenstorrent/tt-metal by abhullar-tt 4 months ago
- Revert "#5972: Temp workaround for pytest sweep -> L/R gtest hang by inserting reset between pytests and c++ tests" This reverts commit 51d1bab48b0125f02317fc5e383ce5e27f970b71. — committed to tenstorrent/tt-metal by aliuTT 4 months ago
- #5972: re-enable R chip push tests — committed to tenstorrent/tt-metal by aliuTT 4 months ago
To enable the tunnelling PR (#5986), we are going to temporarily move the R chip gtests into the nightly test suite and only run L chip tests on post-commit.
Once we get the syseng drop for the tt-smi fix + 800 MHz (targeted for 03/06), we will re-test the R chips on VMs.
@TT-billteng and @tt-rkim, is N300 clean now? Can we close?