legion: Hang at startup on multinode runs
If HTR is compiled against the latest version of master, it hangs at startup when executed on multiple nodes.
I think that the top-level task does not even start its execution.
The backtraces obtained from a two-node execution are in the attached files:
bt_0.log
bt_1.log
The backtraces were produced on sapling, but the problem reproduces on every system that I have tried so far.
@elliottslaughter, can you please add this issue to #1032?
About this issue
- State: closed
- Created 4 months ago
- Comments: 23 (6 by maintainers)
cudaipc-fix fixes the issue. Thanks for working on it.