legion: non-deterministic test failures

We ran HTR with the shardrefine branch at the request of @lightsighter. Our regression tests are randomly failing their tolerance checks in debug mode. I don’t have much useful info to share at the moment, but @mariodirenzo will be doing some analysis with legion spy to get some clues.

@elliottslaughter, could you please add this to #1032 ?

About this issue

  • Original URL
  • State: closed
  • Created 10 months ago
  • Reactions: 1
  • Comments: 19 (15 by maintainers)

Most upvoted comments

I did fix at least one bug with execution fences in that second commit. I don’t see why the first commit would have caused any problems though as it was exclusively about the padded instances (usually their failure mode would be an illegal memory access). It sounds though like you’ve done some more rigorous testing though now, which at least should improve our confidence that we fixed whatever the error is.