arcade: Problems with Azure Devops Reporter

  • This issue is blocking
  • This issue is causing unreasonable pain

As I wrote up in https://github.com/dotnet/core-eng/issues/13026, we introduced a threading bug in https://github.com/dotnet/arcade/pull/7310/files that can crash the reporter. The problem may be hard to reproduce on purpose, since it depends on actually doing the reporting, but we can either inspect the code or revert just the lock part of the change and keep only the “don’t let it pass if it doesn’t finish” part.

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 15 (15 by maintainers)

Most upvoted comments

@ericstj I may take some stabs at this one tomorrow but it’s much harder than simply adding retries to AzDO calls.

Just had a great conversation with @safern, and he had a very keen insight. If we update the arcade reporter to return the real exit code, rather than a coerced one, whenever the Azure DevOps reporter fails, the fix is easy to implement and gives us the best of both worlds: when reporting fails but the tests pass, the work item can still pass, and when both the work item and reporting fail, we still fail the check. For example:

if (reporting fails)
    {work item returns the test's actual exit code, e.g. 1}
else
    {work item returns the coerced exit code (0 if the actual exit code is 1, the actual exit code otherwise)}
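The same decision can be written as a small Python function. This is only a sketch of the rule described above, not the reporter's actual code: the name `resolve_exit_code` and its parameters are hypothetical, and it assumes that 1 is the exit code that normally gets coerced to 0.

```python
def resolve_exit_code(test_exit_code: int, reporting_succeeded: bool) -> int:
    """Pick the exit code a work item should return (hypothetical helper)."""
    if not reporting_succeeded:
        # Reporting failed, so the results never reached Azure DevOps.
        # Return the real exit code so a failing test still fails the check.
        return test_exit_code
    # Reporting succeeded, so test failures are already visible there.
    # Coerce the ordinary failure code (assumed to be 1) to 0 and pass
    # any other exit code through unchanged.
    return 0 if test_exit_code == 1 else test_exit_code
```

With this rule, a passing run whose reporting fails still returns 0, while a failing run whose reporting fails returns 1 instead of being masked.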

The error stack trace in https://github.com/dotnet/core-eng/issues/13026 doesn’t have anything to do with the lock I added. The error happens because the interpreter can’t get the lock for “stdout”, and none of my changes affect locking of stdout. It is a very rare race condition in which one of the worker threads (Thread 0x000070000f354000 in that stack trace) happens to be inside a _print call at the exact moment the process is trying to exit.

Looking at what it was trying to print, one of the worker threads never got a chance to start before all the work was completed and the process exited.
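One general way to avoid a worker being mid-print when the process exits is to have the main thread wait for every worker before returning. The sketch below is not the reporter's code: `run_workers`, `_worker`, and the `_STOP` sentinel are hypothetical names, and it assumes the work is fed to the threads through a queue.

```python
import queue
import threading

_STOP = object()  # hypothetical sentinel that tells a worker to exit


def _worker(work: queue.Queue, results: list, lock: threading.Lock) -> None:
    while True:
        item = work.get()
        if item is _STOP:
            return  # leave cleanly instead of being cut off at exit
        print(f"reporting {item}")  # stand-in for the real reporting call
        with lock:
            results.append(item)


def run_workers(items, worker_count: int = 4) -> list:
    work: queue.Queue = queue.Queue()
    results: list = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=_worker, args=(work, results, lock))
        for _ in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for item in items:
        work.put(item)
    for _ in threads:
        work.put(_STOP)  # one sentinel per worker
    for thread in threads:
        thread.join()  # no worker is still printing once this returns
    return sorted(results)
```

Because every thread is joined, a worker that starts late simply finds its sentinel and returns, so nothing is writing to stdout when the interpreter shuts down.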