runtime: Intermittent: Empty Process stdout

Story

When running user-defined console-only processes using System.Diagnostics.Process simultaneously in parallel threads and redirecting their outputs, in some cases you can get empty or non-full output data (from redirected STDOUTs).

Platforms

All platforms (see quoted file in https://github.com/dotnet/runtime/issues/27128#issuecomment-583153102 and other platform-specific implementations).

Reproduction code

The error is intermittent, to reproduce it quickly, you need to run multiple instances of the reproduction program on a computer with multi-core processor (4+ threads recommended). Some “tests” would pass, but some of them would fail.

You can specify the path in process.StartInfo.FileName to any console program, that exits automatically. In the given example, I use the standard Windows console program ipconfig.exe.

using System;
using System.IO;
using System.Text;
using System.Threading;
using System.Diagnostics;

class EmptyStdOutRepro
{
    
    public static void Main()
    {
        new EmptyStdOutRepro().Run();
    }
    
    public void Run()
    {
        
        while (true)
            Exec();
        
    }
    
    private void Exec()
    {
        
        int lineCount = 0;
        string output = "";
        
        Process process = new Process();
        
        process.StartInfo.FileName = "ipconfig.exe";
        
        process.StartInfo.UseShellExecute = false;
        process.StartInfo.RedirectStandardOutput = true;
        process.StartInfo.CreateNoWindow = true;
        process.StartInfo.ErrorDialog = false;
        
        process.OutputDataReceived += new DataReceivedEventHandler((sender, e) =>
        {
            
            if (!String.IsNullOrEmpty(e.Data))
            {
                lineCount++;
                output += ("\n[" + lineCount + "]: " + e.Data);
            }
            
        });

        process.Start();

        process.BeginOutputReadLine();
        process.WaitForExit(1000); // Unpredictable behavior

        if (output != "")
            Console.WriteLine("PASSED!" + output.Length);
        else
            Console.WriteLine("FAILED!" + output.Length);

        process.WaitForExit();
        process.Close();
        
    }
    
}

Initial reproduction, needs minimization

https://github.com/thesirkadirov/simplepm-server/tree/master/SProgramRunner

About this issue

  • Original URL
  • State: open
  • Created 6 years ago
  • Comments: 33 (26 by maintainers)

Most upvoted comments

I suspect you might be seeing a known problem where there is a race condition between Process.Exited and Process.OutputDataReceived events.

See the remarks here:

https://docs.microsoft.com/en-us/dotnet/api/system.diagnostics.process.waitforexit?redirectedfrom=MSDN&view=netcore-3.1#System_Diagnostics_Process_WaitForExit_System_Int32

When standard output has been redirected to asynchronous event handlers, it is possible that output processing will not have completed when this method returns. To ensure that asynchronous event handling has been completed, call the WaitForExit() overload that takes no parameter after receiving a true from this overload.

If you are calling Process.WaitForExit(int timeout), you actually have to call it again without the timeout to ensure that stdout has been flushed e.g. change this:

return Process.WaitForExit(timeout);

to this:

bool result = Process.WaitForExit(timeout);
if (result)
{
    Process.WaitForExit();
}
return result;

This has to be a really common gotcha.

I’ve just found it in https://github.com/daveaglick/Buildalyzer/issues/125.

It looks like we inherited this from long ago. If there was a non-breaking way to improve things, great but I’m not sure there is (?)

I would wager that if we searched GitHub for all uses of WaitForExit(int timeout) combined with reading standard-out, we would find this bug in > 50% of the cases. I know that nuget.client had to fix it some time ago. https://github.com/NuGet/Home/issues/1902.

I also suspect that the frequency of observed stability problems will increase as core-counts increase and a lot of users won’t track it down to this. I don’t know what measures you use to decide on changes but to me this API is a guaranteed bug in user’s code and must be fixed.

I appreciate the difficulty changing something so fundamental so I would suggest:

  1. Mark WaitForExit(int timeout) as deprecated with a link to explanation and resolution in the warning. No need to delete it.

  2. Create a new overload WaitForExit(int timeout, ProcessStreamControl processStreamControl) where the second argument is an enum to choose either the current behaviour or the expected behaviour e.g.

public enum ProcessStreamControl
{
    WaitForStreamsToFlush,
    ExitImmediately
}

Those names suck and yes it’s ugly but it’s clear and won’t break anything.

NuGet is being asked to do this double wait trick. If this is a problem, would prefer to have runtime fix, rather than put workarounds everywhere. Guidance? @SimonCropp opened: https://github.com/NuGet/Home/issues/10189

would having a new async process api fix this problem??

again @adamsitnik @danmosemsft given the extra info. should this be moved from “question” to “bug” label, and “needs more info” label be removed?

@duncanawoods My code with additional Process.WaitForExit() works well on a small amount of tests. Need more time to verify whether it isn’t only a partial solution, though.

@danmosemsft It is not a popular problem, the solution of which will bring more trouble than help. In my opinion, the best idea is to add some notes to the project’s documentation.

I don’t think that .NET Core platform is designed for this case. Third-party library will be a better solution for working deeply with processes.

I’m not familiar with this problem but since the parameterless version is just calling the other one with an infinite timeout. I’m guessing which you choose doesn’t matter ie the text could probably be better worded as

To ensure that asynchronous event handling has been completed, call WaitForExit() after receiving a true from this overload.

OK, I’ll stay in touch.

I’ll close this meantime, feel free to reopen if you determine it is .NET Core.