nunit-console: Runtime.Remoting.RemotingException in NUnit.Engine.Runners.ProcessRunner.Dispose

Hi,

I’m trying to migrate my Windows build server to Linux, and my unit test runs now crash:

khadmin@khbuild:~/myagent/_work/2/s$ mono  ./thirdparty/tools/NUnit/nunit3-console.exe --labels=All --noheader ./buildartifacts/KH.Tests.dll /result=./buildartifacts/EventsCursorTestsResults.xml
Runtime Environment
   OS Version: Linux 4.4.0.62
  CLR Version: 4.0.30319.42000

Test Files
    ./buildartifacts/KH.Tests.dll

=> KH..Tests.when_something.test_xx
System.Runtime.Remoting.RemotingException: Tcp transport error. ---> System.Runtime.Remoting.RemotingException: Connection closed
  at System.Runtime.Remoting.Channels.Tcp.TcpMessageIO.StreamRead (System.IO.Stream networkStream, System.Byte[] buffer, System.Int32 count) [0x00014] in <d849815271d74866b18b5a29e4184d0c>:0
  at System.Runtime.Remoting.Channels.Tcp.TcpMessageIO.ReceiveMessageStatus (System.IO.Stream networkStream, System.Byte[] buffer) [0x00000] in <d849815271d74866b18b5a29e4184d0c>:0
   --- End of inner exception stack trace ---


Server stack trace:
  at System.Runtime.Remoting.Channels.Tcp.TcpMessageIO.ReceiveMessageStatus (System.IO.Stream networkStream, System.Byte[] buffer) [0x0001a] in <d849815271d74866b18b5a29e4184d0c>:0
  at System.Runtime.Remoting.Channels.Tcp.TcpClientTransportSink.ProcessMessage (System.Runtime.Remoting.Messaging.IMessage msg, System.Runtime.Remoting.Channels.ITransportHeaders requestHeaders, System.IO.Stream requestStream, System.Runtime.Remoting.Channels.ITransportHeaders& responseHeaders, System.IO.Stream& responseStream) [0x00061] in <d849815271d74866b18b5a29e4184d0c>:0
  at System.Runtime.Remoting.Channels.BinaryClientFormatterSink.SyncProcessMessage (System.Runtime.Remoting.Messaging.IMessage msg) [0x0006c] in <d849815271d74866b18b5a29e4184d0c>:0

Exception rethrown at [0]:
  at (wrapper managed-to-native) System.Object:__icall_wrapper_mono_remoting_wrapper (intptr,intptr)
  at (wrapper remoting-invoke) NUnit.Engine.Agents.RemoteTestAgent:Stop ()
  at NUnit.Engine.Runners.ProcessRunner.Dispose (System.Boolean disposing) [0x0004b] in <7b0953727751470ab10f0e0b547b85ae>:0

Maybe this is a known issue?

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 24 (6 by maintainers)

Most upvoted comments

I’ve spent some time on this and now have more information about the cause of the issue.

There is a race between ProcessRunner.Dispose(bool), which at some point calls something like:

try
{
    _agent.Stop();   // This sets the manual-reset event the agent is waiting on
    _agent = null;   // This effectively disposes something related to the .NET Remoting
                     // TCP subsystem, which probably wants to close the TCP connection gracefully
}
catch (Exception e)
{
    if (unloadError == null) { throw; }   // This is the exception we see in my report
}

and nunit-agent/Program.cs, which contains code like:

...
                try
                {
                    if (Agent.Start())
                        WaitForStop();
                    else
                        log.Error("Failed to start RemoteTestAgent");
                }
                catch (Exception ex)
                {
                    log.Error("Exception in RemoteTestAgent", ex);
                }

                //log.Info("Unregistering Channel");
                try
                {
                    ChannelServices.UnregisterChannel(Channel);
                }
                catch (Exception ex)
                {
                    log.Error("ChannelServices.UnregisterChannel threw an exception", ex);
                }

So my guess is that the remote runner process receives the stop signal and immediately closes the TCP connection, while the host process tries to gracefully finalize a TCP connection that is already closed at that moment.

As a workaround, I can either ignore that exception in ProcessRunner.Dispose() or put a pause in nunit-agent/Program.cs; either should help.
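To make the first workaround concrete, here is a minimal sketch of "ignore that exception" as a small helper. AgentShutdown, TryStop and the isBenign predicate are hypothetical names made up for illustration; this is not the actual NUnit engine code, and the real fix may look quite different.

```csharp
using System;

// Hypothetical helper, not part of NUnit: wraps a stop call and tolerates
// only the failures the caller explicitly classifies as harmless.
public static class AgentShutdown
{
    // Returns the swallowed exception (so it can be logged), or null if the
    // stop call completed normally. Any other exception propagates unchanged.
    public static Exception TryStop(Action stop, Func<Exception, bool> isBenign)
    {
        if (stop == null) throw new ArgumentNullException("stop");
        if (isBenign == null) throw new ArgumentNullException("isBenign");

        try
        {
            stop();
            return null;
        }
        catch (Exception ex)
        {
            // Rethrow anything that is not the expected "connection closed" case.
            if (!isBenign(ex)) throw;
            return ex;
        }
    }
}
```

On Mono or .NET Framework, the call site in Dispose could then be something like AgentShutdown.TryStop(() => _agent.Stop(), ex => ex is System.Runtime.Remoting.RemotingException). Keeping the predicate narrow matters, because swallowing every exception here would also hide genuine shutdown failures.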

Sure, I could be wrong about all of this, but that is the conclusion a quick investigation led to.

In general, I send people to an appropriate dev build on MyGet. Those are already reviewed and merged, whereas AppVeyor builds may or may not be. The MyGet builds are like our continuous beta release toward the next version.

@jnm2 I don’t think my changes will help for @craigfowler. You cannot install on Linux, so it won’t find another engine and my repro happens all the time.

I am still seeing this in NUnit.ConsoleRunner v3.7.0 (from NuGet) using Linux/Mono. It’s intermittent and doesn’t crash every time.

NUnit Console Runner 3.7.0 
Copyright (c) 2017 Charlie Poole, Rob Prouse
Runtime Environment
   OS Version: Linux 4.11.6.41106 
  CLR Version: 4.0.30319.42000
Test Files
    Tests/CSF.Screenplay.Web.Tests/bin/Debug/CSF.Screenplay.Web.Tests.dll
System.Runtime.Remoting.RemotingException: Tcp transport error. ---> System.Runtime.Remoting.RemotingException: Connection closed
  at System.Runtime.Remoting.Channels.Tcp.TcpMessageIO.StreamRead (System.IO.Stream networkStream, System.Byte[] buffer, System.Int32 count) [0x00011] in <ffd0b3fb04044b19b8235de76cf332b2>:0 
  at System.Runtime.Remoting.Channels.Tcp.TcpMessageIO.ReceiveMessageStatus (System.IO.Stream networkStream, System.Byte[] buffer) [0x00000] in <ffd0b3fb04044b19b8235de76cf332b2>:0 
   --- End of inner exception stack trace ---
Server stack trace: 
  at System.Runtime.Remoting.Channels.Tcp.TcpMessageIO.ReceiveMessageStatus (System.IO.Stream networkStream, System.Byte[] buffer) [0x00017] in <ffd0b3fb04044b19b8235de76cf332b2>:0 
  at System.Runtime.Remoting.Channels.Tcp.TcpClientTransportSink.ProcessMessage (System.Runtime.Remoting.Messaging.IMessage msg, System.Runtime.Remoting.Channels.ITransportHeaders requestHeaders, System.IO.Stream requestStream, System.Runtime.Remoting.Channels.ITransportHeaders& responseHeaders, System.IO.Stream& responseStream) [0x0005e] in <ffd0b3fb04044b19b8235de76cf332b2>:0 
  at System.Runtime.Remoting.Channels.BinaryClientFormatterSink.SyncProcessMessage (System.Runtime.Remoting.Messaging.IMessage msg) [0x00066] in <ffd0b3fb04044b19b8235de76cf332b2>:0 
Exception rethrown at [0]: 
  at (wrapper managed-to-native) System.Object:__icall_wrapper_mono_remoting_wrapper (intptr,intptr)
  at (wrapper remoting-invoke) NUnit.Engine.Agents.RemoteTestAgent:Stop ()
  at NUnit.Engine.Runners.ProcessRunner.Dispose (System.Boolean disposing) [0x00086] in <1e8ad6af4c6f4686ad7e5f9e67020b3b>:0 
  at NUnit.Engine.Runners.AbstractTestRunner.Dispose () [0x00000] in <1e8ad6af4c6f4686ad7e5f9e67020b3b>:0 
  at NUnit.Engine.Runners.MasterTestRunner.Dispose (System.Boolean disposing) [0x00013] in <1e8ad6af4c6f4686ad7e5f9e67020b3b>:0 
  at NUnit.Engine.Runners.MasterTestRunner.Dispose () [0x00000] in <1e8ad6af4c6f4686ad7e5f9e67020b3b>:0 
  at NUnit.ConsoleRunner.ConsoleRunner.RunTests (NUnit.Engine.TestPackage package, NUnit.Engine.TestFilter filter) [0x0010e] in <5d13e9f4d03e4da1b4779cb1d61b9b3d>:0 
  at NUnit.ConsoleRunner.ConsoleRunner.Execute () [0x000b6] in <5d13e9f4d03e4da1b4779cb1d61b9b3d>:0 
  at NUnit.ConsoleRunner.Program.Main (System.String[] args) [0x001bf] in <5d13e9f4d03e4da1b4779cb1d61b9b3d>:0 

I have a log of the whole build used to reproduce this available on Travis. In that log, the crash above is recorded starting on line 2784. I think that the rest of the log should contain just about everything else about the build environment you might possibly want to know.

@jnm2 I can confirm that adding a delay fixes (works around) this particular problem. That is what I actually did on my build server. Thanks for the better fix!
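For anyone who needs the delay workaround in the meantime, it could be sketched roughly as below. AgentStopGate and its members are hypothetical names, and the grace period is an arbitrary assumption; the idea is only that the agent lingers briefly after the stop signal so the reply to Stop() can reach the host before the channel is torn down.

```csharp
using System;
using System.Threading;

// Hypothetical sketch, not the real nunit-agent code.
public sealed class AgentStopGate
{
    private readonly ManualResetEvent _stopSignal = new ManualResetEvent(false);
    private readonly TimeSpan _grace;

    public AgentStopGate(TimeSpan grace)
    {
        _grace = grace;
    }

    // Called when the remote Stop() request arrives.
    public void SignalStop()
    {
        _stopSignal.Set();
    }

    // Called on the agent's main thread: blocks until the stop signal is set,
    // waits for the grace period, and only then closes the channel.
    public void WaitForStopThenLinger(Action closeChannel)
    {
        if (closeChannel == null) throw new ArgumentNullException("closeChannel");

        _stopSignal.WaitOne();
        Thread.Sleep(_grace);
        closeChannel();
    }
}
```

In the agent, closeChannel would be the existing ChannelServices.UnregisterChannel(Channel) call. A fixed sleep only narrows the race window and does not remove it, which is why it is a workaround and not a fix.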

This looks to be very much related to issue https://github.com/nunit/nunit/issues/1834, where we get a similar stack trace about a closed connection when disposing. Obviously this is slightly different, since you’re running Mono and not using the explore flag, but I’d hazard a guess that they’re the same underlying bug.