nunit-console: SocketException thrown during console run

@HaomingFu commented on Fri Jul 07 2017

I’m using NUnit3.5 console runner. I use nunit console to run my unit tests. There is no deterministic repro but it fails very often with error code -100.

nunit3-console.exe UnitTest.Client.dll --where "cat==MyTests" --output=RBSUnitTest_General_OutputFile.txt --result=RBSUnitTest_General_ResultFile.xml

I see this in the nunit console cmd output:

Server stack trace: 
   at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
   at System.Runtime.Remoting.Channels.SocketStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.Runtime.Remoting.Channels.SocketHandler.ReadFromSocket(Byte[] buffer, Int32 offset, Int32 count)
   at System.Runtime.Remoting.Channels.SocketHandler.Read(Byte[] buffer, Int32 offset, Int32 count)
   at System.Runtime.Remoting.Channels.Tcp.TcpFixedLengthReadingStream.Read(Byte[] buffer, Int32 offset, Int32 count)
   at System.IO.BinaryReader.ReadBytes(Int32 count)
   at System.Runtime.Serialization.Formatters.Binary.SerializationHeaderRecord.Read(__BinaryParser input)
   at System.Runtime.Serialization.Formatters.Binary.__BinaryParser.ReadSerializationHeaderRecord()
   at System.Runtime.Serialization.Formatters.Binary.__BinaryParser.Run()
   at System.Runtime.Serialization.Formatters.Binary.ObjectReader.Deserialize(HeaderHandler handler, __BinaryParser serParser, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage)
   at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Deserialize(Stream serializationStream, HeaderHandler handler, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage)
   at System.Runtime.Remoting.Channels.CoreChannel.DeserializeBinaryResponseMessage(Stream inputStream, IMethodCallMessage reqMsg, Boolean bStrictBinding)
   at System.Runtime.Remoting.Channels.BinaryClientFormatterSink.SyncProcessMessage(IMessage msg)

Exception rethrown at [0]: 
   at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
   at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
   at NUnit.Engine.ITestAgent.Stop()
   at NUnit.Engine.Runners.ProcessRunner.Dispose(Boolean disposing)
   at NUnit.Engine.Runners.AbstractTestRunner.Dispose()
   at NUnit.Engine.Runners.MasterTestRunner.Dispose(Boolean disposing)
   at NUnit.Engine.Runners.MasterTestRunner.Dispose()
   at NUnit.ConsoleRunner.ConsoleRunner.RunTests(TestPackage package, TestFilter filter)
   at NUnit.ConsoleRunner.Program.Main(String[] args)

Any help would be appreciated.


@ChrisMaddock commented on Fri Jul 07 2017

I’d expect this is the same issue as this, which will be fixed in NUnit Console 3.7.

https://github.com/nunit/nunit-console/issues/225

Try the console from the latest build of master at the link below - and see if that solves the problems you’re seeing? https://ci.appveyor.com/project/CharliePoole/nunit-console/build/3.7.0-dev-03691/artifacts


@ChrisMaddock commented on Fri Jul 07 2017

I’m also going to move this issue over to the NUnit Console repo.

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Reactions: 1
  • Comments: 105 (33 by maintainers)

Most upvoted comments

Sounds sensible to me. All for removing unnecessary complexity. 🙂

All,

We believe this issue will have been fixed by @BlythMeister’s temporary fix in #370.

Until the next console release, please try the master build, version 3.9.0-dev-03932 or later. You can find this on out MyGet Feed.


If you still experience this issue with the above build of master…

  1. Run your tests with the --inprocess flag - see if this exposes a more useful exception or error message.

  2. Add the --trace=Debug flag to your command line. When you next see the exception, examine the logs for any clues.

Please then open a new issue. The SocketException is often caused by an nunit-agent crash, which can occur in a number of different scenarios. We believe many of these have been fixed in NUnit.Console 3.8, and the fix in #372 - however, they may still be edge cases we have yet to tackle. Please report these if you come across any!

I’m hopeful it will fix the race condition shutdown issue that you diagnosed and we know exists - and we can then handle other bug reports separately. 😄

Hi there, I too am experiencing the same issue on our TeamCity builds with NUnit 3 console runner version 3.6.1, I also tried to use your Dev releases as mentioned in this thread with the command line parameters --remoting-wait-after-stop --remoting-shutdown-delay=5000 but with no success. I am still getting exit code -100 as below.

[15:32:27][Step 2/2] [15:32:27][Step 2/2] Errors, Failures and Warnings [15:32:27][Step 2/2] [15:32:27][Step 2/2] 1) Error : C:\BuildAgent\temp\buildTmp\7P2huJjnELwNLc2fGw7UZABZ81gGOi8W.nunit [15:32:27][Step 2/2] An exception occurred in the driver while loading tests. [15:32:27][Step 2/2] [15:32:27][Step 2/2] Server stack trace: [15:32:27][Step 2/2] at NUnit.Engine.Runners.DirectTestRunner.LoadDriver(IFrameworkDriver driver, String testFile, TestPackage subPackage) [15:32:27][Step 2/2] at NUnit.Engine.Runners.DirectTestRunner.LoadPackage() [15:32:27][Step 2/2] at NUnit.Engine.Runners.DirectTestRunner.EnsurePackageIsLoaded() [15:32:27][Step 2/2] at NUnit.Engine.Runners.DirectTestRunner.RunTests(ITestEventListener listener, TestFilter filter) [15:32:27][Step 2/2] at NUnit.Engine.Runners.AggregatingTestRunner.RunTestsSequentially(ITestEventListener listener, TestFilter filter, List`1 results, Boolean disposeRunners) [15:32:27][Step 2/2] at NUnit.Engine.Runners.AggregatingTestRunner.RunTests(ITestEventListener listener, TestFilter filter) [15:32:27][Step 2/2] at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[] args, Object server, Object[]& outArgs) [15:32:27][Step 2/2] at System.Runtime.Remoting.Messaging.StackBuilderSink.SyncProcessMessage(IMessage msg) [15:32:27][Step 2/2] [15:32:27][Step 2/2] Exception rethrown at [0]: [15:32:27][Step 2/2] at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg) [15:32:27][Step 2/2] at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type) [15:32:27][Step 2/2] at NUnit.Engine.ITestEngineRunner.Run(ITestEventListener listener, TestFilter filter) [15:32:27][Step 2/2] at NUnit.Engine.Runners.ProcessRunner.RunTests(ITestEventListener listener, TestFilter filter) [15:32:27][Step 2/2] [15:32:27][Step 2/2] Test Run Summary [15:32:27][Step 2/2] Overall result: Failed [15:32:27][Step 2/2] Test Count: 0, Passed: 0, Failed: 0, Warnings: 0, Inconclusive: 0, Skipped: 0 [15:32:27][Step 2/2] Start time: 2017-09-14 14:19:54Z [15:32:27][Step 2/2] End time: 2017-09-14 14:32:26Z [15:32:27][Step 2/2] Duration: 752.636 seconds [15:32:27][Step 2/2] [15:32:27][Step 2/2] Results (nunit3) saved as C:\BuildAgent\temp\buildTmp\7P2huJjnELwNLc2fGw7UZABZ81gGOi8W.nunit.xml [15:32:27][Step 2/2] Process exited with code -100

Originally (when using 3.6.1), the exception was “Exception encountered unloading AppDomain”

@ChrisMaddock Thanks for the explanation.

I didn’t close this with #283 but I would be curious if that PR solves this issue for anyone reporting.

Whatever we do, I’d like to understand what happens here. The code logic is that it that we start out with the local engine and only use any other engine (from the msi) if the version is newer. If we are somehow messing up the version info, I think we should fix that anyway.

I agree that this has not worked out as we hoped. I originally envisioned the user getting new features simply by replacing the engine. That’s not such a big deal for us, but it would be for third-party runners. However, AFAIK the third-party runners all require a local engine.

I see two ways to go, aside from figuring out what’s up with our versioning:

  1. Change the console runner to always use a local engine
  2. Change the semantics of the engine API itself.

@CharliePoole I agree that this is critical. The installed version may not be newer, but it may come first in the search path?

I think this is also related to #6. As a temporary fix for that we were planning on only using the local engine but that code is false currently. AFAIK, other runners also chose to only use the private engine.

NuGet is also our primary distribution method. Looking back at 3.6.1 because it has been out longer, the MSI was downloaded 7,702 times compared to 264,010 download of the NuGet package. MSI accounts for less than 3% of usage. My feeling is that if someone installs a specific version of a NuGet package in their solution, we should only be using that, not walking out and pulling out other versions. It is not what users expect or likely want.

If we were to replace the hard coded decision for the runners and replace it with a file, where would that file live, in one global spot or dropped in with the runner? Either way, I don’t like giving users the ability to override the decision made by the original runner developer.

To be honest, even though I wrote the original implementation, I am starting to dislike the idea of allowing users to load alternate engines.

  • It makes it really hard to debug issues
  • It potentially overrides the intentions or wishes of runner developers
  • I don’t think anyone ever asked for the ability, we just thought it was a good idea
  • I think it messes with addin discovery, especially with NuGet
  • Different developers on a team can see different behavior depending on their setup
  • It adds complexity and increases our testing footprint. I think issues like this show that only integration tests can catch problems like this issue and #6.

I am failing as well.

Hi @jnm2 We’ve updated to new version on last Wednesday and it hasn’t crashed since that. Unfortunately, we havn’t merged this change to every branch (but I’ve asked our developers to do this whenever possible), so I don’t have correct stats. But before updating we saw “exit code -100” 2-3 times a day.

@jnm2 I am having infra issue issues so not able to run tests with your new binaries for now. Will do the test when things go right.

FWIW this is biting us pretty frequently (many failed test runs per day) with the 3.7.0 ConsoleRunner. We have hundreds of builds combined with a set-up where we run a ConsoleRunner instance per test assembly, because we’ve only just migrated to NUnit3 so wrote our own tooling to do parallel assembly runs with NUnit2. This is probably quite a good stress test of things. :feelsgood:

I can carve off a bit of time tomorrow to look at the socket code around here and be a second pair of eyes on it if that’s useful, @jnm2.

I have tried the 3.8.0, but am getting no tests executing

C:\Projects\ena>“C:\Nunit3.8\tools\nunit3-console.exe” --labels=All --out=TestResult.txt “–result=TestResult.xml” c:\projects\ena\UAT\IntegrationTests\bin\debug\IntegrationTests.dll --test=uat NUnit Console Runner 3.8.0 Copyright © 2017 Charlie Poole, Rob Prouse

Runtime Environment OS Version: Microsoft Windows NT 10.0.14393.0 CLR Version: 4.0.30319.42000

Test Files c:\projects\ena\UAT\IntegrationTests\bin\debug\IntegrationTests.dll

Test Filters Test: uat

Run Settings DisposeRunners: True WorkDirectory: C:\Projects\ena ImageRuntimeVersion: 4.0.30319 ImageTargetFrameworkName: .NETFramework,Version=v4.6.1 ImageRequiresX86: False ImageRequiresDefaultAppDomainAssemblyResolver: False NumberOfTestWorkers: 8

Test Run Summary Overall result: Passed Test Count: 0, Passed: 0, Failed: 0, Warnings: 0, Inconclusive: 0, Skipped: 0 Start time: 2017-07-17 15:53:18Z End time: 2017-07-17 15:53:20Z Duration: 2.711 seconds

Results (nunit3) saved as TestResult.xml

C:\Projects\ena>

Thanks @HaomingFu.

I wonder if I just put in an extra delay if that would work.

I will run my tests a couple times, and the first test has already been kicked off. You should get my feedback by tomorrow PST.

@HaomingFu I don’t think there’s a workaround. The problem is that nunit3-console.exe finishes the tests and tells nunit-agent.exe to shut down, and nunit-agent.exe shuts down too soon, before it’s finished responding to the Stop remoting call. There’s no workaround that I’m aware of that you can do because it’s not directly related to anything your tests are doing. #223 goes more in depth, but it needs to be updated with the info in this thread.

@jnm2 release is out, so don’t kill yourself. If we need to, I can release a hotfix after the current release settles down and we see how it goes.

As for remoting, it is gone as soon as one of us replaces it 😄

@HaomingFu Thanks for letting us know. I will revisit this tonight.

Since I have not been successful in getting the SocketException on my machine, I might have to resort to really verbose logging and have you try things.

Another thought: I wonder if I need to wrap IClientChannelSink the same as the server sink? https://github.com/nunit/nunit-console/pull/223/commits/33e75f2c2ef3f3cfa4df0993540a82c9559ad153#diff-47f68aa1d792b3d290cd51f3b01306dcR69

The new Nunit console runner looks good since my nunit tests didn’t hit error -100 any more!

They do, but NUnit.Console is a meta-package. It brings in NUnit.ConsoleRunner and five extensions. Our MyGet feed provides a sort of running beta release as development proceeds. We should have a link to it, I agree, although it should not be so prominent that users who don’t really need a dev release start using it.

I think this goes back to my wanting the web site to be product focused. There’s an awful lot to say about each product, which would be easier if the second-level pages were all about individual products like the Console Runner.

@jnm2 Thanks. Now I have access to the nunit console executable 3.7. I will kick off a test shortly and it may take time to finish. Will update the thread when the test is done.