runtime: iOS: building with LLVM causes some exception catch clauses to not work

Description

See the following test code: https://gist.github.com/rolfbjarne/f5c1a3697343c5e9bb6cb8e5b796d328#file-appdelegate-cs-L23-L73

When executed on an iOS device (AOT compiled), this produces the following output:

TCP connection failed when selecting 'hostname': fe80::6aa3:6295:30dd:b552 and 'port': 49681. System.Net.Internals.SocketExceptionFactory+ExtendedSocketException (65): No route to host [fe80::6aa3:6295:30dd:b552]:49681
   at System.Net.Sockets.Socket.DoConnect(EndPoint , SocketAddress ) in System.Net.Sockets.dll:token 0x6000127+0x68
   at System.Net.Sockets.Socket.Connect(EndPoint ) in System.Net.Sockets.dll:token 0x600010a+0xca
   at System.Net.Sockets.Socket.Connect(IPAddress , Int32 ) in System.Net.Sockets.dll:token 0x600010b+0x62
   at System.Net.Sockets.TcpClient.Connect(String , Int32 ) in System.Net.Sockets.dll:token 0x60001b7+0xc7
--- End of stack trace from previous location ---
   at System.Net.Sockets.TcpClient.Connect(String , Int32 ) in System.Net.Sockets.dll:token 0x60001b7+0x158
   at System.Net.Sockets.TcpClient..ctor(String , Int32 ) in System.Net.Sockets.dll:token 0x60001b4+0x56
   at MySingleView.AppDelegate.<>c__DisplayClass3_2.<SelectHostName>b__0(Object v) in /Users/rolf/work/maccore/squashed-onedotnet/xamarin-macios/tests/dotnet/MySingleView/AppDelegate.cs:line 49
[...]
Selected host name: 

If I enable LLVM, the following happens instead:

Unhandled managed exception: No route to host [fe80::6aa3:6295:30dd:b552]:49681 (System.Net.Internals.SocketExceptionFactory+ExtendedSocketException)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
Unhandled managed exception: No route to host [fe80::14a6:1e61:2f4e:e2c5]:49681 (System.Net.Internals.SocketExceptionFactory+ExtendedSocketException)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()

=================================================================
	Native Crash Reporting
=================================================================
Got a SIGABRT while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries 
used by your application.
=================================================================

=================================================================
	Native stacktrace:
=================================================================

It’s a rather strange bug, because in my original test it doesn’t crash this way. Instead, the SelectHostName code causes a lot of other test failures: Assert.Throws<FooException> statements don’t actually catch the FooException, and the general NUnit catch handler then catches these exceptions and turns them into test failures. The baffling part was that when running the tests outside of our test harness (without needing to call the SelectHostName method), the tests passed.

In any case, let me know if you can reproduce with this information, or I’ll create a test case you can use (which will likely require building a custom branch of xamarin-macios).

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 36 (36 by maintainers)

Most upvoted comments

Got it, @rolfbjarne. @imhameed has been working on narrowing down the issue, so we should be able to get a fix in. @imhameed, please update this issue with your progress.

JFYI I was able to reproduce this on maccatalyst-arm64. It may be slightly easier to debug than the iOS version.

Curiously, when compiled as part of the FunctionalTests framework in dotnet/runtime, it ran just fine. When compiled using the Xamarin SDK, it failed 50% of the time (initially reported as 100%), unless it was run with MONO_LOG_LEVEL=debug, in which case it suddenly stopped failing. There could be a difference in the options used for the AOT, or in the way the runtime is linked or initialized.

Also related (tracking a longer-term cleaner fix): https://github.com/dotnet/runtime/issues/54176

This doesn’t seem like a consistent repro. @imhameed, if you determine consistent repro steps and have a low-risk fix, we will consider backporting to 6.0. For now, moving to 7.0.

This is blocking LLVM for us. It’s 100% consistent on our main test suite, where it breaks a huge number of tests. There’s no way we can release LLVM support with this bug.