runtime: Seg fault in System.Drawing.Common tests on Unix

We’ve been plagued by seg faults on Linux in the System.Drawing.Common tests, e.g. this one on SUSE: https://mc.dot.net/#/user/stephentoub/pr~2Fjenkins~2Fdotnet~2Fcorefx~2Fmaster~2F/test~2Ffunctional~2Fcli~2F/91ec984d64908b3ab312bef6f6fa599f5ea1cee7/workItem/System.Drawing.Common.Tests/wilogs

2017-10-09 19:45:10,371: INFO: proc(54): run_and_log_output: Output: Discovering: System.Drawing.Common.Tests
2017-10-09 19:45:12,085: INFO: proc(54): run_and_log_output: Output: Discovered:  System.Drawing.Common.Tests
2017-10-09 19:45:12,654: INFO: proc(54): run_and_log_output: Output: Starting:    System.Drawing.Common.Tests
2017-10-09 19:45:13,601: INFO: proc(54): run_and_log_output: Output:    System.Drawing.Printing.Tests.PrinterSettingsTests.MaximumCopies_ReturnsExpected [SKIP]
2017-10-09 19:45:13,601: INFO: proc(54): run_and_log_output: Output:       Condition(s) not met: \"IsAnyInstalledPrinters\"
2017-10-09 19:45:13,604: INFO: proc(54): run_and_log_output: Output:    System.Drawing.Printing.Tests.PrinterSettingsTests.LandscapeAngle_ReturnsExpected [SKIP]
2017-10-09 19:45:13,604: INFO: proc(54): run_and_log_output: Output:       Condition(s) not met: \"IsAnyInstalledPrinters\"
2017-10-09 19:45:13,606: INFO: proc(54): run_and_log_output: Output:    System.Drawing.Printing.Tests.PrinterSettingsTests.Collate_Default_ReturnsExpected [SKIP]
2017-10-09 19:45:13,606: INFO: proc(54): run_and_log_output: Output:       Condition(s) not met: \"IsAnyInstalledPrinters\"
2017-10-09 19:45:13,620: INFO: proc(54): run_and_log_output: Output:    System.Drawing.Printing.Tests.PrinterSettingsTests.IsPlotter_ReturnsExpected [SKIP]
2017-10-09 19:45:13,620: INFO: proc(54): run_and_log_output: Output:       Condition(s) not met: \"IsAnyInstalledPrinters\"
2017-10-09 19:45:13,629: INFO: proc(54): run_and_log_output: Output:    System.Drawing.Printing.Tests.PrinterSettingsTests.Static_InstalledPrinters_ReturnsExpected [SKIP]
2017-10-09 19:45:13,629: INFO: proc(54): run_and_log_output: Output:       Condition(s) not met: \"IsAnyInstalledPrinters\"
2017-10-09 19:45:14,110: INFO: proc(54): run_and_log_output: Output:    MonoTests.System.Drawing.Imaging.PngCodecTest.Bitmap2bitData [SKIP]
2017-10-09 19:45:14,110: INFO: proc(54): run_and_log_output: Output:       Condition(s) not met: \"GetRecentGdiPlusIsAvailable2\"
2017-10-09 19:45:14,110: INFO: proc(54): run_and_log_output: Output:    MonoTests.System.Drawing.Imaging.PngCodecTest.Bitmap2bitFeatures [SKIP]
2017-10-09 19:45:14,110: INFO: proc(54): run_and_log_output: Output:       Condition(s) not met: \"GetRecentGdiPlusIsAvailable2\"
2017-10-09 19:45:14,111: INFO: proc(54): run_and_log_output: Output:    MonoTests.System.Drawing.Imaging.PngCodecTest.Bitmap2bitPixels [SKIP]
2017-10-09 19:45:14,111: INFO: proc(54): run_and_log_output: Output:       Condition(s) not met: \"GetRecentGdiPlusIsAvailable2\"
2017-10-09 19:45:15,499: INFO: proc(54): run_and_log_output: Output: /home/helixbot/dotnetbuild/work/c092864f-24dd-40c6-a4ad-db55569616e0/Work/af3a10a9-a3b8-4472-893f-3ef12b40a6c4/Unzip/RunTests.sh: line 87:  4576 Segmentation fault      (core dumped) $RUNTIME_PATH/dotnet xunit.console.netcore.exe System.Drawing.Common.Tests.dll -xml testResults.xml -notrait Benchmark=true -notrait category=nonnetcoreapptests -notrait category=nonlinuxtests -notrait category=OuterLoop -notrait category=failing
2017-10-09 19:45:15,540: INFO: proc(54): run_and_log_output: Output: Trying to find crash dumps for project: System.Drawing.Common.Tests
2017-10-09 19:45:15,540: INFO: proc(54): run_and_log_output: Output: No new dump file was found in /home/helixbot/dotnetbuild/work/c092864f-24dd-40c6-a4ad-db55569616e0/Work/af3a10a9-a3b8-4472-893f-3ef12b40a6c4/Unzip
2017-10-09 19:45:15,542: INFO: proc(54): run_and_log_output: Output: ~/dotnetbuild/work/c092864f-24dd-40c6-a4ad-db55569616e0/Work/af3a10a9-a3b8-4472-893f-3ef12b40a6c4/Unzip
2017-10-09 19:45:15,543: INFO: proc(54): run_and_log_output: Output: Finished running tests. End time=19:45:15. Return value was 139
2017-10-09 19:45:15,544: INFO: proc(58): run_and_log_output: Exit Code: 139
2017-10-09 19:45:15,545: ERROR: scriptrunner(87): _main: Error: No exception thrown, but XUnit results not created
2017-10-09 19:45:15,545: ERROR: helix_test_execution(83): report_error: Error running xunit None

This has been happening frequently for months, but I can’t find an existing issue for it.
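
For readers unfamiliar with the `Return value was 139` line in the log above: by shell convention, an exit status above 128 usually means the process was killed by signal number `status - 128`, and 139 - 128 = 11 is SIGSEGV on Linux. The following is a minimal sketch of that decoding. The helper name `describe_exit_code` is made up for illustration and is not part of the Helix tooling.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe a shell-style exit status.

    Shells report "killed by signal N" as 128 + N, so anything above 128
    is interpreted here as a signal. This is a convention, not a guarantee:
    a program can also call exit() with such a value on its own.
    """
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with status {code}"


print(describe_exit_code(139))
```

On a Linux machine this reports SIGSEGV for 139, which matches the `Segmentation fault (core dumped)` message from `RunTests.sh`.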

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 1
  • Comments: 74 (74 by maintainers)

Most upvoted comments

Yes, that would be good. We could check for a minimum installed version. Do you mind submitting a PR for that?
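
A minimum-version check of the kind suggested here could look roughly like the sketch below. It is written in Python purely to illustrate the comparison logic; the threshold `MINIMUM_VERSION` and both function names are assumptions, since the thread does not say which libgdiplus version would be required or how the installed version would be obtained.

```python
import re

# Hypothetical threshold, for illustration only. The thread does not state
# which libgdiplus version should be treated as the minimum.
MINIMUM_VERSION = (4, 2)


def parse_version(text):
    """Parse the leading dotted number of a version string, e.g. '5.6.1-2' -> (5, 6, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum=MINIMUM_VERSION):
    """Return True if the installed version string is at least the minimum."""
    version = parse_version(installed)
    # Pad both tuples with zeros so that '4' compares as (4, 0).
    width = max(len(version), len(minimum))
    padded_version = version + (0,) * (width - len(version))
    padded_minimum = tuple(minimum) + (0,) * (width - len(minimum))
    return padded_version >= padded_minimum


print(meets_minimum("5.6.1"))
print(meets_minimum("3.12"))
```

Comparing tuples of integers rather than raw strings avoids the usual trap where "3.12" sorts after "3.8" lexically.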

Let me give that a try. I’ll also check whether the libgdiplus-from-NuGet approach still works. May take a couple of days. Ping me if I haven’t got round to doing it by next week 😄. If someone else can do it faster, feel free to do so 😄.

@safern can you please prioritize the investigation?

Sure.

I was able to get a local repro by running the tests in a loop; it reproduces roughly once every 100 runs. I’ve assigned the issue to myself so I can capture a dump, find out which test is causing the crash, and investigate why.
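
For anyone who wants to try the same approach, a loop like the one described above can be sketched as follows. This is not the script that was actually used; the function name `run_until_crash` and the default of 500 runs are assumptions. It stops at the first run that dies from SIGSEGV, checking both the negative return code Python reports for a signalled child and the 139 that a wrapper script such as `RunTests.sh` passes along.

```python
import signal
import subprocess
import sys


def run_until_crash(command, max_runs=500):
    """Run `command` repeatedly.

    Returns the 1-based run number on which the process died from SIGSEGV,
    or None if no segfault was seen within `max_runs` runs.
    """
    # -11: child killed directly by SIGSEGV; 139: shell-style 128 + 11.
    segv_codes = {-int(signal.SIGSEGV), 128 + int(signal.SIGSEGV)}
    for attempt in range(1, max_runs + 1):
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode in segv_codes:
            return attempt
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the test command line as arguments to this script.
    crashed_on = run_until_crash(sys.argv[1:])
    if crashed_on is None:
        print("no segfault observed")
    else:
        print(f"segfault on run {crashed_on}")
```

With a failure rate of about 1 in 100, a few hundred iterations make it likely, though not certain, that at least one crash is seen, so core dumps should be enabled before starting the loop.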

@filipnavara it would be great to have a PR that always logs the libgdiplus version here: https://github.com/dotnet/corefx/blob/008d21c47c22b5757e09f6c3d346998004a0f985/src/System.Runtime.InteropServices.RuntimeInformation/tests/DescriptionNameTests.cs#L42. That way we can easily check if this question comes up in the future, in whatever test configuration.

We’ve confirmed offline that RedHat6 agents have a really old version of libgdiplus, and the engineering team is working on updating those agents.

@filipnavara FYI, looking at the dump, it looks like this fails in GdipGetPathPoints. I know you have a PR open that might fix this.

It would be good to get the path PRs (and the other open pull requests 🙃) merged, so that this problem is either fixed or becomes easier to debug. cc @akoeplinger

From the dumpling that @danmosemsft pointed me to, I got this stack trace:

00007FDBFAAAF990 00007FDC2266A868 DomainBoundILStubClass.IL_STUB_PInvoke(IntPtr, IntPtr, Int32, Int32, Int32, Int32)
00007FDBFAAAFA30 00007FDC22807339 System.Drawing.SafeNativeMethods+Gdip.GdipDrawRectangleI(IntPtr, IntPtr, Int32, Int32, Int32, Int32)
00007FDBFAAAFA80 00007FDC2280726A System.Drawing.Graphics.DrawRectangle(System.Drawing.Pen, Int32, Int32, Int32, Int32)
00007FDBFAAAFAA0 00007FDC228071EB System.Drawing.Graphics.DrawRectangle(System.Drawing.Pen, System.Drawing.Rectangle)
00007FDBFAAAFAD0 00007FDC22802807 System.Drawing.Tests.RegionTests.Complement_GraphicsPathWithMultipleRectangles_Success()

This points us to the test that crashed and the call path that led to the crash. I’ll take a look at the dumps from the other build failures (if there are any) to see whether other tests are crashing too, and then I’ll disable the crashing tests.
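
When there are several dumps to go through, picking the test method out of each stack can be automated. The sketch below assumes each frame has the three-column layout shown above (two 16-digit hex addresses followed by the call site) and that test methods live in a namespace containing `.Tests.`. Both are assumptions drawn from this one trace, and `first_test_frame` is a made-up helper name.

```python
import re

# Two 16-digit hex columns, then the method name up to its opening parenthesis.
FRAME = re.compile(
    r"^[0-9A-Fa-f]{16}\s+[0-9A-Fa-f]{16}\s+(?P<method>[^\s(]+)\("
)

# The stack trace quoted in this thread.
STACK = """\
00007FDBFAAAF990 00007FDC2266A868 DomainBoundILStubClass.IL_STUB_PInvoke(IntPtr, IntPtr, Int32, Int32, Int32, Int32)
00007FDBFAAAFA30 00007FDC22807339 System.Drawing.SafeNativeMethods+Gdip.GdipDrawRectangleI(IntPtr, IntPtr, Int32, Int32, Int32, Int32)
00007FDBFAAAFA80 00007FDC2280726A System.Drawing.Graphics.DrawRectangle(System.Drawing.Pen, Int32, Int32, Int32, Int32)
00007FDBFAAAFAA0 00007FDC228071EB System.Drawing.Graphics.DrawRectangle(System.Drawing.Pen, System.Drawing.Rectangle)
00007FDBFAAAFAD0 00007FDC22802807 System.Drawing.Tests.RegionTests.Complement_GraphicsPathWithMultipleRectangles_Success()
"""


def first_test_frame(stack_text, marker=".Tests."):
    """Return the innermost frame whose method name contains `marker`, or None."""
    for line in stack_text.splitlines():
        match = FRAME.match(line.strip())
        if match and marker in match.group("method"):
            return match.group("method")
    return None


print(first_test_frame(STACK))
```

For the trace above this prints the `RegionTests` method, with the frames above it showing the native call it was making when the process crashed.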

In the meantime, @qmfrederik would you mind taking a look at why this is crashing?

Here are the docs on how to load the dump: https://github.com/dotnet/coreclr/blob/master/Documentation/building/debugging-instructions.md#debugging-core-dumps-with-lldb

Let me know if I can help you.