MINGW-packages: [Investigation] The MSYS2 + meson + Python crash issue

Another issue (the old one: https://github.com/msys2/MINGW-packages/issues/11864) to collect information about the problem and what has been tried so far.

I’ve created a small repo for reproducing the issue: https://github.com/lazka/python-crash-test

Any ideas about what else we could try are welcome.

The issue

meson sometimes fails with STATUS_ACCESS_VIOLATION when called from ninja, as shown below, and we haven’t found a way to reproduce it locally:

FAILED: libmy-shared-lib6.dll.p/libmy-shared-lib6.dll.symbols 
"D:/a/_temp/msys64/ucrt64/bin/meson" "--internal" "symbolextractor" "D:/a/python-crash-test/python-crash-test/project/_build" libmy-shared-lib6.dll "libmy-shared-lib6.dll.a" libmy-shared-lib6.dll.p/libmy-shared-lib6.dll.symbols 

The section below is now out of date; see the follow-up answers for the cause.

When it started

  • When we updated from Python 3.9 to 3.10

Where it has failed so far

Where it hasn’t failed so far

  • In MINGW32 / CLANG32
  • Locally, outside of CI

What makes the error go away

  • Running everything via PowerShell (and just setting PATH) instead of from bash
  • Setting MSYS=winjitdebug
  • Downgrading to our last Python 3.9 mingw version (I created a rebuild for the last version we had)
  • Using the official CPython 3.9 installed via setup-python
  • Using the official CPython 3.11 installed via setup-python

What doesn’t make the error go away

Bisecting:

Testing with official CPython builds:

  • NO-ERROR: 3.9.13, 3.10.0-alpha.1, 3.10.0-alpha.2, 3.10.0-alpha.3 <-> 3.11.0-alpha.4, 3.11.0-alpha.5, 3.11.0-alpha.6, 3.11.0-alpha.7, 3.11.3
  • ERROR: 3.10.0-alpha.4, 3.10.0-alpha.5, 3.10.0-alpha.7, 3.10.11, 3.11.0-alpha.1, 3.11.0-alpha.3

  • Error started: v3.10.0a3…v3.10.0a4
  • Fixed: v3.11.0a3…v3.11.0a4
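
The two ranges can be read off the test results mechanically. The following is a minimal sketch, assuming the alpha builds are ordered as they appeared on CPython's main branch and leaving out the maintenance releases (3.9.13, 3.10.11, 3.11.3). The function name `find_transitions` and the `RESULTS` table are hypothetical helpers for illustration, not part of the reproducer repo.

```python
def find_transitions(results):
    """Return (last_version, next_version, change) for every flip in outcome.

    `results` is an ordered list of (version, crashed) pairs.
    """
    transitions = []
    for (prev_version, prev_crashed), (version, crashed) in zip(results, results[1:]):
        if prev_crashed != crashed:
            change = "error started" if crashed else "fixed"
            transitions.append((prev_version, version, change))
    return transitions


# Outcomes copied from the lists above; True means the crash was observed.
# Builds that were not tested (3.10.0-alpha.6, 3.11.0-alpha.2) are omitted.
RESULTS = [
    ("3.10.0-alpha.1", False),
    ("3.10.0-alpha.2", False),
    ("3.10.0-alpha.3", False),
    ("3.10.0-alpha.4", True),
    ("3.10.0-alpha.5", True),
    ("3.10.0-alpha.7", True),
    ("3.11.0-alpha.1", True),
    ("3.11.0-alpha.3", True),
    ("3.11.0-alpha.4", False),
    ("3.11.0-alpha.5", False),
    ("3.11.0-alpha.6", False),
    ("3.11.0-alpha.7", False),
]

for last, first, change in find_transitions(RESULTS):
    print(f"{change}: between {last} and {first}")
```

Because untested builds are simply skipped, each reported range is only as narrow as the nearest tested neighbours allow.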

For the bisect, see the next post.

About this issue

  • State: closed
  • Created a year ago
  • Comments: 17 (10 by maintainers)

Most upvoted comments

I can confirm that a mingw build of 3.11 also doesn’t crash: https://github.com/msys2-contrib/cpython-mingw/pull/139#issuecomment-1605917011

Small update: I’ve reviewed meson’s symbolextractor.py code. While there are a few things that could be improved, they are not related to this issue.

One important thing I noticed is that, with this python-crash-test reproducer, the issue happens only with llvm-nm, which meson prefers; it does not happen with mingw’s nm.

I think the next step would be to remove meson and ninja (if possible) from the reproducer. I tested with a script that only runs llvm-nm on the library. The reproducer is already pretty minimal, good job on that, but removing components like meson would really help to focus on the important parts.
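
A stripped-down stress harness along those lines might look like the sketch below. It is only an illustration under stated assumptions: it relaunches the Python interpreter it is run with (so it should be started with the Python under test), it treats exit code 0xC0000005 as STATUS_ACCESS_VIOLATION, and the names `is_access_violation` and `count_crashes` are hypothetical. The child script that actually calls llvm-nm is not shown and would have to be supplied.

```python
import subprocess
import sys

# NTSTATUS value that Windows reports for an access violation.
STATUS_ACCESS_VIOLATION = 0xC0000005


def is_access_violation(returncode):
    """True if an exit code is STATUS_ACCESS_VIOLATION, signed or unsigned."""
    # Masking to 32 bits maps the signed form (-1073741819) onto 0xC0000005.
    return (returncode & 0xFFFFFFFF) == STATUS_ACCESS_VIOLATION


def count_crashes(script_args, runs=100):
    """Run the current interpreter with `script_args` `runs` times.

    Returns how many of those runs exited with an access violation.
    """
    crashes = 0
    for _ in range(runs):
        result = subprocess.run(
            [sys.executable, *script_args],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if is_access_violation(result.returncode):
            crashes += 1
    return crashes
```

For example, `count_crashes(["run_nm.py", "libmy-shared-lib6.dll"], runs=500)` would repeat a hypothetical `run_nm.py` child script 500 times. Since the crash is intermittent, a count of zero from a small number of runs would not prove much.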

But my fear is that, since it doesn’t happen with Python 3.11, there will not be much interest (if any) in finding the root cause. I think that during process exit some resources or file descriptors are not in a valid state, which causes the issue.