pex: pex build fails due to existing work-directory
Beginning with version 2.1.105, building the pex file in our CI pipeline fails with the following message:
…/python3.8/site-packages/pex/atomic_directory.py:176: PEXWarning: [pid:XX, tid:XXX, cwd:…]: After obtaining an exclusive lock on <PEX_ROOT>/isolated/.2f4fc85fa2be055a2975ce1147100c0d5c7e663a.atomic_directory.lck, failed to establish a work directory at <PEX_ROOT>/isolated/2f4fc85fa2be055a2975ce1147100c0d5c7e663a.workdir due to: [Errno 17] File exists: '<PEX_ROOT>/isolated/2f4fc85fa2be055a2975ce1147100c0d5c7e663a.workdir'
  pex_warnings.warn(
…/python3.8/site-packages/pex/atomic_directory.py:187: PEXWarning: [pid:XX, tid:XXX, cwd:…]: Continuing to forcibly re-create the work directory at <PEX_ROOT>/isolated/2f4fc85fa2be055a2975ce1147100c0d5c7e663a.workdir.
  pex_warnings.warn(
Failed to spawn a job for …/bin/python: [Errno 17] File exists: '<PEX_ROOT>/isolated/2f4fc85fa2be055a2975ce1147100c0d5c7e663a.workdir/pex/./venv'
It seems to be related to #1905, introduced in version 2.1.105, but we have no clue why this happens in our CI pipeline while building the .pex file on macOS developer machines works. It looks like something else is creating that directory, but there is only one pex command in the pipeline job and the PEX_ROOT is not cached.
Our build environment uses:
- the Red Hat UBI 8.4 Docker image
- Python 3.8
- poetry 1.1, which manages pex as a dev-dependency
Then we build the pex with
poetry run pex --inherit-path --python=python3.8 --requirement requirements.txt --find-links dist/ our_module --output-file dist/final.pex
Any idea why this is happening or what else we could check would be helpful.
About this issue
- State: closed
- Created 2 years ago
- Comments: 26 (21 by maintainers)
Commits related to this issue
- Fix `execute_parallel` "leaking" a thread. Although the thread was not leaked per-se, it could run after `execute_parallel` returned which could cause parallel atomic_directory posix locks to fail. ... — committed to jsirois/pex by jsirois a year ago
- Fix `execute_parallel` "leaking" a thread. (#2052) Although the thread was not leaked per-se, it could run after `execute_parallel` returned which could cause parallel atomic_directory posix locks ... — committed to pex-tool/pex by jsirois a year ago
- Wrap inter-process locks in in-process locks. This is needed to have independent POSIX fcntl locks in the same process by multiple threads and also needed whenever BSD flock locks silently use fcntl ... — committed to jsirois/pex by jsirois a year ago
- Wrap inter-process locks in in-process locks. (#2070) This is needed to have independent POSIX fcntl locks in the same process by multiple threads and also needed whenever BSD flock locks silently us... — committed to pex-tool/pex by jsirois a year ago
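The "wrap inter-process locks in in-process locks" idea from the commit messages above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from #2070: `exclusive_lock`, `_thread_lock_for` and the module-level registry are hypothetical names. The point is that a POSIX `fcntl` lock is owned by the process, so on its own it does not keep two threads of the same process apart; a `threading.Lock` taken first covers that gap.

```python
import fcntl
import os
import threading
from contextlib import contextmanager

# Hypothetical registry: one in-process lock per lock file path.
_registry_guard = threading.Lock()
_thread_locks = {}


def _thread_lock_for(path):
    """Return the single in-process lock shared by all threads for this path."""
    with _registry_guard:
        return _thread_locks.setdefault(path, threading.Lock())


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive lock against other threads and other processes."""
    # Step 1: threads of this process serialize on a plain threading.Lock.
    with _thread_lock_for(os.path.realpath(lock_path)):
        # Step 2: only then take the inter-process POSIX lock. Mode "a"
        # creates the lock file if needed without truncating it.
        with open(lock_path, "a") as lock_file:
            fcntl.lockf(lock_file, fcntl.LOCK_EX)
            try:
                yield
            finally:
                fcntl.lockf(lock_file, fcntl.LOCK_UN)
```

A caller would wrap the critical section in `with exclusive_lock("/some/dir/.example.lck"):` (the path here is a placeholder). Taking the thread lock before the file lock keeps the ordering the same for every caller, which avoids one thread holding the file lock while another waits on it inside the same process.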
@james-johnston-thumbtack thank you so much for the repro case. As is always the case, these are absolute gold and make debugging roughly infinitely easier and quicker than it is otherwise.
I’ll be damned, this fixes:
I really don’t know how I continually glossed over / missed the jobs.py Thread spawn.
I need to think this through a bit more, but I think this solves the issue. The bsd locks are still needed for the lock file handling of parallel downloads (and the later added parallel downloads of PEP-691 metadata), but the old-school plain old Pex code paths are made safe with the lone thread join ensuring it's shut down before serially continuing to the next lines of code.
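To make the "lone thread join" concrete, here is a minimal sketch of the pattern. It is not the actual jobs.py code or the fix from #2052; `execute_parallel_sketch`, `spawn_job` and the queue-based layout are hypothetical stand-ins. A helper thread spawns the work, and the function joins that thread before returning so it cannot still be running, and touching locks or work directories, while the caller moves on.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the result stream


def execute_parallel_sketch(inputs, spawn_job):
    """Run spawn_job over inputs on a helper thread and yield the results.

    Hypothetical illustration only. Errors raised by spawn_job are not
    propagated in this sketch; they simply end the stream early.
    """
    results = queue.Queue()

    def spawn_all():
        try:
            for item in inputs:
                results.put(spawn_job(item))
        finally:
            results.put(_DONE)

    spawner = threading.Thread(target=spawn_all, name="spawner", daemon=True)
    spawner.start()
    try:
        while True:
            result = results.get()
            if result is _DONE:
                break
            yield result
    finally:
        # The key step: wait for the helper thread to finish before control
        # goes back to the caller, so nothing runs "after" this function.
        spawner.join()
```

For example, `list(execute_parallel_sketch([1, 2, 3], lambda n: n * n))` collects the three squares, and once it returns no `spawner` thread is left alive. Without the `join()`, the helper thread could in principle outlive the call, which is the kind of overlap described above.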
@jsirois Thanks for the quick response. Setting _PEX_FILE_LOCK_STYLE=bsd solved the problem. Would you suggest setting it as a workaround until there is a fix for the locking?
@christopherfrieler that warning message looks like the one in the 2.1.112 release. I added it in #1961 to help debug a probable race or wrong POSIX assumption that has been hard to track down. I know this is painful for you, but I’m very happy to have a repro case from you! Can you try setting _PEX_FILE_LOCK_STYLE=bsd (added in #1962) in your CI environment and see if that changes anything?
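As background on why the lock style can matter, the sketch below compares the two OS primitives that the commit messages refer to as POSIX `fcntl` locks and BSD `flock` locks. It says nothing about how Pex uses them internally; `second_lock_succeeds` is a hypothetical helper written for this illustration. It takes an exclusive lock on a file and then tries, from the same process, to take it again through a second `open()` of that file.

```python
import fcntl
import os
import tempfile


def second_lock_succeeds(style):
    """Lock a file exclusively, then try again via a second open().

    Returns True if the second, non-blocking attempt made by the same
    process also succeeds, i.e. the first lock did not exclude it.
    """
    lock = fcntl.lockf if style == "posix" else fcntl.flock
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.lck")
        with open(path, "w") as first, open(path, "w") as second:
            lock(first, fcntl.LOCK_EX)
            try:
                lock(second, fcntl.LOCK_EX | fcntl.LOCK_NB)
                return True
            except OSError:
                return False
```

On a local Linux filesystem, `second_lock_succeeds("posix")` should return `True`, because `fcntl` locks belong to the process and a second request from the same process is simply granted, while `second_lock_succeeds("bsd")` should return `False`, because `flock` locks belong to the open file description. Results can differ on filesystems where `flock` is emulated with `fcntl` (NFS is the usual example), which matches the caveat in the commit message about BSD locks silently using `fcntl`.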