poetry: Poetry install fails for nested local dependencies and develop = false
- I am on the latest Poetry version.
- I have searched the issues of this repo and believe that this is not a duplicate.
- If an exception occurs when executing a command, I executed it again in debug mode (-vvv option).
- OS version and name: Standard Python:3.8 docker image
- Poetry version: 1.1.4
Issue
I am attempting to install a poetry based application with a monorepo-like hierarchy. For simplicity’s sake the structure looks something like:
/lib
    /pkg1
        pyproject.toml
    /pkg2
        pyproject.toml
/apps
    /app1
        pyproject.toml
The apps/app1/pyproject.toml is completely empty other than a reference to one of the libraries: pkg1 = { path = "../../lib/pkg1", develop = false }
pkg1 then has a similar local dependency on pkg2: pkg2 = { path = "../pkg2", develop = false }
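For anyone who wants to reproduce the layout without cloning the repo, here is a minimal sketch that writes the three projects into a temporary directory. The pyproject.toml template (names, versions, the build-system table) and the empty package directories are assumptions for illustration; the files in the linked repo may differ.

```python
import tempfile
from pathlib import Path

# Hypothetical minimal pyproject.toml; everything except the two path
# dependency lines quoted in the report is an assumption.
TEMPLATE = """\
[tool.poetry]
name = "{name}"
version = "0.1.0"
description = ""
authors = []

[tool.poetry.dependencies]
python = "^3.8"
{deps}
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
"""


def write_project(root: Path, rel_dir: str, name: str, deps: str = "") -> Path:
    """Create rel_dir under root with a pyproject.toml and an empty package."""
    project_dir = root / rel_dir
    (project_dir / name).mkdir(parents=True, exist_ok=True)
    (project_dir / name / "__init__.py").touch()
    pyproject = project_dir / "pyproject.toml"
    pyproject.write_text(TEMPLATE.format(name=name, deps=deps))
    return pyproject


def create_layout(root: Path) -> None:
    """Recreate the lib/pkg1, lib/pkg2, apps/app1 hierarchy from the report."""
    write_project(root, "lib/pkg2", "pkg2")
    write_project(
        root, "lib/pkg1", "pkg1",
        deps='pkg2 = { path = "../pkg2", develop = false }\n',
    )
    write_project(
        root, "apps/app1", "app1",
        deps='pkg1 = { path = "../../lib/pkg1", develop = false }\n',
    )


if __name__ == "__main__":
    layout_root = Path(tempfile.mkdtemp(prefix="poetry-localdep-"))
    create_layout(layout_root)
    for path in sorted(layout_root.rglob("pyproject.toml")):
        print(path.relative_to(layout_root))
```

Running poetry install inside the generated apps/app1 directory would then be the step under discussion.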
An example repo of this is here: https://github.com/TrevorPace/poetry-localdep-issue
When attempting to run poetry install in apps/app1 to create the initial poetry.lock file I get the following issues:
Updating dependencies
Resolving dependencies... (0.1s)
Writing lock file
Package operations: 2 installs, 0 updates, 0 removals
• Installing pkg2 (0.1.0 /home/trevor/git/trevor.pace/poetry-localdep-issue/libs/pkg2)
• Installing pkg1 (0.1.0 /home/trevor/git/trevor.pace/poetry-localdep-issue/libs/pkg1): Failed
EnvCommandError
Command ['/home/trevor/.cache/pypoetry/virtualenvs/app1-Jm016c7i-py3.8/bin/pip', 'install', '--no-deps', '-U', '/home/trevor/git/trevor.pace/poetry-localdep-issue/libs/pkg1'] errored with the following return code 1, and output:
Processing /home/trevor/git/trevor.pace/poetry-localdep-issue/libs/pkg1
Installing build dependencies: started
Installing build dependencies: finished with status 'done'
Getting requirements to build wheel: started
Getting requirements to build wheel: finished with status 'done'
Preparing wheel metadata: started
Preparing wheel metadata: finished with status 'error'
ERROR: Command errored out with exit status 1:
command: /home/trevor/.cache/pypoetry/virtualenvs/app1-Jm016c7i-py3.8/bin/python /home/trevor/.cache/pypoetry/virtualenvs/app1-Jm016c7i-py3.8/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /tmp/tmp8olvb_pd
cwd: /tmp/pip-req-build-845k8blz
Complete output (16 lines):
Traceback (most recent call last):
File "/home/trevor/.cache/pypoetry/virtualenvs/app1-Jm016c7i-py3.8/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py", line 280, in <module>
main()
File "/home/trevor/.cache/pypoetry/virtualenvs/app1-Jm016c7i-py3.8/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py", line 263, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
File "/home/trevor/.cache/pypoetry/virtualenvs/app1-Jm016c7i-py3.8/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py", line 133, in prepare_metadata_for_build_wheel
return hook(metadata_directory, config_settings)
File "/tmp/pip-build-env-q6c1rb7q/overlay/lib/python3.8/site-packages/poetry/core/masonry/api.py", line 34, in prepare_metadata_for_build_wheel
poetry = Factory().create_poetry(Path(".").resolve())
File "/tmp/pip-build-env-q6c1rb7q/overlay/lib/python3.8/site-packages/poetry/core/factory.py", line 91, in create_poetry
self.create_dependency(name, constraint, root_dir=package.root_dir)
File "/tmp/pip-build-env-q6c1rb7q/overlay/lib/python3.8/site-packages/poetry/core/factory.py", line 242, in create_dependency
dependency = DirectoryDependency(
File "/tmp/pip-build-env-q6c1rb7q/overlay/lib/python3.8/site-packages/poetry/core/packages/directory_dependency.py", line 36, in __init__
raise ValueError("Directory {} does not exist".format(self._path))
ValueError: Directory ../pkg2 does not exist
----------------------------------------
ERROR: Command errored out with exit status 1: /home/trevor/.cache/pypoetry/virtualenvs/app1-Jm016c7i-py3.8/bin/python /home/trevor/.cache/pypoetry/virtualenvs/app1-Jm016c7i-py3.8/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /tmp/tmp8olvb_pd Check the logs for full command output.
at ~/.local/lib/python3.8/site-packages/poetry/utils/env.py:1074 in _run
1070│ output = subprocess.check_output(
1071│ cmd, stderr=subprocess.STDOUT, **kwargs
1072│ )
1073│ except CalledProcessError as e:
→ 1074│ raise EnvCommandError(e, input=input_)
1075│
1076│ return decode(output)
1077│
1078│ def execute(self, bin, *args, **kwargs):
So, it appears that we are actually able to correctly traverse the package tree when determining what to build, but when pkg1 is actually being built it’s still generating something with a local reference in it. My initial thought is that it is somehow related to https://github.com/python-poetry/poetry/issues/3148.
About this issue
- State: closed
- Created 4 years ago
- Reactions: 14
- Comments: 21 (2 by maintainers)
Resurrecting this discussion as this bug is still current, and I suspect it may be related to #4051, and @sinoroc expresses some preference here which I think needs to be repudiated:
… etc.
Whether you like path dependencies or not is totally irrelevant to the bug in question. Poetry is supposed to be able to resolve dependency hierarchies recursively, and it should do so correctly for each kind of dependency it is designed to support.
In this instance, pkg2 is being installed correctly, but when pkg1 dependencies are being resolved at install time, poetry fails to synthesize a package name from the path dependency in pkg1’s pyproject.toml. The develop = false flag should instruct poetry to express the package dependency as a name instead of a path when feeding pip for pkg1 - that way pip would find the matching package amongst the installed packages, mark the dependency as satisfied, and continue as normal. This is not an intractable problem.
Poetry should be able to handle these kinds of dependencies - it’s central to the design goals of the tool. Breaking when develop = false isn’t acceptable; why take a boolean if only one value will work? With the path and the pyproject.toml, Poetry has enough information to correctly resolve the dependency and provide pip with the information it needs to successfully install pkg1.
You don’t understand path dependencies or don’t think they’re “nice”, or whatever. That doesn’t mean that this isn’t a bug. Spending paragraphs of italicized text explaining that you don’t understand something is distracting. You explain that you don’t even understand the “monorepo” use case - if you don’t understand the use case, and don’t understand the failure, then don’t make the issue difficult to approach by building walls of text on top of each other.
This bug affects Poetry 1.1.6.
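To make the "name instead of a path" idea above concrete, here is a rough sketch. It is not how Poetry implements anything; the PROJECTS mapping stands in for already-parsed pyproject.toml data, and flatten_path_dependencies is a hypothetical helper that swaps each path dependency for the name and version declared by the project it points at.

```python
import posixpath
from typing import Any, Dict

# Hypothetical, already-parsed [tool.poetry] data keyed by project directory.
PROJECTS: Dict[str, Dict[str, Any]] = {
    "lib/pkg2": {"name": "pkg2", "version": "0.1.0", "dependencies": {}},
    "lib/pkg1": {
        "name": "pkg1",
        "version": "0.1.0",
        "dependencies": {
            "python": "^3.8",
            "pkg2": {"path": "../pkg2", "develop": False},
        },
    },
}


def flatten_path_dependencies(
    project_dir: str, projects: Dict[str, Dict[str, Any]]
) -> Dict[str, Any]:
    """Replace path dependencies with the target's name and exact version."""
    flattened: Dict[str, Any] = {}
    for dep_name, spec in projects[project_dir]["dependencies"].items():
        if isinstance(spec, dict) and "path" in spec:
            # Resolve the relative path against the declaring project.
            target_dir = posixpath.normpath(
                posixpath.join(project_dir, spec["path"])
            )
            target = projects[target_dir]
            flattened[target["name"]] = target["version"]
        else:
            # Ordinary constraints pass through unchanged.
            flattened[dep_name] = spec
    return flattened


if __name__ == "__main__":
    print(flatten_path_dependencies("lib/pkg1", PROJECTS))
```

With metadata written this way, the built pkg1 would only mention pkg2 by name, so nothing would need to look up ../pkg2 from a build directory.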
To address your comments on the mono-repo: There are significant advantages for a mono-repo when it comes to closed-source development of microservice-based projects. When it comes to development you are able to literally change files in the dependent libraries on the fly without having to first build/push to a private repo. Having all of your components in one repo promotes quicker code sharing and removes the need for any sort of “manual” versioning of the various components. As a result, you simply deal with one repo and merging the branches from it. A refactor, or split of services would not require removing or adding other repos, but instead be completely isolated to the branch. When it comes to deployment you can basically use git describe tags to get a unique version identifier for all the artifacts in that specific commit. This guarantees that all of the individual services are functionally aligned; there is no thinking about “What version does service X have to be with service Y?”. Working with private repos is pretty painful when it comes to making CI/CD pipelines as well as docker images.
When it comes to python/poetry, ultimately what we are looking for is a way to do development locally (i.e. install into a virtual environment for a given service) in a way that local paths are used, but for an install (into something like a Docker container) we want to be creating actual packages for each component in the dependency tree and installing them into site-packages. NPM for example doesn’t allow local development via linked resources, but instead when you install a local dependency it copies it (and all other dependencies) into your source tree. That makes it easy for deployment, but for development you need to rerun the install step after making any changes to the local dependency services or libraries.
My general opinion is that the whole use of the “develop” flag within the actual pyproject.toml file is counter-productive. If anything we should have a flag passed when running “poetry install” like “--local-dev” or similar. This way without the flag all packages would be copied into the virtual environment correctly with all local path dependencies resolved to just package names (similar to npm install or what we need for docker file packaging), but with the flag the package links (egg-link I think) would be installed into the virtual environment instead.
Ultimately, what we are trying to do is stack a hierarchy of dependencies and either let them be there for development, or flatten them into standalone packages based on what we actually need.
I think this is actually the issue I described in another ticket: https://github.com/python-poetry/poetry/issues/2877. In our setup we are also using “develop = false” (I don’t recall why we use it, but there was a reason - I will check again).
I think path dependencies are very valuable, especially in monorepo setups (where relative paths are usually fixed). Moving the dependencies to the first level in such a setup defeats the purpose of dependency resolution and adds quite some manual overhead (we have ~ 20 poetry packages with approximately 3 to 4 levels of dependencies within the repository, this is already quite hard to keep up).
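To illustrate the manual overhead of moving every dependency to the first level: the list that has to be maintained by hand is the transitive closure of the path dependencies. A small sketch, assuming a hypothetical PATH_DEPS mapping from each project directory to the relative paths it declares (in practice this would be read from each pyproject.toml):

```python
import posixpath
from typing import Dict, List

# Hypothetical map: project directory -> relative paths of its path dependencies.
PATH_DEPS: Dict[str, List[str]] = {
    "apps/app1": ["../../lib/pkg1"],
    "lib/pkg1": ["../pkg2"],
    "lib/pkg2": [],
}


def transitive_path_deps(root: str, path_deps: Dict[str, List[str]]) -> List[str]:
    """Return every project directory reachable from root, in discovery order."""
    found: List[str] = []
    stack = [root]
    while stack:
        current = stack.pop()
        for rel in path_deps.get(current, []):
            target = posixpath.normpath(posixpath.join(current, rel))
            if target != root and target not in found:
                found.append(target)
                stack.append(target)
    return found


def top_level_paths(root: str, path_deps: Dict[str, List[str]]) -> List[str]:
    """Express each reachable project as a path relative to root."""
    # Anchoring both sides at "/" keeps relpath independent of the
    # current working directory.
    return [
        posixpath.relpath(posixpath.join("/", target), posixpath.join("/", root))
        for target in transitive_path_deps(root, path_deps)
    ]


if __name__ == "__main__":
    print(top_level_paths("apps/app1", PATH_DEPS))
```

Every entry returned by top_level_paths would have to be declared in apps/app1/pyproject.toml by hand, and kept in sync whenever a library gains or drops a path dependency.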
The current state is problematic as it is inconsistent: Relative path dependencies can be specified and there is no explicit warning/error and no documentation that they would not work transitively. Transitive dependency resolution is quite a strong guarantee I would expect from a solver & installation tool. Looking at the error that is shown right now it seems like there is a state that should be invalid, if the feature is not supported. But as a user of the tool, after reading the documentation, it is not clear to me why this should not be the case.
I understand that from an internal standpoint it might be a use case that is harder to support, but I think that should not be a reason to dismiss the use case (unless there is a way to reach the same result, but without the problem with explicit mentioning of transitive dependencies). Right now this is stopping the wider poetry adoption in my organisation and looking at the bug tracker this seems to be something that was already asked multiple times.
I would be happy to help out getting this to work with some advice on the implementation. Or help to make sure other people do not run into the same problem by enhancing the error message and providing documentation on the use case and workarounds.
Thanks for sharing your debugging experience. I did not follow exactly, but I am sure it will be helpful to the maintainers when they get time to visit this ticket.
Personal opinion:
I am not a fan of path dependencies at all. I see how they could be helpful in many cases, but I feel like they can actually work in very limited cases only. So as a general rule I would really recommend anyone to avoid them as much as possible.
This to me looks like the typical kind of features that are added for some very specific and straightforward use cases in mind, but for the creator of the feature it is hard to tell when it is going to break, so no explicit limitations are indicated in the documentation. But still there is a point where it is going to break (same story with editable installations).
In my mind, these features are here to help during development only, for example. But users somehow push it a bit too far out of the feature’s comfort zone, and it shows its limitations and breaks. In particular I would not expect path dependencies to work on 2nd degree dependencies, to me it seems like expecting too much out of this.
Basically I would limit path to direct dependencies. And in particular to scenarios such as this:
Just as a note, while the workaround worked until poetry==1.1.6, the fix for #4202 also broke the workaround, and now dependencies can also not be installed when using develop = true.
@sinoroc: Thanks for the pointers, I looked into them earlier, but I wasn’t sure they are fully overlapping.
@TrevorPace: We have exactly the same use case and use the mono repo (or rather mini mono repo, so splitting an application into smaller packages but in one repository) for the same advantage. To not further fuel the discussion: I think the value of such a setup depends heavily on your application/library and even more your infrastructure for CI/CD, deployment and even your organisation.
For CI/CD we build the packages locally and install the packages using pip with a requirements.txt built from poetry. So we don’t need the round trip via a (private) pypi just for tests. But for development we would like to have the nested path dependencies working (right now we track all transitive dependencies manually) so that we can do changes in multiple packages and have the changes instantly running or test them. I think what you proposed would be a perfect fit for our use case as well.
Using Poetry version 1.1.12 the issue still happens when you specify path dependencies with develop = false, however with develop = true it seems to work properly.
@malte-klemm-by I’m not a maintainer, but I will try to look at this again, see if I can pinpoint more accurately what is going wrong here.
Digression:
I believe the potentially surprising and/or unreliable behaviour becomes quite clear with more knowledge of Python’s packaging story in general (not just poetry). In my mind it is clear that things like path dependencies are not first class citizens at all. And from my point of view I also do not know why they should be. If I could I would probably get rid of them entirely.
I guess I see how path dependencies could be somewhat useful for things like “mono-repo”, although I do not have much knowledge about those (seen from afar I do not see the advantages of mono-repos). Anyway, I do not understand why mono-repos want to bypass the distribution phase. Mono-repo or not, my impression is that it should still be possible to build the individual projects and upload them to an index (PyPI or private repository), and thus get rid of path dependencies.
Also I must admit I do not really understand why people bother with building individual projects in a mono-repo. This all seems quite counter-productive, a weird mix of things, that result in doing many things that are against best practices. Why do you need dependency resolution in a mono-repo? Aren’t all dependencies supposed to be in the repo, even the external ones? All these things do not make sense to me. Why go for a radically different approach (mono-repo) and then complain that tools do not work with this approach? I always assumed that mono-repos used radically different tools. I am curious about these things.
There is a bit of work in progress that could potentially help with these kinds of project structure, maybe it is of interest to you:
So, after a bit of a deep dive in the code.
I believe what is happening is:
During dependency solving of app1, the pkg1 dependency is evaluated with Provider().get_package_from_directory(). This makes a call to PackageInfo().from_directory(), which essentially identifies pkg1 as a poetry package and ultimately adds its own dependencies into the full dependency solver. Pkg2 is defined as a non-develop entity by pkg1’s pyproject.toml, and thus is installed with pip install --no-deps <pkg2-path>. As pkg2 contains a pyproject.toml this causes pip to basically invoke poetry again, and it is correctly installed, as it has no local dependencies.
After pkg1’s dependencies are installed it’s time to install pkg1. Pkg1 (like pkg2) is identified as a directory package and is again installed with pip install --no-deps <pkg1-path>. This is where the problem is though. When poetry.core.masonry.api gets invoked it’s constructing a wheel, but the dependencies get evaluated again, and despite us using --no-deps it’s still evaluating whether they are valid when we are creating the wheel. Poetry can’t evaluate the path “../pkg2” because all of this is being done inside a temporary build directory. Ultimately, the Factory() created by poetry.core.masonry.api should somehow skip checking local dependencies again when creating wheels, and the wheel should contain only the package name, not the path.
In the case of development dependencies installed locally though this problem is avoided because a custom Builder is used in the PipInstaller which avoids the invocation with pip (and thus the isolated wheel generation step).
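The failure mode described above can be shown without Poetry or pip at all. This sketch assumes, in line with the cwd shown in the log (/tmp/pip-req-build-...), that the build happens on a copy of pkg1 in a temporary directory; path_dependency_exists is a hypothetical stand-in for the existence check that raises the ValueError in the traceback.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Tuple


def path_dependency_exists(project_dir: Path, relative_path: str) -> bool:
    """Resolve relative_path against project_dir and report whether it exists."""
    return (project_dir / relative_path).resolve().is_dir()


def demo() -> Tuple[bool, bool]:
    """Check ../pkg2 from pkg1 in place, then from an isolated copy of pkg1."""
    with tempfile.TemporaryDirectory() as repo_root, \
            tempfile.TemporaryDirectory() as build_root:
        pkg1 = Path(repo_root) / "lib" / "pkg1"
        pkg2 = Path(repo_root) / "lib" / "pkg2"
        pkg1.mkdir(parents=True)
        pkg2.mkdir(parents=True)

        # Inside the repository the sibling directory is found.
        in_place = path_dependency_exists(pkg1, "../pkg2")

        # Copy only pkg1 elsewhere, as an isolated build would see it.
        build_copy = Path(build_root) / "pip-req-build"
        shutil.copytree(pkg1, build_copy)
        in_build_dir = path_dependency_exists(build_copy, "../pkg2")

        return in_place, in_build_dir


if __name__ == "__main__":
    print(demo())  # (True, False)
```

The relative path is only meaningful next to its original siblings, which is why re-validating it during the wheel build fails even though pkg2 is already installed.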
Slightly off-topic, but I just want to add as well, that this whole develop = true/false thing seems like a bad idea. Are there actually development situations where we want to install local directories into our virtual environment and not just reference them?
Couldn’t we just always do a local install with our custom Builder, except when providing a build-flag to poetry during install:
poetry install --copy-local-deps
This way for development we do nothing, and for creating Docker files we would just add the flag and disable creation of virtual environments to have everything installed into the filesystem.
If we still wanted to decide to install packages or not we could use the optional flag accordingly.
I’m not sure about that, because when using “develop = true” it’s actually correctly computing the correct paths for pkg2 when creating the .pth files in the virtual environment. There must be actual code looking into the pyproject.toml of pkg1, resolving the dependencies and installing them into the virtual environment (or pkg2 would be neither installed nor correctly resolved). It seems like when it comes to actually creating the wheel or whatever intermediary it is passing to pip it isn’t correctly updating the dependency. Which is why it works fine for pkg2 (because it has no other local dependencies to resolve). It also works fine for the main application, because it doesn’t actually create that intermediary (just the app1.egg-info/).
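For readers unfamiliar with the .pth mechanism mentioned above, here is a small standard-library sketch of how a .pth file makes a source directory importable without copying it. The directory names and the pkg2_pth_demo module are made up for the demo, and the exact files Poetry writes for a develop install are not shown here.

```python
import importlib
import site
import tempfile
from pathlib import Path


def demo_pth_link() -> str:
    """Link a source directory into a fake site-packages via a .pth file."""
    workspace = Path(tempfile.mkdtemp(prefix="pth-demo-"))

    # A stand-in for the library's source tree, left where it is.
    source_dir = workspace / "lib" / "pkg2"
    source_dir.mkdir(parents=True)
    (source_dir / "pkg2_pth_demo.py").write_text('WHERE = "source tree"\n')

    # A stand-in for the virtualenv's site-packages holding only a .pth file
    # whose single line is the absolute path of the source directory.
    fake_site_packages = workspace / "site-packages"
    fake_site_packages.mkdir()
    (fake_site_packages / "pkg2.pth").write_text(f"{source_dir}\n")

    # addsitedir processes .pth files and appends their entries to sys.path.
    site.addsitedir(str(fake_site_packages))
    importlib.invalidate_caches()

    module = importlib.import_module("pkg2_pth_demo")
    return module.WHERE


if __name__ == "__main__":
    print(demo_pth_link())  # source tree
```

Because the .pth entry is an absolute path to the original location, no relative path ever has to be resolved from a temporary build directory in this mode.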
I’ll take a look at the poetry source to see what’s up.
OP provided a repro for the issue they were reporting, at https://github.com/TrevorPace/poetry-localdep-issue. That repro works just fine with poetry 1.3.2.
This issue should be closed: the problem that it reported has been fixed.
If others of you in this thread have different problems, please raise new issues.