poetry: Poetry doesn't try public pypi when private pypi included
- [ X ] I am on the latest Poetry version.
- [ X ] I have searched the issues of this repo and believe that this is not a duplicate.
- [ X ] If an exception occurs when executing a command, I executed it again in debug mode (-vvv option).
- macOS : 11.1
- Poetry version: 1.1.5
Issue
When using a private pypi repo, the public repo is no longer being checked. I can get everything working again by adding
[[tool.poetry.source]]
name = "pypi-public"
url = "https://pypi.org/simple/"
But this wasn’t needed in the past.
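For reference, a fuller sketch of this workaround is shown below. The private source name "private" and its URL are placeholders, not taken from the report; only the pypi-public entry comes from the snippet above.

```toml
# Hypothetical private index; name and URL are placeholders (assumption).
[[tool.poetry.source]]
name = "private"
url = "https://pypi.example.com/simple/"

# Workaround from the report: declare public PyPI explicitly as an extra source.
[[tool.poetry.source]]
name = "pypi-public"
url = "https://pypi.org/simple/"
```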
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 17
- Comments: 35 (17 by maintainers)
Commits related to this issue
- kill poetry because of https://github.com/python-poetry/poetry/issues/3855 — committed to rtimmons/mongo-etls by rtimmons 3 years ago
- Update repositories.md Clarifies default vs secondary (see discussion in #3855) — committed to jonapich/poetry by jonapich 2 years ago
- docs: improve documentation of repositories/package sources (#5605) Resolves some discussion in #3855 — committed to python-poetry/poetry by jonapich 2 years ago
I’m also finding this is an issue…
This issue isn’t reproducible on poetry@master. Closing.
Note that setting repository to default will disable default PyPI.
https://python-poetry.org/docs/master/repositories/#project-configuration
When you combine secondary = true and default = false, the poetry.lock does behave correctly, e.g.:
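A minimal sketch of such a configuration, assuming a hypothetical package name (my-internal-lib), an illustrative version constraint, and a placeholder index URL:

```toml
[tool.poetry.dependencies]
python = "^3.9"
# Hypothetical package that should come from the private index only.
my-internal-lib = { version = "^1.0", source = "internal" }

[[tool.poetry.source]]
name = "internal"
url = "https://pypi.example.com/simple/"  # placeholder URL (assumption)
default = false
secondary = true
```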
There’s no need to redefine the public pypi in this case. The poetry.lock correctly adds the private repo exclusively to the libraries I marked with source = "internal".
Our private registry is configured to redirect to pypi.org on missing packages. I think your test is flawed without a real repository, since both of your repositories a) contain all packages and b) provide amazing performance. In the real world, you would be hitting your custom server first, which is probably slower than pypi.org.
I just tested locking a huge project. Locking with our registry first took 4m40, but configuring it as non-default, secondary and targeting only the few relevant packages brought that down to 4m14 (that’s poetry lock --no-cache). In that test, there are 4 packages to fetch on our server, and 164 from pypi. There are also 2 git sources which slow down the whole process quite a bit. So overall, the difference here exists, but is clearly in the nice-to-have range.
The huge difference, though, is that we were able to slash down the size of our private registry server with this one easy trick 😅 When too many clients are doing a poetry install simultaneously (think a bunch of docker builds and automated tests kicking in parallel), the server occasionally spits out a 5xx during poetry install. We were able to fix this problem by using the default false, secondary true trick.
I improved (I think) the documentation in #5605 but I think some effort should be made to support this use case better. The options are simply misleading for anyone who didn’t take the time to carefully read that documentation section.
This is the dark side of the rules:
- source= at a dependency level has no effect (misleading)
- secondary=true has no effect (misleading)
- default=false has no effect (misleading)
- default=true has no effect (misleading)
- default=false and secondary=true finally make source= work as intended! (profit)
We have to think about the developer’s thought process here. When you add source= the first time to a dependency, there’s very little chance that you’ll know that default=false and secondary=true must be added. If the repo information is added without a good understanding of its documentation, you’re not just adding a dependency, you’re actually setting the private registry as default for all locking and install needs. Since GitHub hides the poetry.lock diff most of the time (it’s too large), a lot of devs won’t notice the addition of 100s of package.source entries to their lock file and will just go with it. It took us a couple of 5xx errors to understand what was going on…
I would say that if the user provided source= information, the plan is to use the private registry as little as possible. When the user sets the source=, poetry should use it only when requested. If no package has a source=, then the repository information is most likely meant to be used as much as possible.
Simply don’t. If the user wants a transitive dependency to use the private registry, it can be added to the dependencies with the source= specified 🤷🏻‍♂️ That’s how someone could use e.g. a forked version of urllib3 even though only requests was needed. urllib3 would be pushed to the private registry, and urllib3 and source= would be added to the pyproject file. It feels wrong that the custom urllib3 will maybe be used by everyone in the company who didn’t think about this and set the private registry as default by mistake.
I think that setting a registry as the first one to be checked should be the “you need to specify an option” way, and using the registry only when source= targets it really should be the default scenario.
Has the situation significantly improved since the issue was filed? My recollection is it made a difference of one to two orders of magnitude in seconds for my project.
Ah, you might be right that it’s caused by the recursive dependencies. It’s been a while since I tested this so I don’t remember exactly, but I did see lookups in the private repo when setting everything to source pypi.
I’m affected by this. Tried in 1.2.0a2 for good measure… At first glance it looks fixed: poetry.lock no longer updates with the incorrect default repository.
However, it’s apparent that poetry still checks the non-default repository for every package even if a valid package has been found on the default one. This might be intended, but it also slows operations down significantly. Common scenario: App depends on a handful of packages in a private repository, and a bunch of packages in pypi. A quick poetry update -vvv will reveal what’s going on. In my case I’m using a private gitlab pypi repo and:
This makes poetry update a 35 second operation on a warm cache for my project. If I remove the custom repository, same project, it completes in under 1 second.
You’re right, only secondary=true is needed. I think that was maybe an old bug, or just a manipulation error when I played around with this months ago.
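Under that correction, the source entry reduces to the sketch below. The URL is a placeholder, and default is simply left unset, which Poetry 1.x treats as false:

```toml
[[tool.poetry.source]]
name = "internal"
url = "https://pypi.example.com/simple/"  # placeholder URL (assumption)
secondary = true  # PyPI stays the default index; this source is consulted in addition
```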
The same problem occurs when you need to specifically pin a version of a transitive dependency because reasons, no need to involve private registries to fall into this trap. I can’t vouch for everyone’s best practices, but if you need to add such an edge case to your pyproject, you comment it as such so that everyone knows what it’s about.
In fact, the same problem occurs if someone adds dependency A for new python code, then someone alters the code later and removes the usage. Unless you actively search the code base for more usages of some random import you just removed, you’re going to be left with one unused library. My opinion is that it’s a non-issue / user-error, this isn’t something poetry should be concerned about.
👍🏻
Poetry 1.1.12 and Windows, python 3.9. I just tested it again:
Given this:
The lock doesn’t contain any repository information. But once I do this:
Then the lock contains the repository information for the requests package exclusively.
Once I change to this:
Then suddenly every single package in the lock contains my repository information (this seems to be a bug!).
This seems to work though:
With the above, I don’t see any repo information. When I add source = "internal" then I get the same result as the first example: the repo information is added to the requests package and the other packages don’t have any repo information.
Will there be a fix supplied soon or some workaround? I am pretty stuck with trying to lock my deps with private repo and pypi.