setuptools: package data in subdirectory causes warning
setuptools version
62.3.2
Python version
3.10
OS
Debian with conda
Additional environment information
No response
Description
pyopencl has OpenCL files and some headers in a subdirectory pyopencl/cl
and they are included as package_data
so that the python module can find them.
    package_data={
        "pyopencl": [
            "cl/*.cl",
            "cl/*.h",
            "cl/pyopencl-random123/*.cl",
            "cl/pyopencl-random123/*.h",
        ]
    },
With new setuptools, there is a warning saying
############################
# Package would be ignored #
############################
Python recognizes 'pyopencl.cl' as an importable package, however it is
included in the distribution as "data".
This behavior is likely to change in future versions of setuptools (and
therefore is considered deprecated).
Please make sure that 'pyopencl.cl' is included as a package by using
setuptools' `packages` configuration field or the proper discovery methods
(for example by using `find_namespace_packages(...)`/`find_namespace:`
instead of `find_packages(...)`/`find:`).
You can read more about "package discovery" and "data files" on setuptools
documentation page.
cc @inducer
Expected behavior
No warning
How to Reproduce
- clone https://github.com/inducer/pyopencl
- install numpy
- Run
python setup.py install
Output
$ python setup.py install
running install
/home/idf2/miniforge3/lib/python3.10/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
/home/idf2/miniforge3/lib/python3.10/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
running bdist_egg
running egg_info
writing pyopencl.egg-info/PKG-INFO
writing dependency_links to pyopencl.egg-info/dependency_links.txt
writing requirements to pyopencl.egg-info/requires.txt
writing top-level names to pyopencl.egg-info/top_level.txt
reading manifest file 'pyopencl.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'pyopencl.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
/home/idf2/miniforge3/lib/python3.10/site-packages/setuptools/command/build_py.py:153: SetuptoolsDeprecationWarning: Installing 'pyopencl.cl' as data is deprecated, please list it in `packages`.
!!
############################
# Package would be ignored #
############################
Python recognizes 'pyopencl.cl' as an importable package, however it is
included in the distribution as "data".
This behavior is likely to change in future versions of setuptools (and
therefore is considered deprecated).
Please make sure that 'pyopencl.cl' is included as a package by using
setuptools' `packages` configuration field or the proper discovery methods
(for example by using `find_namespace_packages(...)`/`find_namespace:`
instead of `find_packages(...)`/`find:`).
You can read more about "package discovery" and "data files" on setuptools
documentation page.
!!
check.warn(importable)
running build_ext
About this issue
- State: open
- Created 2 years ago
- Reactions: 2
- Comments: 54 (32 by maintainers)
Commits related to this issue
- Use find_namespace: instead of find: See <https://github.com/pypa/setuptools/issues/3340> — committed to wheelodex/wheelodex by jwodder 2 years ago
- Use find_namespace: instead of find: See <https://github.com/pypa/setuptools/issues/3340> — committed to jwodder/pyrepo by jwodder 2 years ago
- v2022.6.20 — Date releases, versioningit, et alii - Give `pyrepo release` a `--date` flag - Give `pyrepo begin-dev` a `--no-next-version` flag - `pyrepo release`: Don't upload projects with "Private"... — committed to jwodder/pyrepo by jwodder 2 years ago
- Include data packages explicitly. https://github.com/pypa/setuptools/issues/3340#issuecomment-1158859089 — committed to lemon24/reader by lemon24 2 years ago
- chore: Fix static file package data includes in setup.py This was beginning to show deprecation warnings in setuptools. See here: https://github.com/pypa/setuptools/issues/3340 — committed to globus/django-globus-portal-framework by NickolausDS a year ago
- chore: cleanup our setup.py First, use find_namespace_packages instead of find_packages because it includes directories containing only data files (non-python files) as "packages". We were seeing de... — committed to rb-determined-ai/determined by rb-determined-ai a year ago
- chore: cleanup our setup.py (#6602) First, use find_namespace_packages instead of find_packages because it includes directories containing only data files (non-python files) as "packages". We were... — committed to determined-ai/determined by rb-determined-ai a year ago
- chore: cleanup our setup.py (#6602) First, use find_namespace_packages instead of find_packages because it includes directories containing only data files (non-python files) as "packages". We were... — committed to jerryharrow/determined by rb-determined-ai a year ago
- Resolve pypa/setuptools#3340. This commit resolves upstream issue pypa/setuptools#3340 in which `setuptools` fundamentally breaks backward compatibility with the entire ecosystem of existing `setupto... — committed to betsee/betse by leycec a year ago
Hi @leycec, I understand you’re frustrated with the changes here, but please try to be more respectful and considerate in your communication in the future.
As a project of the PyPA, everyone in this issue tracker is expected to follow the Python Community Code of Conduct. The expectation is that everyone interacting here should be courteous when raising issues and disagreements.
Specifically, calling the setuptools maintainers “insane lunatics” is unacceptable: it’s not constructive, it’s not welcoming or inclusive, and it easily qualifies as harassment.
Additionally, your previous comment in this issue tracker (“Heads need rolling (especially those currently attached to the still-functioning torsos of managerial project leads)”) is a clear example of violent language directed against another person, and is also unacceptable.
In short: we don’t do this here. You are welcome to continue participating in this project, but if you continue to violate the code of conduct here, you will no longer be permitted to participate.
Insane lunatics who are publicly disembowelling `setuptools` while distraught children are franticly screaming, please stop publicly disembowelling `setuptools` while distraught children are franticly screaming.

This means you, @abravalheri – and everyone else with `setuptools` push authority who mistakenly believed that Masaki Kobayashi’s seminal masterpiece “Harakiri” was not, in fact, a brutal dissection of authoritarian militancy under entrenched hierarchy but instead an exemplary paragon of software best practices in the enlightened post-Python 2.7 era.

So, @leycec. Bro. What’s The Big Deal, Yo?

Saliently, replacing `find_packages()` with `find_namespace_packages()` is not a valid solution for most projects. Why? Because `find_namespace_packages()` erroneously matches all repository subdirectories[^1] in the root repository as importable packages.

[^1]: …possibly excluding dot directories, not like it particularly matters. </shrug>

Clearly, most repository subdirectories in the root repository are not importable packages; they’re superfluous workflow subdirectories like `{package_name}.egg-info/`, `build/`, `dist/`, `docs/`, `pip-wheel-metadata/`, and the list just goes on and on. Moreover, the set of these subdirectories significantly changes over time and is almost entirely outside the control of downstream developers. Explicitly listing these subdirectories in the `exclude` parameter to `find_namespace_packages()` is thus infeasible. Therefore, replacing `find_packages()` with `find_namespace_packages()` is not a valid solution for most projects.

So, @leycec. Bro. How Did You Fix This?

Thanks. I’m so glad you asked. The obvious solution is just to abandon `setuptools`. That’s what everybody else has done. But I’m curmudgeonly. I have a grubby beard and live in a mildew-infested cabin in the Canadian wilderness. People like me are disinclined to do what we should. Instead, I did what I shouldn’t.

I continue using `setuptools` despite its repeated outbursts of insanity. In this case, I compelled `setuptools` to obey my perfidious will via ludicrous boilerplate which I will now copy-and-paste into every Python project I maintain – much to the shared agony of junior developers and my wife, who must now suffer my delusions in silence. This is that boilerplate:

I didn’t make insanity. I only break it over my arthritic knee.

So, @leycec. Bro. Could You Like Stop Talking?

The party ends abruptly when @leycec walks through the door. The silence is deafening. I’m pretty sure the silence gave me tinnitus. Since everyone fled, I’ll say one last thing to the empty room:
From my perspective, I have specified `package_data={'my_package': ['data_folder/*.*']}`; the intention is clear that I consider it a folder of data files of `my_package`, and I don’t really care that Python considers it a namespace package. It would be really nice if setuptools added the namespace packages for me instead of giving a warning.

The above discussion covers how to go about building a correct distribution without using deprecated `setuptools` functionality. Thanks all for that!

My question here is more philosophical than practical. It seems like `setup(packages=...)` and `MANIFEST.in` are at least partially overlapping in functionality. In our `MANIFEST.in` we specify exactly what files and directories to include in distribution bundles. It seems redundant to have to specify similar information (the directories part) via the `packages=` configuration. Couldn’t that list of packages theoretically be inferred from the `MANIFEST.in` contents?

I’m faced with the inverse situation: my `packages=` and `MANIFEST.in` are well defined, with exactly the files I want included in my wheel. There are files within the package that I don’t want to see included (e.g. tests, sass files, non-minified js, etc.). Now, setuptools adds them to the wheel, with no “Stop doing this, I know what I’m doing” flag that I can see. Am I missing something?

I was following this issue because I was very confused (as a newbie) by the warning that came up. Can I suggest a further edit to the warning message:

Currently {importable!r} is only added to the distribution because it may contain data files… –> Currently {importable!r} has been automatically added to the distribution because it may contain data files…

This is my understanding of what is actually happening (i.e. automatic inclusion). If this suggested change is wrong, then I’m still not following how this works…
Hi @shakfu, thank you very much for sharing your thoughts. Please see my comments below:

Please note that the “directory is a package” behaviour introduced in PEP 420 is actually quite old. The PEP was approved on 19/Apr/2012 and the specified behaviour was implemented in Python 3.3, over 10 years ago, which is a lot in “software development years”. It is safe to assume that this behaviour is stable and that at some point in their careers Python developers will indeed learn what a Python package means and how adding a directory nested somewhere under one of the entries of `sys.path` effectively creates a Python package that can be imported like any other package via an `import` statement.

Could you please clarify in what sense the solutions presented in the warning message are insecure? The error message presents to the user a couple of suggestions (to be chosen according to what best fits their use case), which include manually adding the missing entries to the `packages` configuration option or using a convenience function provided by setuptools. In both cases, the procedure is very mature and stable.

I understand that this is a popular interpretation of what the concept of packages (and/or namespace packages) might mean for Python and I see where it comes from. But I don’t think this interpretation is backed by the Python implementation and the way it works…

You can `import` directories that don’t contain `.py` files, and having packages for holding non-Python files is actually a very useful feature[^2]! I never found official documentation saying that packages/namespace packages are meant to contain importable executable code and cannot be used to contain only non-Python files; I don’t think there is an official stance on that.

[^2]: They make it really easy to find non-Python files at runtime using `importlib.resources`. You can also implement extensions/plugin systems on top of it, etc.

If these non-Python files are not meant to be installed on the end user’s machine, I believe it is a matter of properly configuring `packages`/`package_data`/`include_package_data`/`exclude_package_data`/`MANIFEST.in` so that they are part of the `sdist` but not part of the `wheel`. Otherwise, if they end up nested somewhere under an entry of `sys.path`, they will be import packages, effectively.

If you really don’t like the idea of having these directories as importable packages, then the alternative is to use `data-files`, which will translate into a special directory in the wheel file (`{name}-{version}.data/data/`). In turn, `pip` will install them in a different location that will not be nested somewhere in `sys.path`. It is a lot of effort to align expectation and implementation; for most people it might just be worth adapting their expectations.

I think that simply dropping the warning would be unwise. The purpose of the warning is for developers to align their configuration to their expectations. If they want certain directories to be installed somewhere under `sys.path`, this effectively means that they are asking setuptools to include certain packages/subpackages in the wheel. Within setuptools, that desire is captured by the `packages` configuration option.

We need users to start clarifying their configuration, because the next step is to fix other related bugs (see https://github.com/pypa/setuptools/issues/3340#issuecomment-1219321087), and it would be bad if suddenly some folders were missing from packages.
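As a rough illustration of the first suggestion (listing the missing entries in `packages` by hand), the sketch below builds that list by walking a single top-level package directory, so sibling directories such as `build/` or `docs/` are never considered. `list_packages` and the directory names are hypothetical and not part of setuptools; the sketch assumes every nested directory with an identifier-like name should ship as a subpackage.

```python
from pathlib import Path


def list_packages(project_root, top_level):
    """Return dotted names for ``top_level`` and the directories nested under it.

    Hypothetical helper, not a setuptools API. Directories whose names are not
    valid identifiers, and ``__pycache__``, are skipped along with everything
    beneath them.
    """
    root = Path(project_root) / top_level
    packages = [top_level]

    def walk(directory, prefix):
        # Sorted iteration keeps the resulting list deterministic.
        for child in sorted(directory.iterdir()):
            if not child.is_dir():
                continue
            if child.name == "__pycache__" or not child.name.isidentifier():
                continue
            dotted = f"{prefix}.{child.name}"
            packages.append(dotted)
            walk(child, dotted)

    walk(root, top_level)
    return packages
```

The result could then be passed as `packages=list_packages(".", "my_package")` in `setup.py`, next to the existing `package_data` patterns.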
I don’t know of anyone currently attempting to introduce a data / code distinction (and what that will mean for Python packages and the import system) via a new PEP. Personally, I like the status quo and I think it works quite well. Since there are no concrete plans for such a change in the ecosystem, there is no foreseeable risk of conflation, and we don’t need to treat this situation as “interim”. As far as we know this is the stable behaviour that we should be targeting after 10 years of transition.

There is another approach[^4] that I have absolutely no problem considering and actually would welcome with open arms: if a member of the community is willing to contribute (i.e. design, discuss, find consensus, implement, document, fix, support …) a different way of configuring setuptools that is more conceptually self-evident and less prone to ambiguity than `packages`/`package_data`/`include_package_data`/`exclude_package_data`/`MANIFEST.in`[^3]. Extra requirements for such a solution are backward compatibility and easy maintenance.

It is a tough challenge which I don’t have the resources to tackle myself, but I would be very grateful if someone else can.

[^3]: In some sense, the automatic discovery (when the user does not specify `packages`) that was introduced a couple of years ago is meant to be easier and less confusing. But automatic discovery is not a fit for all, and edge cases still require playing with `packages`/`package_data`/`include_package_data`/`exclude_package_data`.

[^4]: But that is orthogonal to the warning and next steps discussed here.
hey there -
why does this warning refer to find_namespace when the current documentation indicates that find_namespace is only for namespace packages ? can this documentation please be updated to indicate it now has another use case for packages that are explicitly not namespace packages also ?
should read something like:
or otherwise can some document please be added that exactly explains the situation the warning is detecting and how we are to treat this situation as a “namespace package”. Projects with datafiles are ubiquitous and it’s not reasonable to launch a new warning that refers to off-label use of some obscure feature of setuptools in a vague way as how to resolve.
I have been stumped for hours by this issue. I just want to add data folders to my package, and there are no clear instructions for doing this. How can we specify in a pyproject.toml which folders we want to recursively include (and especially how to specify a nested folder from where to start looking)? The documentation does not provide any example for nested structures.
EDIT: Also this issue would be a potential solution if implemented: https://github.com/pypa/setuptools/issues/3341
EDIT2: For example, here is my repository producing (lots of) warnings: https://github.com/lrq3000/pyFileFixity/tree/ea447c548c9b736ea3c2c76bafa61bf1b51af4ca
And YES I can suppress the warnings with auto discovery by removing all the content of `tool.setuptools.packages.find`, but I do not want to rely on a beta feature! I want to manually specify my project’s structure; I prefer to know what I am doing and to be explicit. Especially for something as crucial as packaging, it needs to be very deterministic and future-proof.

I have spent a few minutes trying a `src` layout but I can’t get an editable `pip` install to work.

edit:

Something like this appears to work for me. Where a typical project is structured like:

It’s not really clear to me from the documentation what `setuptools` actually wants. I think where this software fails is in making it clearer what the nominal package structure is. It’s nice that you can kind of do whatever you want and make it work but most of us shipping software out into the wild would rather just conform to something that “just works” and move on. I’m not really seeing that solution emerge out of the discussion or the documentation.

@milesgranger you are correct. This is a bug (https://github.com/pypa/setuptools/issues/3260).

The problem is that we cannot resolve this bug without first deprecating and removing the behaviour described in this issue (you can see that there are a lot of people depending on it yet…).

I suppose you can have a workaround by one of the following:

- Set `exclude_package_data` to remove all files in the tests folder, or
- Set `include_package_data=False` and add `package_data` with more specific file patterns.

Sorry for the trouble; if we change things right now, several projects in the ecosystem might break (so we have to go through the deprecation period).
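The two workarounds listed above might look roughly like the keyword arguments below. This is only a sketch: the package name `mypkg` and the glob patterns are placeholders, patterns are assumed to be relative to the package directory, and the right key and pattern depend on how the project lays out its tests and assets.

```python
# Hypothetical keyword arguments for setuptools.setup(); nothing is built here.

# Option 1: keep include_package_data, but filter out files that should not ship.
option_exclude = {
    "include_package_data": True,
    "exclude_package_data": {"mypkg": ["tests/*"]},  # placeholder pattern
}

# Option 2: turn off include_package_data and list the wanted files explicitly.
option_explicit = {
    "include_package_data": False,
    "package_data": {"mypkg": ["static/*.min.js"]},  # placeholder pattern
}
```

Either dictionary would be unpacked into the call, for example `setup(name="mypkg", packages=[...], **option_explicit)`.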
@abravalheri I stumbled on a bunch of build warnings buried in CI logs that I was reviewing randomly and I am puzzled… can you articulate what end-user benefit you expect from this change (e.g. for package maintainers that rely on setuptools)?

Personally I do not think such a warning can be easily seen. My wheels contain thousands of files and the warnings are just drowned in CI log files that are never looked at unless the build fails. So I am not convinced that this warning would have much effect.

You wrote:

IMHO the current behaviour is the de facto way that package maintainers understand and have grown to rely on, e.g. when you “include_package_data”, anything (file or dir) in the tree of included packages is included.

What is the Python behaviour there beyond the fact that files in the package tree are accessible? I could not find anything about data files or data directories mentioned in PEP 420.

Now, the proposed future behaviour does not seem entirely consistent: when there are data files in a directory with Python code (either a legacy init-style or “namespace” package) these are included, but data files in a subdirectory of the same would not be included, e.g., some data files would need an intervention and some data files would not? Unless a subdir of a package dir is not a Python identifier (e.g. with a dash as in “foo-bar”), and then this is included without warning.

So if I understand the to-be behaviour correctly based on deprecation messages this would mean this (assuming in all cases that `include_package_data` is True):

`__init__.py`) or a declaration such that are treated as namespace packages.

I am not sure that this would contribute to a better and consistent user experience.
Hi @mhkline. The warning is not about the files themselves, but about the directory. Right now there is no concept of a “data directory” for the package ecosystem.

Since PEP 420, effectively all directories are packages regardless of containing a `__init__.py` file or not. With this warning, my intention is to align the expectations of the users with the behaviour we observe in Python.

If you want the directory to be included in the distribution, you can include it via the `packages=` configuration. `find_namespace_packages()` in `setup.py` or `find_namespace:` in `setup.cfg` will do that for you, and probably make the warning go away.

I’m facing the same problem, and I’m not clear on the nature of the change suggested by @abravalheri. Are you saying that directories in the package hierarchy that contain only data files and no Python code should be included in the project’s list of packages despite not actually being Python packages?