pytest: ModuleNotFoundError when using `--doctest-modules`, a `src/` layout, and `--import-mode=importlib` (and no editable mode install)

reproducer

We’d like to run our tests on an installed package instead of the development version, as we delete some files in our build process. We therefore don’t use editable installs.

I think the problem is that there seems to be no way to configure the import root used with the importlib import mode. The project rootdir gets passed here instead of something user configurable:

https://github.com/pytest-dev/pytest/blob/c34eaaaa1c485e9c8c3d1fe7f05549b6436f49cc/src/_pytest/doctest.py#L545-L549

Which means that the following code will run module_name = module_name_from_path('<rootdir>/src/anndata/_types.py', '<rootdir>'), i.e. module_name = 'src.anndata._types' when it should be 'anndata._types'

https://github.com/pytest-dev/pytest/blob/9f3bdac1500143c51ab358d07eb58b4cfa6d3fdf/src/_pytest/pathlib.py#L524-L525

Without src/ layout, this works accidentally, as the module_name happens to match the real module name, and the module only gets imported once.

Pytest 7.4.2

About this issue

  • Original URL
  • State: closed
  • Created 9 months ago
  • Reactions: 1
  • Comments: 29 (27 by maintainers)

Commits related to this issue

Most upvoted comments

Just to provide another data point, I was intrigued by this comment:

The importlib mode was intended to solve avoid changing sys.path/“ImportMismatch” errors, but particularly if I knew all the problems/details it involves, I would just recommend other solutions:

  1. pytest changes sys.path and it is testing the local copy? Switch to src layout.
  2. Test module names with the same? Add __init__.py files in your tests turning them into packages.

For what it’s worth, I hadn’t realized that importlib was no longer being recommended by default. I had been using --import-mode=importlib because I thought it was the recommended mode going forward, but it sounds like that’s no longer the case. It looks like the Good Integration Practices documentation page still (as of 8.0.x) states that:

For new projects, we recommend to use importlib import mode (see which-import-mode for a detailed explanation). To this end, add the following to your pyproject.toml:

Is that no longer the best recommendation, even for new projects?

For what it’s worth, simply removing --import-mode=importlib seems to fix the problem for me, though there was one little weirdness that I thought I’d mention here, related to the pythonpath setting and whether we are or are not using editable installs.

--import-mode pythonpath --doctest-modules editable install non-doctests pass?
prepend (omitted) no no no
prepend (omitted) no yes yes
prepend (omitted) yes no yes
prepend (omitted) yes yes yes
prepend src no no yes
prepend src no yes yes
prepend src yes no yes
prepend src yes yes yes
importlib (omitted) no no no
importlib (omitted) no yes yes
importlib (omitted) yes no no
importlib (omitted) yes yes yes
importlib src no no yes
importlib src no yes yes
importlib src yes no no
importlib src yes yes yes

What I mean by the last column, “non-doctests pass?”, is that when I write a “no”, it means that the non-doctest tests fail with import errors.

Edit: The version of pytest that I am currently using is 7.3.1, so the data in the table reflects that version.

To summarize the table, it looks like editable and non-editable installs behave very differently depending on the values of --import-mode, pythonpath, and presence of --doctest-modules. For some reason, when in prepend mode, if you don’t set pythonpath to src and if you don’t pass in --doctest-modules, then all the non-doctest tests will fail in editable installs, but if pass in --doctest-modules without setting src, then the non-doctest tests will pass.

Does the presence of --doctest-modules cause you to not need to specify pythonpath = "src" for some reason?

I’m asking because I’m managing multiple Python projects, some with doctests and others without any doctests, and I’d like to keep their pytest configurations as close as possible, ideally only passing in --doctest-modules in the projects that actually have doctests written. I found it strange that in projects without doctests, going back to prepend mode caused the non-doctest tests to all fail, but in projects with doctests, they ran fine, and I determined that the --doctest-modules option was affecting how non-doctest tests were being imported.

For my use case, I think I can standardize on prepend mode, explicitly setting pythonpath = src, and optionally adding or omitting --doctest-modules depending on the project, and it seems to work and have the expected behavior for both editable and non-editable installs.

BTW, thanks for the really detailed discussion in this ticket. I’ve been following this ticket mostly as an observer, and it’s been helping me understand how the various settings interact with each other.

I plan to spend more time this week on this, and solve this once and for all (with your fix or something else).

I don‘t think we should even consider sacrificing correctness before having tried out all other sensible avenues.

Oh I agree 100%, I didn’t mean to imply that we were not applying your fix based solely on any performance problems that we have a gut feeling about – I am just thinking there might be some other solution.

One such solution that occurred to me, and I plan to test, is that when using importlib, we first try to import the module normally, without altering sys.path. If the module ends up being importable, then we just use it; if not, then we fallback with the current importlib shenaningans. If I’m right, this would probably be the simplest and a good solution still aligned with importlib’s goal: not changing sys.path and allow duplicated module names.

I’m super grateful that you’re working on getting this resolved.

I’m certain once this bug is fixed, this will make using pytest a blast in all kinds of (actual and not-so-)corner cases.

I’ll test the fix with our projects, which should be complex-enough real-world examples to prove that it works in a lot of cases.

E.g. one easy reproducer is

git clone https://github.com/scverse/scanpy.git
cd scanpy
git checkout 14555ba48537995acaa381b8b6ad5fc41e612510  # current main
python -m venv .venv
.venv/bin/pip install .[test] pytest==8.1.0
.venv/bin/pytest scanpy/_utils/compute/is_constant.py

Argh, really sorry to hear that @flying-sheep. I did test the fix in your reproducible example before merging, but seems there is still something to adjust.

Looking at the code I suspect there might be a bug there indeed, seems we should be checking if module_name already is in sys.modules before attempting to import it here:

https://github.com/pytest-dev/pytest/blob/71849cc05c4fffe2267a6844393be3adb8248820/src/_pytest/pathlib.py#L542-L544

Like it is being done here:

https://github.com/pytest-dev/pytest/blob/71849cc05c4fffe2267a6844393be3adb8248820/src/_pytest/pathlib.py#L550-L552

I will check this tonight.

Sorry again for the breakage. 😕

Proposing a definite solution (IMHO) in #11997. 👍

I’m confused, I thought importlib’s point is that it doesn’t alter sys.path?

Yep, what I mean is just try to import the module, without changing sys.path: in case we are trying to import a normal module which is installed in the virtualenv, this would just work.

Another possibility is to just be able to make PyTest aware of the module root(s) the project has, e.g. src/. Then that would be used for imports instead of the project root directory.

That would be another solution too.

that when using importlib, we first try to import the module normally, without altering sys.path.

I’m confused, I thought importlib’s point is that it doesn’t alter sys.path?

I plan to spend more time this week on this, and solve this once and for all (with your fix or something else).

Another possibility is to just be able to make PyTest aware of the module root(s) the project has, e.g. src/. Then that would be used for imports instead of the project root directory.

I don‘t think it invalidates it in the slightest before someone has measured the impact 😉

Also keep in mind that I outlined some strategies to make it faster.

I don‘t think we should even consider sacrificing correctness before having tried out all other sensible avenues. Importing modules twice can lead to very subtle and hard to debug breakage, so I feel like everyone waiting 5 seconds more for the test suite would be an amazing trade-off compared to a handful of people debugging some weird double-import bug for 3 hours.

I will double check, but @bluetech’s concern is valid: search through sys.path entries can be very costly, specially in environment configurations where modules are reachable by configuring PYTHONPATH – at work we use this, and our sys.path entry will contain dozens of entries.

It does not invalidate it completely of course, but it is something we should consider.

As I said, I will double check it later. 👍

Sorry for the long post, spent the last 3 hours working on this and I’m afraid there’s no simple “fix” for this case.

The short answer is that pytest uses --import-mode=importlib for all imports, not only tests and conftests as originally intended.

The introduction of importlib mode was meant to give an alternative to the other modes so that:

  1. pytest does not need change sys.path to import tests and conftest files.
  2. Different test modules can have the same base name, without __init__.py files.

(Full explanation in the docs).

On a non-src layout:

pkg/__init__.py
tests/
  test_foo.py

One of the reasons why changing sys.path is problematic in this case is because it will add . to the root directory to import the tests under tests, which has the side effect of making pkg also importable. This can hide problems when trying to test the installed package in .venv, because now we are actually testing the local sources as they are reachable through sys.path during testing, instead of the installed packages in .venv.

On src layouts:

src/pkg/__init__.py
tests/
  test_foo.py

This is not a problem: adding . to sys.path does not make pkg importable, so import pkg will import from the .venv as desired.

The reproducer repository is using importlib mode, I’m guessing because of reason 2 (duplicated test names), or perhaps because at one point we recommended using importlib intending to turn it into the default, however since then we realized this is not possible, as it brings it has its own drawbacks.

@flying-sheep, by changing your configuration slightly I was able to make your reproduce run without problems in pytest main and in 7.4.4:

  • Exclude conftest.py files from being installed: they should be imported by pytest directly.
  • Drop importlib, as you are using src layout. If the problem is due to duplicated test names, just adding __init__.py on the folders with the duplicated tests solves it.
λ pytest
======================== test session starts ========================
platform win32 -- Python 3.11.5, pytest-7.4.4, pluggy-1.4.0
rootdir: e:\projects\pytest-doctest-import-mismatch
configfile: pyproject.toml
testpaths: anndata, ./src/anndata/tests
collected 6 items

.env311\Lib\site-packages\anndata\utils.py .                   [ 16%]
.env311\Lib\site-packages\anndata\_core\anndata.py ..          [ 50%]
.env311\Lib\site-packages\anndata\_core\merge.py ..            [ 83%]
src\anndata\tests\test_base.py .                               [100%]

========================= warnings summary ==========================

Here’s the diff:

diff --git a/pyproject.toml b/pyproject.toml
index 81389c5..d15f40a 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -21,12 +21,12 @@ dependencies = [
 
 [tool.hatch.build]
 exclude = [
+    'src/anndata/tests/conftest.py',
     'src/anndata/tests/test_*.py',
 ]
 
 [tool.pytest.ini_options]
 addopts = [
-    '--import-mode=importlib',
     '--doctest-modules',
     '--pyargs',
 ]

Follows is a long description of the overall problem.


The introduction of importlib mode was meant to give an alternative to the other modes so that:

  1. pytest does not need change sys.path to import tests and conftest files.
  2. Different test modules can have the same base name, without __init__.py files.

(Full explanation in the docs).

One important thing in that description is *test modules and conftests.

Before we introduced the importlib mode, all modules directly imported by pytest used the same function, import_path, which changes sys.path to import a module in case it is not already importable. “All modules” in the majority means tests and conftests (as user application code is imported indirectly through these), but it also includes user application code in some cases, doctests being one case relevant here.

The importlib mode works by assuming it is importing a module that cannot be reached from sys.path through the normal means, a common example being the tests directory layout at the root of the package, where we have a test file tests/foo/bar/test_foo.py without any __init__.py files within. It is not importable because tests is not reachable from sys.path, and we also do not want to change it.

To import that file, we:

  1. Try to come up with a “module name” for it based on its full path and root, so for the file tests/foo/bar/test_foo.py we make up an non-existing module name "tests.foo.bar.test_foo". This module name will be unique, and not conflict with other test modules named test_foo.py in the same suite, as they will be in a different directory and will be assigned a different module name using the same mechanism ("tests.schema.test_foo" for example).

  2. Import the module using its full path using importlib.importmodule, and then place that module in sys.modules with the module_name we determined above.

This works well when used to import just test modules and conftests, but will create divergences when used import application code. In the reproducer repo, we are trying to import anndata using importlib mode, so it ends up being imported as a module named using the logic described above, however it is reachable through sys.path and we probably should import it using the normal mechanisms.

Probably one solution is to revisit all places where we manually import modules, and possibly use “importlib” (when configured) for tests+conftests or fallback to the other mechanisms for normal modules, but this feels extremely risky given the millions of test suites out there.

The importlib mode was intended to solve avoid changing sys.path/“ImportMismatch” errors, but particularly if I knew all the problems/details it involves, I would just recommend other solutions:

  1. pytest changes sys.path and it is testing the local copy? Switch to src layout.
  2. Test module names with the same? Add __init__.py files in your tests turning them into packages.

But here we are.

I’m not sure how to proceed here. Perhaps we can right away add a warning on --doctest-modules that it might be problematic to use together with --import-mode=importlib.

I will spend some more time trying to see if there’s a simple and safe solution to this general mess.

Sorry, I muddled my project (https://github.com/scverse/anndata/pull/1151) too much with this issue. I’m trying to be clear now.

Adding --import-mode=importlib to the above results in this:

E   ModuleNotFoundError: No module named
  'home.phil..local.share.hatch.[…].site-packages.anndata._core';
  'home.phil..local.share.hatch.[…].site-packages.anndata' is not a package

What needs to work for this issue to be considered fixed:

+ src/my_module/__init__.py  # containing a doctest
+ tests/test_basic.py
[build-system]
build-backend = "hatchling.build"
requires = ["hatchling", "hatch-vcs"]

[project]
name = "my_module"
version = "0.1.0"

[tool.pytest.ini_options]
addopts = [
    "--import-mode=importlib",
    "--strict-markers",
    "--doctest-modules",
    "--pyargs",
]
testpaths = [
    "my_module",  # or without `--pyargs`: "./src/my_module"
    "./tests",
]
What would be nice if it worked for migrating our project:
+ src/my_module/
  + __init__.py  # containing a doctest
  + tests/
    + conftest.py  # containing `collect_ignore = ["helpers.py"]`
    + helpers.py
    + test_basic.py
...

[tool.pytest.ini_options]
addopts = [
    "--import-mode=importlib",
    "--strict-markers",
    "--doctest-modules",
    "--pyargs",
]
testpaths = [
    "my_module",
    "./src/my_module/tests",
]

I just wanted to add that we are hitting this issue as well, but we are using editable installs, so it looks like this issue affects both the case with editable and non-editable installs of the project source files.

Just to be clear, our project is:

  • an editable installation
  • has a src/ layout
  • puts non-doctest tests outside the application code, in a tests/ directory that is a sibling to the src/ directory
  • uses --import-mode=importlib
  • (attempts to) use --doctest-modules

We more or less exactly follow the suggested layout and settings in https://docs.pytest.org/en/7.1.x/explanation/goodpractices.html#tests-outside-application-code, and everything was working fine until attempting to run doctests within the application code using --doctest-modules.