pytest-xdist: pytest hangs indefinitely after completing tests in parallel

I run into this behavior when running web tests with Google App Engine. However, since this issue only occurs when using xdist, I do not believe it is caused by App Engine itself.

Run the following two tests in parallel. Tests will complete, but pytest will not finish:

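import subprocess
import sys
import time

# GAE_PATH (App Engine SDK directory) and HELLO_PATH (hello world app
# directory) are assumed to be defined elsewhere in the test module.

def test_REMOVE():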
    port = 9999
    admin_port = 9998
    cmd = '"{python}" "{gaepath}/dev_appserver.py" "{apppath}"' \
          ' -A test-{port} --port={port} --admin_port={adminport}' \
          ' --datastore_path=C:/tmp/test_datastore_{port}' \
          ' --clear_datastore=yes'.format(python=sys.executable,
                                          gaepath=GAE_PATH,
                                          apppath=HELLO_PATH,
                                          port=port, adminport=admin_port)
    app_engine = subprocess.Popen(cmd)

    time.sleep(8)

    app_engine.terminate()

def test_REMOVE_2():
    port = 10001
    admin_port = 10000
    cmd = '"{python}" "{gaepath}/dev_appserver.py" "{apppath}"' \
          ' -A test-{port} --port={port} --admin_port={adminport}' \
          ' --datastore_path=C:/tmp/test_datastore_{port}' \
          ' --clear_datastore=yes'.format(python=sys.executable,
                                          gaepath=GAE_PATH,
                                          apppath=HELLO_PATH,
                                          port=port, adminport=admin_port)
    app_engine = subprocess.Popen(cmd)

    time.sleep(8)

    app_engine.terminate()

I ran the tests with the following line:

py.test -k REMOVE -n 2

The result is:

============================= test session starts =============================
platform win32 -- Python 2.7.11, pytest-2.9.1, py-1.4.31, pluggy-0.3.1
rootdir: ..., inifile:
plugins: cov-2.2.1, xdist-1.14
gw0 [2] / gw1 [2]
scheduling tests via LoadScheduling
..

Killing the process results in the following traceback:

Traceback (most recent call last):
  File "c:\python27\lib\runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "c:\python27\lib\runpy.py", line 72, in _run_code
    exec code in run_globals
  File "C:\Python27\Scripts\py.test.exe\__main__.py", line 9, in <module>
  File "c:\python27\lib\site-packages\_pytest\config.py", line 49, in main
    return config.hook.pytest_cmdline_main(config=config)
  File "c:\python27\lib\site-packages\_pytest\vendored_packages\pluggy.py", line 724, in __call__
    return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
  File "c:\python27\lib\site-packages\_pytest\vendored_packages\pluggy.py", line 338, in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
  File "c:\python27\lib\site-packages\_pytest\vendored_packages\pluggy.py", line 333, in <lambda>
    _MultiCall(methods, kwargs, hook.spec_opts).execute()
  File "c:\python27\lib\site-packages\_pytest\vendored_packages\pluggy.py", line 596, in execute
    res = hook_impl.function(*args)
  File "c:\python27\lib\site-packages\_pytest\main.py", line 119, in pytest_cmdline_main
    return wrap_session(config, _main)
  File "c:\python27\lib\site-packages\_pytest\main.py", line 114, in wrap_session
    exitstatus=session.exitstatus)
  File "c:\python27\lib\site-packages\_pytest\vendored_packages\pluggy.py", line 724, in __call__
    return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
  File "c:\python27\lib\site-packages\_pytest\vendored_packages\pluggy.py", line 338, in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
  File "c:\python27\lib\site-packages\_pytest\vendored_packages\pluggy.py", line 333, in <lambda>
    _MultiCall(methods, kwargs, hook.spec_opts).execute()
  File "c:\python27\lib\site-packages\_pytest\vendored_packages\pluggy.py", line 595, in execute
    return _wrapped_call(hook_impl.function(*args), self.execute)
  File "c:\python27\lib\site-packages\_pytest\vendored_packages\pluggy.py", line 249, in _wrapped_call
    wrap_controller.send(call_outcome)
  File "c:\python27\lib\site-packages\_pytest\terminal.py", line 363, in pytest_sessionfinish
    outcome.get_result()
  File "c:\python27\lib\site-packages\_pytest\vendored_packages\pluggy.py", line 279, in get_result
    _reraise(*ex)  # noqa
  File "c:\python27\lib\site-packages\_pytest\vendored_packages\pluggy.py", line 264, in __init__
    self.result = func()
  File "c:\python27\lib\site-packages\_pytest\vendored_packages\pluggy.py", line 596, in execute
    res = hook_impl.function(*args)
  File "c:\python27\lib\site-packages\xdist\dsession.py", line 517, in pytest_sessionfinish
    nm.teardown_nodes()
  File "c:\python27\lib\site-packages\xdist\slavemanage.py", line 62, in teardown_nodes
    self.group.terminate(self.EXIT_TIMEOUT)
  File "c:\python27\lib\site-packages\execnet\multi.py", line 208, in terminate
    for gw in self._gateways_to_join])
  File "c:\python27\lib\site-packages\execnet\multi.py", line 297, in safe_terminate
    workerpool.waitall()
  File "c:\python27\lib\site-packages\execnet\gateway_base.py", line 325, in waitall
    return my_waitall_event.wait(timeout=timeout)
  File "c:\python27\lib\threading.py", line 614, in wait
    self.__cond.wait(timeout)
  File "c:\python27\lib\threading.py", line 340, in wait
    waiter.acquire()
KeyboardInterrupt

Finally, I am on Windows.

Note that this test is run with the hello world application given by Google. Every other App Engine application I’ve tested causes the same hang.

Also, I have tried a Popen that simply runs python, and this does not result in the same hang.
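One thing that may be worth ruling out is the way the child process is started and stopped. The sketch below rests on an assumption, not on a diagnosis confirmed anywhere in this thread: that the launched server keeps handles inherited from the xdist worker, or is never reaped after terminate(). The managed_process helper and the placeholder command are hypothetical, and the code is written for Python 3 (subprocess.DEVNULL does not exist on Python 2.7).

```python
import contextlib
import subprocess
import sys


@contextlib.contextmanager
def managed_process(cmd, stop_timeout=10.0):
    """Run cmd without sharing the caller's stdio, then always stop and reap it."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,   # do not hand the worker's pipes to the child
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        yield proc
    finally:
        if proc.poll() is None:
            proc.terminate()
        try:
            # Wait for the exit instead of leaving the child unreaped.
            proc.wait(timeout=stop_timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()


def test_placeholder_server():
    # Placeholder command standing in for the dev_appserver.py invocation.
    cmd = [sys.executable, "-c", "import time; time.sleep(60)"]
    with managed_process(cmd) as server:
        assert server.poll() is None  # still running inside the block
    assert server.returncode is not None  # stopped and reaped afterwards
```

Note that terminate() only targets the direct child. If the launched program starts further processes of its own, they are not covered by this sketch.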

About this issue

  • State: open
  • Created 8 years ago
  • Reactions: 2
  • Comments: 27 (8 by maintainers)

Most upvoted comments

Also having this problem. Has anyone fixed it?

🔥 Solved here (SQLAlchemy Session and Sessionmaker + Pytest Fixture + Coverage interaction) 🔥

Here’s the backstory:

The host system is a FastAPI application, and we started having the same issue with our test suite when using pytest + coverage + pytest-cov. Our lockups were tied to tests that needed a SQLAlchemy Session object directly (from a test fixture), for things like direct manipulation of database objects instead of going through the FastAPI calls.

We have a generalized way of grabbing a Session object from a global Sessionmaker, which is in turn tied to a global Engine:

session_factory: Callable[[], Session] = sessionmaker(
    autocommit=False, autoflush=True, bind=dors_engine
)

def get_session() -> Iterator[Session]:
    db_session: Session = session_factory()

    try:
        yield db_session
    finally:
        db_session.close()

And this is the test fixture we had:

@pytest.fixture
def db_session(app_config) -> Iterator[Session]:
    yield from get_session()

The db_session.close() call above does its job fine during normal (API) workloads, but for some reason the test fixture hung around the third time it tried to acquire a database session during those db-oriented tests, and only when we were using coverage or pytest-cov.

We changed the fixture to call a new-ish SQLAlchemy 1.4 utility function that closes all outstanding sessions it has a weakref to. Its docstring says it “might be useful for test scenarios”, so I guess there’s something to that.

@pytest.fixture
def db_session(app_config) -> Iterator[Session]:
    yield from get_session()

    ##########################
    # This needs to be here because tests hang when run with "coverage" or "pytest --cov"
    # The problem does not arise when using a simple `pytest` invocation
    # Something in the way coverage's hooks work might be allowing database sessions
    # to become dangling/lost in context (and never freed)
    ##########################
    from sqlalchemy.orm import close_all_sessions
    close_all_sessions()

And just like that, our test suite was back up and running, both locally and on GitHub Actions (No need to downgrade or play around with pytest / coverage versions or anything like that).
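To make the mechanism concrete, here is a small stdlib-only sketch of the same fixture shape. It is an illustration under stated assumptions, not SQLAlchemy's implementation: FakeSession and close_all_fake_sessions are hypothetical stand-ins for Session and close_all_sessions, and the pytest decorator is left out so the generator can be driven by hand.

```python
import weakref
from typing import Iterator


class FakeSession:
    """Hypothetical stand-in for a database session; not SQLAlchemy's Session."""

    # Weak references only: the registry never keeps a session alive by itself.
    _registry: "weakref.WeakSet[FakeSession]" = weakref.WeakSet()

    def __init__(self) -> None:
        self.closed = False
        FakeSession._registry.add(self)

    def close(self) -> None:
        self.closed = True


def close_all_fake_sessions() -> int:
    """Close every session still reachable via the registry; return how many were open."""
    still_open = [s for s in list(FakeSession._registry) if not s.closed]
    for session in still_open:
        session.close()
    return len(still_open)


def get_session() -> Iterator[FakeSession]:
    session = FakeSession()
    try:
        yield session
    finally:
        session.close()


def db_session() -> Iterator[FakeSession]:
    # Same shape as the fixture above, minus the pytest decorator.
    yield from get_session()
    # Runs only after the inner generator's `finally` block has executed.
    close_all_fake_sessions()
```

Driving db_session() by hand (one next() to get the session, a second to trigger teardown) shows the point of the extra call: the session handed out by the fixture is already closed by get_session, while a session that some other code created and forgot to close is only caught by the final sweep.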

Hope this helps someone else!

I am seeing the same issue on Linux. Is there any fix/workaround? I am using pytest version 3.1.0.

If it can help someone, I had the same symptom with pytest hanging.

The root cause was that PostgreSQL requires all sessions to be closed; otherwise there are still open locks.

see: https://docs.sqlalchemy.org/en/14/faq/metadata_schema.html#my-program-is-hanging-when-i-say-table-drop-metadata-drop-all

I’m having this same problem in a different context. I’m running pytest-django tests locally with xdist, and it hangs in the same way. Based on the conversation here, I tried running them with the --capture=no option, but still had the problem.

Any suggestions on how to debug & work around this in my case?

Hi, I’m also facing the same issue, but it occurs only when tests are in classes. Is there any workaround?

I’ve experienced this as well and had to switch to --capture=no as a default in my pytest.ini, then teammates can re-enable capture if they need to.

Also: I noticed the issue less often when I was using the pytest-sugar plugin. I wonder if it somehow changed the capturing in a way that made it less prone to this issue.