pyfakefs: `pyfakefs/tests/patched_packages_test.py::TestPatchedPackages::test_read_{csv,table}` fail if zstandard is used with CFFI backend, on Python 3.12

Describe the bug

When the zstandard package is using the CFFI backend, the following two tests fail on Python 3.12:

========================================================= test session starts =========================================================
platform linux -- Python 3.12.0, pytest-7.4.3, pluggy-1.3.0
rootdir: /tmp/pyfakefs
plugins: pyfakefs-5.4.dev0
collected 4 items                                                                                                                     

pyfakefs/tests/patched_packages_test.py F.F.                                                                                    [100%]

============================================================== FAILURES ===============================================================
__________________________________________________ TestPatchedPackages.test_read_csv __________________________________________________

self = <pyfakefs.tests.patched_packages_test.TestPatchedPackages testMethod=test_read_csv>

    def test_read_csv(self):
        path = "/foo/bar.csv"
        self.fs.create_file(path, contents="1,2,3,4")
>       df = pd.read_csv(path)

pyfakefs/tests/patched_packages_test.py:52: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
.tox/py312/lib/python3.12/site-packages/pandas/io/parsers/readers.py:948: in read_csv
    return _read(filepath_or_buffer, kwds)
.tox/py312/lib/python3.12/site-packages/pandas/io/parsers/readers.py:611: in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
.tox/py312/lib/python3.12/site-packages/pandas/io/parsers/readers.py:1448: in __init__
    self._engine = self._make_engine(f, self.engine)
.tox/py312/lib/python3.12/site-packages/pandas/io/parsers/readers.py:1705: in _make_engine
    self.handles = get_handle(
.tox/py312/lib/python3.12/site-packages/pandas/io/common.py:709: in get_handle
    if _is_binary_mode(path_or_buf, mode) and "b" not in mode:
.tox/py312/lib/python3.12/site-packages/pandas/io/common.py:1171: in _is_binary_mode
    return isinstance(handle, _get_binary_io_classes()) or "b" in getattr(
.tox/py312/lib/python3.12/site-packages/pandas/io/common.py:1186: in _get_binary_io_classes
    zstd = import_optional_dependency("zstandard")
.tox/py312/lib/python3.12/site-packages/pandas/compat/_optional.py:132: in import_optional_dependency
    module = importlib.import_module(name)
/usr/lib/python3.12/importlib/__init__.py:90: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1381: in _gcd_import
    ???
<frozen importlib._bootstrap>:1354: in _find_and_load
    ???
<frozen importlib._bootstrap>:1325: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:929: in _load_unlocked
    ???
<frozen importlib._bootstrap_external>:994: in exec_module
    ???
<frozen importlib._bootstrap>:488: in _call_with_frames_removed
    ???
.tox/py312/lib/python3.12/site-packages/zstandard/__init__.py:73: in <module>
    from .backend_cffi import *
.tox/py312/lib/python3.12/site-packages/zstandard/backend_cffi.py:89: in <module>
    from ._cffi import (  # type: ignore
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <pyfakefs.fake_io.FakeIoModule object at 0x7feba17ec7d0>, name = '_IOBase'

    def __getattr__(self, name):
        """Forwards any unfaked calls to the standard io module."""
>       return getattr(self._io_module, name)
E       AttributeError: module 'io' has no attribute '_IOBase'. Did you mean: 'IOBase'?

pyfakefs/fake_io.py:150: AttributeError
_________________________________________________ TestPatchedPackages.test_read_table _________________________________________________

self = <pyfakefs.tests.patched_packages_test.TestPatchedPackages testMethod=test_read_table>

    def test_read_table(self):
        path = "/foo/bar.csv"
        self.fs.create_file(path, contents="1|2|3|4")
>       df = pd.read_table(path, delimiter="|")

pyfakefs/tests/patched_packages_test.py:58: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
.tox/py312/lib/python3.12/site-packages/pandas/io/parsers/readers.py:1282: in read_table
    return _read(filepath_or_buffer, kwds)
.tox/py312/lib/python3.12/site-packages/pandas/io/parsers/readers.py:611: in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
.tox/py312/lib/python3.12/site-packages/pandas/io/parsers/readers.py:1448: in __init__
    self._engine = self._make_engine(f, self.engine)
.tox/py312/lib/python3.12/site-packages/pandas/io/parsers/readers.py:1705: in _make_engine
    self.handles = get_handle(
.tox/py312/lib/python3.12/site-packages/pandas/io/common.py:709: in get_handle
    if _is_binary_mode(path_or_buf, mode) and "b" not in mode:
.tox/py312/lib/python3.12/site-packages/pandas/io/common.py:1171: in _is_binary_mode
    return isinstance(handle, _get_binary_io_classes()) or "b" in getattr(
.tox/py312/lib/python3.12/site-packages/pandas/io/common.py:1186: in _get_binary_io_classes
    zstd = import_optional_dependency("zstandard")
.tox/py312/lib/python3.12/site-packages/pandas/compat/_optional.py:132: in import_optional_dependency
    module = importlib.import_module(name)
/usr/lib/python3.12/importlib/__init__.py:90: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1381: in _gcd_import
    ???
<frozen importlib._bootstrap>:1354: in _find_and_load
    ???
<frozen importlib._bootstrap>:1325: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:929: in _load_unlocked
    ???
<frozen importlib._bootstrap_external>:994: in exec_module
    ???
<frozen importlib._bootstrap>:488: in _call_with_frames_removed
    ???
.tox/py312/lib/python3.12/site-packages/zstandard/__init__.py:73: in <module>
    from .backend_cffi import *
.tox/py312/lib/python3.12/site-packages/zstandard/backend_cffi.py:89: in <module>
    from ._cffi import (  # type: ignore
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <pyfakefs.fake_io.FakeIoModule object at 0x7feba16eeed0>, name = '_IOBase'

    def __getattr__(self, name):
        """Forwards any unfaked calls to the standard io module."""
>       return getattr(self._io_module, name)
E       AttributeError: module 'io' has no attribute '_IOBase'. Did you mean: 'IOBase'?

pyfakefs/fake_io.py:150: AttributeError
========================================================== warnings summary ===========================================================
.tox/py312/lib/python3.12/site-packages/dateutil/tz/tz.py:37
  /tmp/pyfakefs/.tox/py312/lib/python3.12/site-packages/dateutil/tz/tz.py:37: DeprecationWarning: datetime.datetime.utcfromtimestamp() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.fromtimestamp(timestamp, datetime.UTC).
    EPOCH = datetime.datetime.utcfromtimestamp(0)

pyfakefs/tests/patched_packages_test.py::TestPatchedPackages::test_read_excel
pyfakefs/tests/patched_packages_test.py::TestPatchedPackages::test_read_excel
pyfakefs/tests/patched_packages_test.py::TestPatchedPackages::test_write_excel
pyfakefs/tests/patched_packages_test.py::TestPatchedPackages::test_write_excel
pyfakefs/tests/patched_packages_test.py::TestPatchedPackages::test_write_excel
  /tmp/pyfakefs/.tox/py312/lib/python3.12/site-packages/openpyxl/packaging/core.py:99: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
    now = datetime.datetime.utcnow()

pyfakefs/tests/patched_packages_test.py::TestPatchedPackages::test_write_excel
  /tmp/pyfakefs/.tox/py312/lib/python3.12/site-packages/openpyxl/writer/excel.py:292: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
    workbook.properties.modified = datetime.datetime.utcnow()

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================================================= short test summary info =======================================================
FAILED pyfakefs/tests/patched_packages_test.py::TestPatchedPackages::test_read_csv - AttributeError: module 'io' has no attribute '_IOBase'. Did you mean: 'IOBase'?
FAILED pyfakefs/tests/patched_packages_test.py::TestPatchedPackages::test_read_table - AttributeError: module 'io' has no attribute '_IOBase'. Did you mean: 'IOBase'?
=============================================== 2 failed, 2 passed, 7 warnings in 1.58s ===============================================

How To Reproduce

In a Python 3.12 venv:

pip install pytest pandas zstandard cffi
export PYTHON_ZSTANDARD_IMPORT_POLICY=cffi
pytest pyfakefs/tests/patched_packages_test.py::TestPatchedPackages

Your environment

$ python -c "import platform; print(platform.platform())"
Linux-6.6.1-gentoo-dist-x86_64-AMD_Ryzen_5_3600_6-Core_Processor-with-glibc2.38
$ python -c "import sys; print('Python', sys.version)"
Python 3.12.0 (main, Oct 25 2023, 07:20:32) [GCC 13.2.1 20231014]
$ python -c "from pyfakefs import __version__; print('pyfakefs', __version__)"
pyfakefs 5.4.dev0
$ python -c "import pytest; print('pytest', pytest.__version__)"
pytest 7.4.3

Reproduced on 8c7a99c8cc22d63dd33b00cfc17da6472c6e7a27.

About this issue

  • State: closed
  • Created 7 months ago
  • Comments: 21 (18 by maintainers)

Most upvoted comments

@mrbean-bremen Unfortunately, I do not really have many more details. It is possible that some of the files we read are somewhat large: we do have a folder with under 800kB of YAML data that is loaded by some legacy code we know scales inefficiently, but that hardly explains the increase in memory usage we have noticed. If we do get more specific details, we’ll make sure to post them here.

I do not know if it is related, but our internal CI is apparently getting OOM-killed on a unit test that uses pyfakefs. Memory usage goes from 1.6GB with Python 3.11.6 and pyfakefs 5.3.1 to 2.5GB (and an OOM kill) with Python 3.12.0 and pyfakefs 5.3.2. The container runs on EKS and is based on Ubuntu 22.04.

I’m trying several variations to identify the exact test that triggers the memory leak and to see whether the problem is with the pyfakefs version.

Thanks for the ping! I’ve already added the new version to Gentoo, and of course I forgot to re-enable the test 😉.