pandas: BUG: pd.read_pickle not backwards compatible from v2.1.0 to v1.3.4
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
# in a pandas 1.3.4 environment
import pandas as pd
import numpy as np
df = pd.DataFrame(np.ones((100, 4)))
df.to_pickle("old_pandas.pkl")
# in pandas 2.1.0 environment
import pandas as pd
df = pd.read_pickle("old_pandas.pkl")
Issue Description
The read_pickle method in pandas 2.1.0 raises an error when attempting to read a file serialized with to_pickle in pandas 1.3.4. No error is raised when pandas 2.0 is used to call read_pickle, or if the file is serialized with version >=1.4.
The documentation states the method should be backwards compatible to pandas 0.20.3 provided the object was serialized with to_pickle (as in this case).
Traceback (most recent call last):
  File "/home/me/envs/new_pandas/lib/python3.10/site-packages/pandas/io/pickle.py", line 206, in read_pickle
    return pickle.load(handles.handle)
  File "/home/me/envs/new_pandas/lib/python3.10/site-packages/pandas/core/internals/blocks.py", line 2400, in new_block
    return klass(values, ndim=ndim, placement=placement, refs=refs)
TypeError: Argument 'placement' has incorrect type (expected pandas._libs.internals.BlockPlacement, got slice)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/me/envs/new_pandas/lib/python3.10/site-packages/pandas/compat/pickle_compat.py", line 35, in load_reduce
    stack[-1] = func(*args)
  File "/home/me/envs/new_pandas/lib/python3.10/site-packages/pandas/core/internals/blocks.py", line 2400, in new_block
    return klass(values, ndim=ndim, placement=placement, refs=refs)
TypeError: Argument 'placement' has incorrect type (expected pandas._libs.internals.BlockPlacement, got slice)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/me/envs/new_pandas/lib/python3.10/site-packages/pandas/io/pickle.py", line 211, in read_pickle
    return pc.load(handles.handle, encoding=None)
  File "/home/me/envs/new_pandas/lib/python3.10/site-packages/pandas/compat/pickle_compat.py", line 225, in load
    return up.load()
  File "/usr/lib/python3.10/pickle.py", line 1213, in load
    dispatch[key[0]](self)
  File "/home/me/envs/new_pandas/lib/python3.10/site-packages/pandas/compat/pickle_compat.py", line 55, in load_reduce
    elif args and issubclass(args[0], PeriodArray):
TypeError: issubclass() arg 1 must be a class
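The chained TypeErrors above never mention which pandas release is doing the reading, which makes this failure hard to recognise from logs alone. A minimal sketch of a wrapper that adds that context follows. `read_pickle_with_hint` is a hypothetical helper name, not part of pandas, and it only improves the error message; it does not make the file loadable.

```python
import pandas as pd


def read_pickle_with_hint(path):
    """Call pd.read_pickle and, on TypeError, re-raise with version context.

    Hypothetical helper for illustration only; not a pandas API.
    """
    try:
        return pd.read_pickle(path)
    except TypeError as exc:
        # The traceback in this report ends in a TypeError, so only that
        # exception type is intercepted; everything else propagates as-is.
        raise RuntimeError(
            f"Could not unpickle {path!r} with pandas {pd.__version__}; "
            "the file may have been written by an older pandas release."
        ) from exc
```

The original exception stays attached through `from exc`, so the full chain shown above is still available for debugging.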
Expected Behavior
Method should be backwards compatible, as per the documentation.
Installed Versions
INSTALLED VERSIONS
commit : f00efd0344bd4e22cc867e76c776cb88669e6cde
python : 3.10.12.final.0
python-bits : 64
OS : Linux
OS-release : 5.15.0-78-generic
Version : #85-Ubuntu SMP Fri Jul 7 15:25:09 UTC 2023
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 2.2.0dev0+225.gf00efd0344
numpy : 1.25.2
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 59.6.0
pip : 22.0.2
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader : None
bs4 : None
bottleneck : None
dataframe-api-compat: None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None

pandas : 1.3.4
numpy : 1.25.2
pytz : 2023.3.post1
dateutil : 2.8.2
pip : 22.0.2
setuptools : 59.6.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None
About this issue
- State: closed
- Created 10 months ago
- Reactions: 4
- Comments: 16 (6 by maintainers)
Commits related to this issue
- PERF: new_block (#52326) * PERF: new_block * fix arraymanager build — committed to pandas-dev/pandas by jbrockmendel a year ago
Is there an update on this issue? I am encountering the same bug.
I’m affected, too. Thanks to the comment by tomvannuenen in this issue I tried downgrading pandas to version 2.0.3, which in my case does the trick for now. (Most likely I could have known by looking up the commit that broke it mentioned by aterrel, but anyway…)
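The downgrade workaround can also be used as a one-off conversion: read the old file in an environment where read_pickle still works (pandas 2.0.3, per the comment above) and write it back out, so the new file is serialized by a release >=1.4, which the report says loads without error. A rough sketch, assuming it is run inside that intermediate environment; `resave_pickle` and the output file name are hypothetical.

```python
import pandas as pd


def resave_pickle(src, dst):
    """Read a pickle with the installed pandas and write it back out.

    Hypothetical helper; returns the pandas version that wrote the new file.
    """
    obj = pd.read_pickle(src)
    # Re-serializing here means the new file carries the internal layout
    # of the pandas release that is currently installed.
    obj.to_pickle(dst)
    return pd.__version__


# Intended usage, inside the intermediate (for example 2.0.3) environment:
# resave_pickle("old_pandas.pkl", "resaved_pandas.pkl")
```

Whether the re-saved file then loads in a given newer release is worth checking with a quick read_pickle call in that environment before discarding the original.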
Encountered it today.
A fix for this issue should be available when 2.1.4 is released (either at the end of this week or next week).