pandas: BUG: ValueError: Buffer dtype mismatch, expected 'intp_t' but got 'long long' on ARMv7 32 bit
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
# Create a list of data
data = [["Alice", 25], ["Bob", 30], ["Carol", 35]]
# Create a Pandas DataFrame from the list of data
df = pd.DataFrame(data, columns=["Name", "Age"])
# Print the DataFrame
print(df)
Issue Description
The simplest test on a DataFrame object fails on an embedded 32-bit device with an ARMv7 CPU. No matter how the DataFrame is created, from an array or read from a CSV file using read_csv, any access to the object fails the same way:
  File "pandas/_libs/internals.pyx", line 711, in pandas._libs.internals.BlockManager._rebuild_blknos_and_blklocs
ValueError: Buffer dtype mismatch, expected 'intp_t' but got 'long long'
Unfortunately, on that platform I cannot compile a newer version of pandas, so I cannot verify if the issue is present in the latest version or main branch.
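The error message contrasts two C integer types: intp_t, which is meant to be a pointer-sized integer, and long long, which is 64 bits wide on common platforms. The following is a minimal sketch that prints these widths for the running interpreter, using only the standard library plus NumPy when it happens to be installed. The helper name describe_int_widths is made up for illustration and is not part of pandas or NumPy. On a 32-bit target the pointer-sized entries would be expected to report 4 bytes while long long reports 8.

```python
import ctypes
import struct


def describe_int_widths():
    """Return the byte widths of the C integer types named in the error."""
    widths = {
        "pointer": struct.calcsize("P"),
        "ssize_t": ctypes.sizeof(ctypes.c_ssize_t),
        "long": ctypes.sizeof(ctypes.c_long),
        "long long": ctypes.sizeof(ctypes.c_longlong),
    }
    try:
        # NumPy is optional here; np.intp is its pointer-sized integer type.
        import numpy as np

        widths["numpy.intp"] = np.dtype(np.intp).itemsize
    except ImportError:
        pass
    return widths


if __name__ == "__main__":
    for name, size in describe_int_widths().items():
        print(f"{name:>10}: {size} bytes")
```

If numpy.intp reports a width that differs from the pointer width on the device, that would point at a NumPy build made for the wrong architecture; this is a diagnostic aid only, not a confirmed root cause.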
Expected Behavior
The expected output is:
    Name  Age
0  Alice   25
1    Bob   30
2  Carol   35
but running the script gives:
  File "/home/root/test.py", line 10, in <module>
    print(df)
  File "/usr/lib/python3.10/site-packages/pandas/core/frame.py", line 1011, in __repr__
    return self.to_string(**repr_params)
  File "/usr/lib/python3.10/site-packages/pandas/core/frame.py", line 1192, in to_string
    return fmt.DataFrameRenderer(formatter).to_string(
  File "/usr/lib/python3.10/site-packages/pandas/io/formats/format.py", line 1128, in to_string
    string = string_formatter.to_string()
  File "/usr/lib/python3.10/site-packages/pandas/io/formats/string.py", line 25, in to_string
    text = self._get_string_representation()
  File "/usr/lib/python3.10/site-packages/pandas/io/formats/string.py", line 40, in _get_string_representation
    strcols = self._get_strcols()
  File "/usr/lib/python3.10/site-packages/pandas/io/formats/string.py", line 31, in _get_strcols
    strcols = self.fmt.get_strcols()
  File "/usr/lib/python3.10/site-packages/pandas/io/formats/format.py", line 611, in get_strcols
    strcols = self._get_strcols_without_index()
  File "/usr/lib/python3.10/site-packages/pandas/io/formats/format.py", line 864, in _get_strcols_without_index
    str_columns = self._get_formatted_column_labels(self.tr_frame)
  File "/usr/lib/python3.10/site-packages/pandas/io/formats/format.py", line 943, in _get_formatted_column_labels
    dtypes = self.frame.dtypes
  File "/usr/lib/python3.10/site-packages/pandas/core/generic.py", line 5746, in dtypes
    data = self._mgr.get_dtypes()
  File "/usr/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 228, in get_dtypes
    return dtypes.take(self.blknos)
  File "/usr/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 168, in blknos
    self._rebuild_blknos_and_blklocs()
  File "pandas/_libs/internals.pyx", line 711, in pandas._libs.internals.BlockManager._rebuild_blknos_and_blklocs
ValueError: Buffer dtype mismatch, expected 'intp_t' but got 'long long'
Installed Versions
commit : 4bfe3d07b4858144c219b9346329027024102ab6
python : 3.10.4.final.0
python-bits : 32
OS : Linux
OS-release : 5.15.32-yoctoEMB+gddbcf3882cd3
Version : #1 SMP PREEMPT Thu Apr 13 04:27:17 UTC 2023
machine : armv7l
processor : armv7l
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None

pandas : 1.4.2
numpy : 1.22.3
pytz : 2022.1
dateutil : 2.8.2
pip : 22.0.3
setuptools : 59.5.0
Cython : 0.29.28
pytest : 7.1.1
About this issue
- State: open
- Created 9 months ago
- Comments: 20
Item #1 can easily be addressed by creating a bbappend file with the content:
SETUPTOOLS_BUILD_ARGS += "--plat-name ${MACHINE}"
This builds the wheel using the machine name instead of x86_64 (e.g. Cython-0.29.28-cp310-cp310-qemuarm.whl), and the WHEEL tag is set properly as well. Of course, ${MACHINE} can be replaced with TUNE_ARCH if we want the CPU type there instead. I think this should be part of setuptools3.bbclass, as it is good practice to have the target machine as a component of the wheel name.
Yes, definitely the solution is to adapt the Yocto recipes to compile the right version of these packages. And yes, I am willing to support the effort.
Hi Bogdan, I just stumbled over the issue labels … it could make sense to tag this issue with the following labels:
This might help make our issue more visible. Anyway, I will come back with the pending verification of the WHEEL details; sorry, it takes some more time, as I forgot to tell pip not to clean up the build 😉
While building the qemu image, I kept looking at the source code. I believe the issue comes from mismatched definitions of the intp_t type in pandas, numpy AND cython, definitely caused by some compile parameter in Yocto. I guess that in one (or more) recipes the architecture is not detected properly, so a 64-bit platform is assumed instead of 32-bit. Also, checking the WHEEL file in the dist-info folder of the packages (e.g. /usr/lib/python3.10/site-packages/pandas-1.4.2.dist-info/WHEEL), I found that numpy, pandas and cython carry the tag:
Tag: cp310-cp310-linux_x86_64
which means that the compatibility platform is x86_64. For the rest of the packages the tag is py2-none-any or py3-none-any. Can you please check on your unit, after pandas was installed with the pip command, what that WHEEL file contains?
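The WHEEL check described above can be scripted instead of opening each dist-info folder by hand. The following is a rough sketch under stated assumptions: the helper names wheel_tags, platform_part and installed_wheel_tags are hypothetical, and the sample text only illustrates the layout of a WHEEL file rather than being copied from the device. It relies on the standard library's importlib.metadata and on the fact that WHEEL files use "Key: value" header lines.

```python
from email import message_from_string
from importlib import metadata


def wheel_tags(wheel_text):
    """Return every 'Tag:' value found in the text of a WHEEL file."""
    return message_from_string(wheel_text).get_all("Tag") or []


def platform_part(tag):
    """Return the platform component of a 'python-abi-platform' tag."""
    return tag.split("-", 2)[2]


def installed_wheel_tags(dist_name):
    """Read the WHEEL file of an installed distribution, if it has one."""
    try:
        text = metadata.distribution(dist_name).read_text("WHEEL")
    except metadata.PackageNotFoundError:
        return []
    return wheel_tags(text) if text else []


# Illustrative sample only; the Tag line matches the one quoted above.
sample = "Wheel-Version: 1.0\nRoot-Is-Purelib: false\nTag: cp310-cp310-linux_x86_64\n"

if __name__ == "__main__":
    for tag in wheel_tags(sample):
        print(tag, "->", platform_part(tag))
    for name in ("pandas", "numpy", "Cython"):
        print(name, installed_wheel_tags(name))
```

A platform part of linux_x86_64 on an armv7l device would agree with the suspicion that the build recorded the host architecture; a tag on its own does not prove the binaries were compiled for the wrong target.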
There are two more tests that we can do to narrow the issue down further. I am planning to run them tomorrow or the day after, once I have built a qemu image. If you have time, the tests are quite simple:
We might see that only one module compiled by Yocto is causing the problem, so we can focus on that one.