pandas: CI: py 3.10 build failing

@seberg this build is using numpy 1.22dev, and a bunch of the failures are raised in np.iinfo(np.int64).max:

    return np.iinfo(np.int64).max
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <[AttributeError("'iinfo' object has no attribute 'kind'") raised in repr()] iinfo object at 0x7f986b9aa830>
int_type = <class 'numpy.int64'>

    def __init__(self, int_type):
        try:
            self.dtype = numeric.dtype(int_type)
        except TypeError:
>           self.dtype = numeric.dtype(type(int_type))
E           TypeError: 'numpy.dtype[bool_]' object is not callable

/opt/hostedtoolcache/Python/3.10.0-beta.2/x64/lib/python3.10/site-packages/numpy/core/getlimits.py:518: TypeError
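The traceback is easier to read with a stand-in for the module global. The sketch below is not NumPy's code: `FakeDType`, `fake_numeric`, and `make_info` are hypothetical names that only mirror the try/except shape of the frame shown above. It illustrates why, once the global holds an instance instead of the class, both calls raise TypeError and the reported error comes from the fallback line.

```python
import types


class FakeDType:
    """Hypothetical stand-in for a dtype class; not NumPy's implementation."""

    def __init__(self, spec):
        self.spec = spec


# Hypothetical stand-in for the numpy.core.numeric module namespace.
fake_numeric = types.SimpleNamespace(dtype=FakeDType)


def make_info(int_type):
    # Same try/except shape as the iinfo.__init__ frame in the traceback.
    try:
        return fake_numeric.dtype(int_type)
    except TypeError:
        return fake_numeric.dtype(type(int_type))


# While the global still holds the class, construction succeeds.
healthy = make_info("int64")

# Simulate the corruption: the global now holds an instance, not the class.
fake_numeric.dtype = FakeDType("bool")

try:
    make_info("int64")
except TypeError as exc:
    # The first call fails, the fallback call fails the same way,
    # and the fallback's error is the one that propagates.
    error_message = str(exc)

print(error_message)
```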

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 4
  • Comments: 24 (17 by maintainers)

Most upvoted comments

Aha, I think I have a lead… To cut it down more, this line is sufficient to trigger the issue:

pd._libs.lib.maybe_convert_objects(np.array([None], dtype=object))

And that should end up side-stepping almost all code in maybe_convert_objects.

I don't feel like getting a pandas dev setup running right now, but there is one line here:

mask = np.full(n, False)

which is called from cython using:

  /* "pandas/_libs/lib.pyx":2441
 *     uints = np.empty(n, dtype='u8')
 *     bools = np.empty(n, dtype=np.uint8)
 *     mask = np.full(n, False)             # <<<<<<<<<<<<<<
 * 
 *     if convert_datetime:
 */
  __Pyx_GetModuleGlobalName(__pyx_t_6, __pyx_n_s_np); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 2441, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_6);
  __pyx_t_5 = __Pyx_PyObject_GetAttrStr(__pyx_t_6, __pyx_n_s_full); if (unlikely(!__pyx_t_5)) __PYX_ERR(0, 2441, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_5);
  __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
  __pyx_t_6 = PyInt_FromSsize_t(__pyx_v_n); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 2441, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_6);
  __pyx_t_2 = NULL;
  __pyx_t_8 = 0;
  if (CYTHON_UNPACK_METHODS && unlikely(PyMethod_Check(__pyx_t_5))) {
    __pyx_t_2 = PyMethod_GET_SELF(__pyx_t_5);
    if (likely(__pyx_t_2)) {
      PyObject* function = PyMethod_GET_FUNCTION(__pyx_t_5);
      __Pyx_INCREF(__pyx_t_2);
      __Pyx_INCREF(function);
      __Pyx_DECREF_SET(__pyx_t_5, function);
      __pyx_t_8 = 1;
    }
  }
  #if CYTHON_FAST_PYCALL
  if (PyFunction_Check(__pyx_t_5)) {
    PyObject *__pyx_temp[3] = {__pyx_t_2, __pyx_t_6, Py_False};
    __pyx_t_15 = __Pyx_PyFunction_FastCall(__pyx_t_5, __pyx_temp+1-__pyx_t_8, 2+__pyx_t_8); if (unlikely(!__pyx_t_15)) __PYX_ERR(0, 2441, __pyx_L1_error)
    __Pyx_XDECREF(__pyx_t_2); __pyx_t_2 = 0;
    __Pyx_GOTREF(__pyx_t_15);
    __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
  } else
  #endif
  #if CYTHON_FAST_PYCCALL
  if (__Pyx_PyFastCFunction_Check(__pyx_t_5)) {
    PyObject *__pyx_temp[3] = {__pyx_t_2, __pyx_t_6, Py_False};
    __pyx_t_15 = __Pyx_PyCFunction_FastCall(__pyx_t_5, __pyx_temp+1-__pyx_t_8, 2+__pyx_t_8); if (unlikely(!__pyx_t_15)) __PYX_ERR(0, 2441, __pyx_L1_error)
    __Pyx_XDECREF(__pyx_t_2); __pyx_t_2 = 0;
    __Pyx_GOTREF(__pyx_t_15);
    __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
  } else
  #endif
  {
    __pyx_t_1 = PyTuple_New(2+__pyx_t_8); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 2441, __pyx_L1_error)
    __Pyx_GOTREF(__pyx_t_1);
    if (__pyx_t_2) {
      __Pyx_GIVEREF(__pyx_t_2); PyTuple_SET_ITEM(__pyx_t_1, 0, __pyx_t_2); __pyx_t_2 = NULL;
    }
    __Pyx_GIVEREF(__pyx_t_6);
    PyTuple_SET_ITEM(__pyx_t_1, 0+__pyx_t_8, __pyx_t_6);
    __Pyx_INCREF(Py_False);
    __Pyx_GIVEREF(Py_False);
    PyTuple_SET_ITEM(__pyx_t_1, 1+__pyx_t_8, Py_False);
    __pyx_t_6 = 0;
    __pyx_t_15 = __Pyx_PyObject_Call(__pyx_t_5, __pyx_t_1, NULL); if (unlikely(!__pyx_t_15)) __PYX_ERR(0, 2441, __pyx_L1_error)
    __Pyx_GOTREF(__pyx_t_15);
    __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
  }
  __Pyx_DECREF(__pyx_t_5); __pyx_t_5 = 0;
  __pyx_v_mask = __pyx_t_15;
  __pyx_t_15 = 0;

Now, that may be nothing, but np.full lives in the numeric module, and it does use a dtype, which is the boolean dtype we end up with here. Obviously, that should not mess with the module scope either, but at least the np.core.numeric module gets involved there.

EDIT: Continuing down the rabbit hole a bit. In fact, the value is already mutated by the time the trace function says that np.full is called (or by the time tracing reports it). No traced call seems to happen before it at all (np.empty etc. are all implemented in C, though, so maybe that is why).

EDIT2: I opened a Python issue here: https://bugs.python.org/issue46451

I’ve run into this problem while trying to debug (in PyCharm) some code that uses pandas 1.3.5, and I was able to create a minimal reproducible example:

import sys
import numpy as np
import pandas as pd

from numpy.core import numeric

def trace(frame, event, arg):
    return trace


sys.settrace(trace)  # This call isn't necessary when debugging.

arrays = [np.array([1, 2]), np.array([3, 4])]
index = pd.MultiIndex.from_arrays(arrays, names=["iA", "iB"])

dtype_class = numeric.dtype
print(f"Before DataFrame:\n  {numeric.dtype=}\n  {type(numeric.dtype)=}")

a = pd.DataFrame(
    data={"C1": np.array([10.0, 20.0]), "C2": np.array([30.0, 40.0])},
    index=index,
)

# This import fails:
# import scipy.linalg.lapack

# But this check is simpler:
print(f"After DataFrame:\n  {numeric.dtype=}\n  {type(numeric.dtype)=}")
assert numeric.dtype is dtype_class

Note that pandas is changing the value of numpy.core.numeric.dtype, which originally is a class:

Before DataFrame:
  numeric.dtype=<class 'numpy.dtype'>
  type(numeric.dtype)=<class 'numpy._DTypeMeta'>
After DataFrame:
  numeric.dtype=dtype('bool')
  type(numeric.dtype)=<class 'numpy.dtype[bool_]'>
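A more general check than asserting on one name is to snapshot every global of a module and compare identities afterwards. This is only a sketch: `snapshot_globals` and `changed_globals` are hypothetical helper names, and the demo uses a throwaway module object so that it runs without NumPy or pandas.

```python
import types

_MISSING = object()


def snapshot_globals(module):
    """Copy the module's namespace; holding references keeps identities stable."""
    return dict(vars(module))


def changed_globals(module, before):
    """Return the sorted names that were rebound, added, or removed since `before`."""
    after = vars(module)
    names = before.keys() | after.keys()
    return sorted(
        name
        for name in names
        if before.get(name, _MISSING) is not after.get(name, _MISSING)
    )


# Throwaway module standing in for the real one.
demo = types.ModuleType("demo_module")
demo.dtype = int

before = snapshot_globals(demo)
demo.dtype = True  # stands in for the unexpected rebinding
rebound = changed_globals(demo, before)
print(rebound)
```

To apply the same check to the reproduction script, one would pass the `numeric` module imported there instead of `demo` and take the snapshot before constructing the DataFrame.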

If we comment out sys.settrace(trace) and debug the code, the output is a little bit different:

Before DataFrame:
  numeric.dtype=<class 'numpy.dtype'>
  type(numeric.dtype)=<class 'numpy._DTypeMeta'>
After DataFrame:
  numeric.dtype=None
  type(numeric.dtype)=<class 'NoneType'>

If we uncomment # import scipy.linalg.lapack, the output is more complex. This is the first error I got; it is similar to the error in the original report above and was also reported in a question on Stack Overflow:

Traceback (most recent call last):
  File "C:\dev\bug\.venv\lib\site-packages\numpy\core\getlimits.py", line 649, in __init__
    self.dtype = numeric.dtype(int_type)
TypeError: 'NoneType' object is not callable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 992, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "C:\dev\bug\.venv\lib\site-packages\scipy\linalg\__init__.py", line 195, in <module>
    from .misc import *
  File "C:\dev\bug\.venv\lib\site-packages\scipy\linalg\misc.py", line 4, in <module>
    from .lapack import get_lapack_funcs
  File "C:\dev\bug\.venv\lib\site-packages\scipy\linalg\lapack.py", line 990, in <module>
    _int32_max = _np.iinfo(_np.int32).max
  File "C:\dev\bug\.venv\lib\site-packages\numpy\core\getlimits.py", line 651, in __init__
    self.dtype = numeric.dtype(type(int_type))
TypeError: 'NoneType' object is not callable

Using a custom trace function, I’ve pinpointed that the global dtype is changed right after this call: https://github.com/pandas-dev/pandas/blob/v1.3.5/pandas/core/indexes/base.py#L6411

This call goes into Cython code. Looking at it, I found this suspicious assignment that may be the cause: https://github.com/pandas-dev/pandas/blob/v1.3.5/pandas/_libs/lib.pyx#L2655 (there is another one here: https://github.com/pandas-dev/pandas/blob/v1.3.5/pandas/_libs/lib.pyx#L1429). I'm not sure, though, because there's also the weird issue that this only happens when a tracer or a profiler is active.

(Also, I'm not sure why an assignment like that would change a global in a numpy module. A bug in Cython, perhaps?)
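The custom trace function mentioned above is not shown in the thread. A minimal sketch of the idea might look like the following, where `make_watcher` is a hypothetical helper and `shared`, `innocent`, and `culprit` are stand-ins for the real module and calls. The watcher compares the global's identity on every trace event and records the first event at which it differs, so the reported location is the first traced event after the rebinding, not the rebinding itself.

```python
import sys
import types


def make_watcher(namespace, name, log):
    """Build a trace function that logs the first event after namespace.name is rebound."""
    original = getattr(namespace, name)

    def tracer(frame, event, arg):
        current = getattr(namespace, name, None)
        if current is not original and not log:
            log.append((frame.f_code.co_name, frame.f_lineno, event))
        return tracer  # keep receiving line events in this frame

    return tracer


# Stand-in for a module whose global gets clobbered.
shared = types.SimpleNamespace(dtype=int)


def innocent():
    return 1


def culprit():
    shared.dtype = True  # stands in for the unexpected rebinding
    return 2


def run():
    innocent()
    culprit()


log = []
previous = sys.gettrace()
sys.settrace(make_watcher(shared, "dtype", log))
try:
    run()
finally:
    sys.settrace(previous)  # restore whatever hook was installed before

print(log)
```

Calls into C-implemented functions produce no trace events, so a change made inside compiled code only shows up at the next Python-level event.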

It doesn’t matter whether you have cython installed. It matters which cython was used for building pandas, scikit-learn, …

Each of these packages needs to ship builds made with the fixed Cython, which will happen gradually, so that you get the fix without installing your own Cython.

> Using a custom trace function, I’ve pinpointed that the global dtype is changed right after this call: https://github.com/pandas-dev/pandas/blob/v1.3.5/pandas/core/indexes/base.py#L6411

Great debugging there! Still utterly puzzling 😃. Just to note, I can reproduce the example on Python 3.10.1, but not on Python 3.9.0 (maybe we knew that long ago). Further, it does not matter whether I run a Python build compiled for debugging. Valgrind does not find anything (not that I would have expected it to).

So we know that this is sensitive to Python 3.10 and has to do with tracing being active? We also know it is probably related to Cython. And I feel I have heard about tricky changes in Python 3.10 that affected Cython. It is probably time to open either a Python or a Cython issue about this.

This will be in the 1.4.3 release. Discussion on the release date is in #46610.

Building packages with a newer Cython seems to fix this issue, right? I’d like to check whether it also fixes my issue. If it doesn’t, I guess I have a different issue. Are there builds around that use the newer Cython?

The important part is whether tracing is enabled (i.e. typically a debugger or profiler is being used). In that case you will run into this issue. Check also https://github.com/cython/cython/issues/4609

Basically, your options are to upgrade Cython (to the version that is unreleased as of now), to use the Cython 3 alpha, or to use the correct compile-time option to disable the faulty paths.
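Whether a trace or profile hook is currently installed can be checked from Python with `sys.gettrace()` and `sys.getprofile()`. The sketch below wraps both in a hypothetical helper, `active_hooks`; it only reports the hooks for the current thread and says nothing about which Cython version an extension was built with.

```python
import sys


def active_hooks():
    """Report whether a trace or profile function is installed in this thread."""
    return {
        "trace": sys.gettrace() is not None,
        "profile": sys.getprofile() is not None,
    }


def _noop_trace(frame, event, arg):
    # Returning None means no per-line tracing for the new frame.
    return None


# Demo: install a do-nothing tracer, query the hooks, then restore.
previous = sys.gettrace()
sys.settrace(_noop_trace)
try:
    during = active_hooks()
finally:
    sys.settrace(previous)

print(during)
```

Debuggers and coverage tools typically install a trace function, and profilers typically install a profile function, so either flag being true indicates the conditions described here.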

Mark Shannon asked for a repro, so I had another look, and it seems that Cython generates somewhat complicated code (PyEval...). So I moved it to cython/cython#4609. On the plus side, there is really nothing fancy about it, and you can trivially reproduce this without pandas/numpy, using just Cython. (I still have no idea whether it is Cython or Python going wrong.)

@jbrockmendel I think I’ve figured it out. So it turns out that sys.setprofile, which is called in our tests for read_csv, is somehow changing the value of np.core.numeric.dtype. In #43910, where I skip this test, the Python 3.10 tests all pass.

One explanation might be that we are not resetting sys.setprofile back correctly, but the sys.setprofile(None) call should be the correct way to reset it back.
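For the resetting question, one defensive pattern is to restore whatever profile function was installed before, instead of always resetting to None. The sketch below is an assumption about how a test could do this, not the pandas test code; `temporary_profile`, `record`, and `work` are hypothetical names. It does not address the corrupted global itself, only the cleanup of the hook.

```python
import contextlib
import sys


@contextlib.contextmanager
def temporary_profile(func):
    """Install `func` via sys.setprofile and restore the previous hook on exit."""
    previous = sys.getprofile()
    sys.setprofile(func)
    try:
        yield
    finally:
        sys.setprofile(previous)


events = []


def record(frame, event, arg):
    # Collect event names such as "call", "return", "c_call".
    events.append(event)


def work():
    return sum(range(3))


before = sys.getprofile()
with temporary_profile(record):
    work()
restored = sys.getprofile() is before

print(restored, "call" in events)
```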

I will continue looking into this.

cc @mzeitlin11