pandas: REGR: __setitem__ with integer slices on Int/RangeIndex is broken (label instead of positional)
There’s a backward-incompatible change in pandas 1.0 that I didn’t find in the changelog. I might have just overlooked it though.
import numpy as np
import pandas as pd
X = pd.DataFrame(np.zeros((100, 1)))
X[-4:] = 1
X
In pandas 0.25.3 or lower, this sets the last four entries of X to 1 and leaves all the others zero. In pandas 1.0, it sets all entries of X to 1. I assume it’s a change between indexing axis 0 and axis 1?
About this issue
- State: closed
- Created 4 years ago
- Comments: 15 (12 by maintainers)
Commits related to this issue
- REG: DataFrame.__setitem__(slice) is positional, closes #31469 — committed to jbrockmendel/pandas by jbrockmendel 4 years ago
I think this is caused by https://github.com/pandas-dev/pandas/pull/27383 (cc @jbrockmendel), specifically:
This is about positional indexing, so there is no “out of range label”. The -4 means start from the fourth-last element and go to the end.
Again, I agree this is surprising behaviour. You would think it is label-based indexing, but it is not. I already described this 5 years ago in #9595.
Some examples to illustrate this:
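A minimal sketch of such a check, not the snippets originally posted in this thread. The frame `df` and its labels 10..19 are made up for illustration; the labels are chosen to differ from the positions so that a positional slice and a label-based slice would give visibly different results.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: integer labels 10..19 sit at positions 0..9.
df = pd.DataFrame({"a": np.arange(10)}, index=np.arange(10, 20))

# Integer slices in DataFrame.__getitem__ select rows by position.
tail = df[-4:]    # last four rows
middle = df[2:4]  # rows at positions 2 and 3

print(tail.index.tolist())    # [16, 17, 18, 19]
print(middle.index.tolist())  # [12, 13]

# The same holds when the index is a RangeIndex.
df_range = pd.DataFrame({"a": np.arange(10)}, index=pd.RangeIndex(10, 20))
tail_range = df_range[-4:]
print(tail_range.index.tolist())  # [16, 17, 18, 19]
```

If the slice were label-based, `df[2:4]` would be empty here, since no labels lie between 2 and 4.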
Those examples are for __getitem__, and clearly work positionally if you look at the index of the results (on both 0.25 and 1.0, and for both Int64Index and RangeIndex). So it is __setitem__ that is broken in 1.0.0.
Thanks for the report.
Seems this doesn’t affect .iloc:
Will look into it.
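The .iloc remark above can be sketched as follows. This reuses the reporter's `X` and assumes only that .iloc is purely positional; it is an illustration of how to request the pre-1.0 result explicitly, not a statement about how the regression was fixed.

```python
import numpy as np
import pandas as pd

X = pd.DataFrame(np.zeros((100, 1)))

# .iloc is explicitly positional, so this sets only the last four rows.
X.iloc[-4:] = 1

changed = int((X[0] == 1).sum())
print(changed)  # 4
```

Writing the assignment through .iloc also makes the positional intent obvious to readers who would expect X[-4:] to be label-based.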
I wonder if it’s related to #31449 but I’m not using a multi-index.