pandas: Confusing (possibly buggy) IntervalIndex behavior

In the above, I have a region that I’m querying for with a partially overlapping interval. The query succeeds when the interval is partially overlapping until it doesn’t, throwing the key error:
KeyError Traceback (most recent call last)
/Users/alex/Documents/GarNet/venv/lib/python3.6/site-packages/pandas/core/indexing.py in _has_valid_type(self, key, axis)
1433 if not ax.contains(key):
-> 1434 error()
1435 except TypeError as e:
/Users/alex/Documents/GarNet/venv/lib/python3.6/site-packages/pandas/core/indexing.py in error()
1428 raise KeyError("the label [%s] is not in the [%s]" %
-> 1429 (key, self.obj._get_axis_name(axis)))
1430
KeyError: 'the label [(5409951, 5409965]] is not in the [index]'
I think this is particularly confusing because there doesn’t seem to be any prominent difference between the locs that succeed and the loc that fails as far as I can tell. I know we had discussed loc’s behavior in this context but I’m not sure we came to a conclusion.
By the way, my larger question is about how to find intersections between two IntervalIndex. It seems like the find_intersections function didn’t make it into this release @jreback? Let me know! =]
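For the intersection question, here is a minimal sketch of one possible workaround, not the find_intersections function mentioned above. It assumes a pandas version where IntervalIndex.overlaps accepts a scalar Interval (0.24 or later, as far as I know). The helper name find_overlapping_pairs and the sample intervals are made up for illustration.

```python
import pandas as pd


def find_overlapping_pairs(left_index, right_index):
    """Return sorted (i, j) positions where left_index[i] overlaps right_index[j].

    Hypothetical helper: loops over right_index and asks left_index which of
    its intervals overlap each one, so it is O(len(right_index)) vectorised calls.
    """
    pairs = []
    for j, interval in enumerate(right_index):
        mask = left_index.overlaps(interval)  # boolean array, one entry per left interval
        pairs.extend((int(i), j) for i in mask.nonzero()[0])
    return sorted(pairs)


# Made-up sample data; both indexes are closed on the right, like (a, b].
regions = pd.IntervalIndex.from_tuples([(0, 10), (20, 30), (40, 50)], closed="right")
queries = pd.IntervalIndex.from_tuples([(5, 25), (30, 35), (45, 60)], closed="right")

pairs = find_overlapping_pairs(regions, queries)
print(pairs)
```

Note that (20, 30] and (30, 35] share only the endpoint 30, which the second interval excludes, so that pair is not reported.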
About this issue
- State: closed
- Created 7 years ago
- Comments: 37 (36 by maintainers)
chiming in on this, as we are heavy users of postgres range types and range operators as a powerful abstraction for time series data
as has already been mentioned, the key verbs are contains and overlaps, both on element and range level and in both directions:
examples from the postgres docs:
int4range(2,4) @> int4range(2,3)
'[2011-01-01,2011-03-01)'::tsrange @> '2011-01-10'::timestamp
int4range(2,4) <@ int4range(1,7)
42 <@ int4range(1,7)
int8range(3,7) && int8range(4,12)
int8range(1,10) << int8range(100,110)
numrange(1.1,2.2) -|- numrange(2.2,3.3)
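To make the verbs concrete, here is a rough pure-Python sketch of what those operators compute. It assumes half-open ranges [lower, upper), which is the Postgres default for these constructors, and it ignores empty and unbounded ranges. The Range class and the function names are hypothetical and are neither Postgres nor pandas API.

```python
from typing import NamedTuple


class Range(NamedTuple):
    """Hypothetical half-open range [lower, upper)."""
    lower: float
    upper: float


def contains_range(outer, inner):
    """outer @> inner (equivalently inner <@ outer)."""
    return outer.lower <= inner.lower and inner.upper <= outer.upper


def contains_point(rng, point):
    """rng @> point (equivalently point <@ rng)."""
    return rng.lower <= point < rng.upper


def overlaps(a, b):
    """a && b: the two ranges share at least one point."""
    return a.lower < b.upper and b.lower < a.upper


def strictly_left_of(a, b):
    """a << b: every point of a lies before every point of b."""
    return a.upper <= b.lower


def adjacent(a, b):
    """a -|- b: the ranges touch without overlapping."""
    return a.upper == b.lower or b.upper == a.lower


# The examples above, minus the timestamp one:
print(contains_range(Range(2, 4), Range(2, 3)))          # int4range(2,4) @> int4range(2,3)
print(contains_range(Range(1, 7), Range(2, 4)))          # int4range(2,4) <@ int4range(1,7)
print(contains_point(Range(1, 7), 42))                   # 42 <@ int4range(1,7)
print(overlaps(Range(3, 7), Range(4, 12)))               # int8range(3,7) && int8range(4,12)
print(strictly_left_of(Range(1, 10), Range(100, 110)))   # int8range(1,10) << int8range(100,110)
print(adjacent(Range(1.1, 2.2), Range(2.2, 3.3)))        # numrange(1.1,2.2) -|- numrange(2.2,3.3)
```

All of these print True except the 42 <@ int4range(1,7) case, matching the results given in the postgres docs.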
now that we have Intervals in pandas (very grateful for bringing that feature @jreback!) I have already tinkered around with some mappers for going between Postgres and pandas; maybe that is too db-specific but def have a great interest in seeing more Interval type functionality in Pandas and helping out with this
came across this library: https://github.com/AlexandreDecan/python-intervals
looks to have some interesting interval semantics
cc @jschendel
An interval covers another interval if all points in the second interval are found in the first interval.
An interval overlaps another interval if there exist any points found in both intervals.
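A small sketch of how the two definitions differ for scalar intervals. Interval.overlaps is existing pandas API (0.24 or later, as far as I know). The covers helper is hypothetical, written here only to illustrate the definition, and it does not treat degenerate or empty intervals specially. The region endpoints are made up, and only the query (5409951, 5409965] comes from the KeyError above.

```python
import pandas as pd


def covers(outer, inner):
    """True if every point of `inner` is also in `outer` (hypothetical helper).

    At a shared endpoint, `outer` must be closed there unless `inner` is open there.
    """
    left_ok = outer.left < inner.left or (
        outer.left == inner.left and (outer.closed_left or not inner.closed_left)
    )
    right_ok = inner.right < outer.right or (
        outer.right == inner.right and (outer.closed_right or not inner.closed_right)
    )
    return left_ok and right_ok


# Made-up region; the query sticks out past its right edge.
region = pd.Interval(5409900, 5409960, closed="right")
query = pd.Interval(5409951, 5409965, closed="right")
inner = pd.Interval(5409910, 5409950, closed="right")

print(region.overlaps(query))  # True: the two intervals share points
print(covers(region, query))   # False: the query extends beyond the region
print(covers(region, inner))   # True: inner lies entirely within the region
```

A partially overlapping query like the one in the original report satisfies "overlaps" but not "covers", which is why it matters which of the two a lookup uses.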