pandas: Intermittent error fetching value from multi-indexed dataframe
- I have searched the [pandas] tag on StackOverflow for similar questions.
- I have asked my usage related question on StackOverflow.
- See this question and this question.
Question about pandas
I have a dataframe with a MultiIndex from which I am attempting to access a row. However, the lookup fails intermittently, on roughly one in ten runs. I see this behaviour both locally and in production. The dataframe can be recreated with the following:
from datetime import timedelta, date
import pandas as pd
import pytz
from pandas import Timestamp
utc = pytz.UTC
data = {
    "date": [
        Timestamp("2020-06-03 15:00:00").replace(tzinfo=utc).replace(minute=59, second=59, microsecond=999999),
        Timestamp("2020-06-03 15:00:00").replace(tzinfo=utc).date(),
        Timestamp("2020-06-03 15:00:00").replace(tzinfo=utc).date(),
        Timestamp("2020-06-03 15:00:00").replace(tzinfo=utc).date() + timedelta(days=1),
        Timestamp("2020-06-03 15:00:00").replace(tzinfo=utc).date() + timedelta(days=1),
        Timestamp("2020-06-03 15:00:00").replace(tzinfo=utc).date() + timedelta(days=2),
        Timestamp("2020-06-03 15:00:00").replace(tzinfo=utc).date() + timedelta(days=2),
    ],
    "a": ["alpha", "alpha", "beta", "alpha", "beta", "alpha", "beta"],
    "b": [100, 100, 100, 100, 100, 100, 100],
    "c": [100, 100, 100, 100, 100, 100, 100],
    "d": [0, 0, 0, 0, 0, 0, 0],
    "e": [100, 100, 100, 100, 100, 100, 100],
    "f": [0, 0, 0, 0, 0, 0, 0],
    "g": [0, 0, 0, 0, 0, 0, 0],
    "h": ["A", "B", "C", "D", "E", "F", "G"],
}
df = pd.DataFrame(data)
breakdown = df.groupby(["date", "a"]).sum()
done = breakdown.loc[date(2020, 6, 3), "beta"]
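One detail of the snippet above that may be worth checking: the "date" list holds one timezone-aware Timestamp followed by plain datetime.date values. Whether that mix has anything to do with the failure is only a guess. A minimal sketch that counts the element types, using tz="UTC" in place of the pytz call (an assumption made to keep the sketch free of the pytz import):

```python
from collections import Counter
from datetime import timedelta

from pandas import Timestamp

# Same seven values as the "date" list in the report. tz="UTC" replaces the
# pytz-based .replace(tzinfo=utc) call (an assumption, to avoid the pytz import).
base = Timestamp("2020-06-03 15:00:00", tz="UTC")
day = base.date()  # plain datetime.date, no time or timezone
dates = [
    base.replace(minute=59, second=59, microsecond=999999),
    day,
    day,
    day + timedelta(days=1),
    day + timedelta(days=1),
    day + timedelta(days=2),
    day + timedelta(days=2),
]

# Count how many values of each Python type go into the column.
type_counts = Counter(type(value).__name__ for value in dates)
print(type_counts)  # Counter({'date': 6, 'Timestamp': 1})
```

So the first group key is a Timestamp while the key used in the lookup, date(2020, 6, 3), is a datetime.date.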
I do not know whether this behaviour is caused by incorrect usage on my part or by a bug.
I originally encountered the issue on pandas 1.1.4 (**) with the error:
TypeError: '<' not supported between instances of 'int' and 'slice'
and on 1.2.1 (*) I see the same intermittent failures, but with the error message:
KeyError: 'beta'
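To put a number on "intermittent", the lookup can be repeated and the outcomes tallied. The following is only a sketch: run_once and tally are hypothetical helper names, the frame is trimmed to the "date", "a" and "b" columns for brevity, and tz="UTC" stands in for the pytz call. It also loops inside a single process; if the variation comes from per-process state (an assumption), each attempt would need a fresh interpreter instead.

```python
from collections import Counter
from datetime import date, timedelta

import pandas as pd


def run_once():
    """Build a trimmed version of the frame and attempt the lookup once.

    Returns "ok" on success, otherwise the name of the exception type.
    """
    try:
        base = pd.Timestamp("2020-06-03 15:00:00", tz="UTC")
        day = base.date()
        df = pd.DataFrame(
            {
                "date": [
                    base.replace(minute=59, second=59, microsecond=999999),
                    day,
                    day,
                    day + timedelta(days=1),
                    day + timedelta(days=1),
                    day + timedelta(days=2),
                    day + timedelta(days=2),
                ],
                "a": ["alpha", "alpha", "beta", "alpha", "beta", "alpha", "beta"],
                "b": [100] * 7,
            }
        )
        breakdown = df.groupby(["date", "a"]).sum()
        breakdown.loc[date(2020, 6, 3), "beta"]
    except Exception as exc:  # broad on purpose: only the error type is recorded
        return type(exc).__name__
    return "ok"


def tally(runs=20):
    """Repeat the attempt and count each outcome."""
    return Counter(run_once() for _ in range(runs))


outcomes = tally(20)
print(outcomes)
```

The result depends on the installed pandas version, so no expected output is shown here.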
Version info
(*)
INSTALLED VERSIONS
------------------
commit : 9d598a5e1eee26df95b3910e3f2934890d062caa
python : 3.9.0.final.0
python-bits : 64
OS : Darwin
OS-release : 19.6.0
Version : Darwin Kernel Version 19.6.0: Mon Aug 31 22:12:52 PDT 2020; root:xnu-6153.141.2~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8
pandas : 1.2.1
numpy : 1.19.5
pytz : 2020.5
dateutil : 2.8.1
pip : 21.0
setuptools : 49.6.0.post20210108
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None
(**)
INSTALLED VERSIONS
------------------
commit : 67a3d4241ab84419856b84fc3ebc9abcbe66c6b3
python : 3.9.0.final.0
python-bits : 64
OS : Darwin
OS-release : 19.6.0
Version : Darwin Kernel Version 19.6.0: Mon Aug 31 22:12:52 PDT 2020; root:xnu-6153.141.2~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8
pandas : 1.1.4
numpy : 1.20.0
pytz : 2021.1
dateutil : 2.8.1
pip : 20.3.1
setuptools : 49.6.0.post20201009
Cython : None
pytest : 6.2.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 1.1
pymysql : None
psycopg2 : 2.8.6 (dt dec pq3 ext lo64)
jinja2 : 2.11.3
IPython : 7.20.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.4
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.6.0
sqlalchemy : 1.3.17
tables : None
tabulate : 0.8.7
xarray : None
xlrd : None
xlwt : None
numba : None
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 40 (33 by maintainers)
Gotcha, I'm on the totally wrong track. Thanks.