pandas: BUG: read_excel not accepting encoding on 1.1.0

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.



Code Sample, a copy-pastable example

import datetime

import pandas as pd

# filepath is the path to the workbook; the report does not give it.
def date_parse(value: str) -> datetime.datetime:
    return datetime.datetime.strptime(value.strip(), "%d/%m/%y")

df = pd.read_excel(
    filepath,
    encoding="iso-8859-1",
    usecols="B",
    date_parser=date_parse,
    skiprows=4,
)

Problem description

TypeError: read_excel() got an unexpected keyword argument 'encoding'

Expected Output

No error, as in previous versions.
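For code that has to run against both older and newer pandas releases, one option is to drop any keyword the installed function does not declare before making the call. The following is only a sketch: `supported_kwargs` is a hypothetical helper and `read_excel_like` is a stand-in with an explicit signature, neither of which is part of pandas. It assumes that silently dropping `encoding` is acceptable.

```python
import inspect


def supported_kwargs(func, **kwargs):
    """Keep only the keyword arguments that `func` declares by name (hypothetical helper)."""
    declared = inspect.signature(func).parameters
    return {name: value for name, value in kwargs.items() if name in declared}


def read_excel_like(path, usecols=None, skiprows=None):
    """Stand-in for a reader with a strict signature; not the pandas function."""
    return {"path": path, "usecols": usecols, "skiprows": skiprows}


# "encoding" is not declared by the stand-in, so it is filtered out.
options = supported_kwargs(
    read_excel_like, encoding="iso-8859-1", usecols="B", skiprows=4
)
result = read_excel_like("report.xlsx", **options)
```

With the real library, the same helper would be called as `supported_kwargs(pd.read_excel, ...)`. Simply deleting `encoding=` from the call is the more direct fix.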

Output of pd.show_versions()

------------------
commit           : d9fff2792bf16178d4e450fe7384244e50635733
python           : 3.8.2.final.0
python-bits      : 64
OS               : Linux
OS-release       : 5.4.0-42-generic
Version          : #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : pt_BR.UTF-8

pandas           : 1.1.0
numpy            : 1.18.5
pytz             : 2020.1
dateutil         : 2.8.1
pip              : 20.1
setuptools       : 41.2.0
Cython           : None
pytest           : 6.0.1
hypothesis       : None
sphinx           : 3.2.0
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : 2.11.2
IPython          : None
pandas_datareader: None
bs4              : None
bottleneck       : None
fsspec           : None
fastparquet      : None
gcsfs            : None
matplotlib       : None
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pytables         : None
pyxlsb           : None
s3fs             : None
scipy            : None
sqlalchemy       : None
tables           : None
tabulate         : None
xarray           : None
xlrd             : 1.2.0
xlwt             : None
numba            : None

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 45 (31 by maintainers)

Most upvoted comments

The encoding argument is very important. I would like "encoding=" to come back.

The default encoding of the Windows edition of Excel is Shift JIS (cp932). The exported .xlsx file shows character corruption when handled by pandas on macOS, because the OS default encoding is UTF-8.

I am not sure whether the developers see these comments, since this issue is already closed. The encoding is very important for the program to interpret special characters, so I would be pleased if there were a solution to this problem (either the encoding argument or any alternative would be fine for me). Thank you.

I have accented characters in my Excel file. When reading from CSV I can pass encoding='utf-8'. Why would pandas remove this ability when reading Excel files? This is a blocker for me now: all of my accented words are garbled when read in from Excel. Then, after my wrangling, I write to CSV and get UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f3c0' in position 69: character maps to <undefined>. What is the workaround that allowed you to remove this functionality?

Note that my Excel file is an .xlsx export from Google Sheets.

I am disappointed nobody responded to Matthew over 2 months ago, what hope is there for me? Since this is closed should I open a new issue? I will wait a couple of hours.
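On the UnicodeEncodeError when writing to CSV mentioned above: the 'charmap' wording is what Python reports when a single-byte codec such as cp1252 cannot represent a character. Assuming the output file was opened with a platform default like that (the comment does not say), passing an explicit UTF-8 encoding on the write side is a likely fix. A small standard-library sketch, with made-up sample rows:

```python
import csv
import os
import tempfile

# Made-up sample data: an accented name and the emoji from the error message.
rows = [["name", "note"], ["José", "basketball \U0001f3c0"]]

# cp1252 is a charmap-based codec and has no mapping for the emoji.
try:
    rows[1][1].encode("cp1252")
    cp1252_error = None
except UnicodeEncodeError as exc:
    cp1252_error = str(exc)

# Writing with an explicit UTF-8 encoding avoids the platform default.
out_path = os.path.join(tempfile.mkdtemp(), "out.csv")
with open(out_path, "w", encoding="utf-8", newline="") as handle:
    csv.writer(handle).writerows(rows)

# Reading back with the same encoding returns the original text.
with open(out_path, encoding="utf-8", newline="") as handle:
    round_trip = list(csv.reader(handle))
```

With pandas, the equivalent would be passing `encoding="utf-8"` to `DataFrame.to_csv`.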

Closing as a non-issue. Providing encoding as a keyword argument is unnecessary and only "worked" before because we silently accepted extra kwargs and discarded them; it never had any functionality and should be removed from your code.
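To illustrate the point about silently accepted kwargs, here is a toy sketch in plain Python. `lenient_reader` and `strict_reader` are made-up stand-ins, not the pandas implementation: the first swallows unknown keywords through `**kwds` and never looks at them, the second has an explicit signature and rejects them.

```python
def lenient_reader(path, sheet_name=0, **kwds):
    # Unknown keywords land in kwds and are never used.
    return f"read {path!r}, sheet {sheet_name!r}"


def strict_reader(path, sheet_name=0):
    # No **kwds: an unknown keyword raises TypeError at call time.
    return f"read {path!r}, sheet {sheet_name!r}"


# The lenient version accepts encoding= but the result does not depend on it.
with_encoding = lenient_reader("book.xlsx", encoding="iso-8859-1")
without_encoding = lenient_reader("book.xlsx")

# The strict version fails in the same way as the report above.
strict_error = None
try:
    strict_reader("book.xlsx", encoding="iso-8859-1")
except TypeError as exc:
    strict_error = str(exc)
```

In this sketch the lenient call returns the same value with or without `encoding`, which is the sense in which the argument "worked" without doing anything.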

The workaround suggested here seems to work.

import pandas as pd
import xlrd
wb = xlrd.open_workbook(path, encoding_override='CORRECT_ENCODING')
df = pd.read_excel(wb)

The encoding parameter is needed! This is still relevant as of 2023.