pandas: BUG: ImportError: Missing optional dependency 'openpyxl'. Use pip or conda to install openpyxl.

I am running the following code from Jupyter notebook.

dataDir =  r'D:\\'
files = glob(os.path.join(dataDir, '*.xlsx'))
print(files)
if os.path.isfile(files[0]):
    print('ok')
df = pd.read_excel(files[0], engine='openpyxl')

which prints: ['D:\\\\file_index_all.xlsx', 'D:\\\\file_index_all2.xlsx'] ok

But I get the following error.

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-52-5dc0ef3ce47b> in <module>
      6 if os.path.isfile(files[0]):
      7     print('ok')
----> 8 df = pd.read_excel(files[0], engine='openpyxl')

c:\users\ranab\miniconda3\lib\site-packages\pandas\io\excel\_base.py in read_excel(io, sheet_name, header, names, index_col, usecols, squeeze, dtype, engine, converters, true_values, false_values, skiprows, nrows, na_values, keep_default_na, verbose, parse_dates, date_parser, thousands, comment, skipfooter, convert_float, mangle_dupe_cols, **kwds)
    302 
    303     if not isinstance(io, ExcelFile):
--> 304         io = ExcelFile(io, engine=engine)
    305     elif engine and engine != io.engine:
    306         raise ValueError(

c:\users\ranab\miniconda3\lib\site-packages\pandas\io\excel\_base.py in __init__(self, io, engine)
    819         self._io = stringify_path(io)
    820 
--> 821         self._reader = self._engines[engine](self._io)
    822 
    823     def __fspath__(self):

c:\users\ranab\miniconda3\lib\site-packages\pandas\io\excel\_openpyxl.py in __init__(self, filepath_or_buffer)
    482             Object to be parsed.
    483         """
--> 484         import_optional_dependency("openpyxl")
    485         super().__init__(filepath_or_buffer)
    486 

c:\users\ranab\miniconda3\lib\site-packages\pandas\compat\_optional.py in import_optional_dependency(name, extra, raise_on_missing, on_version)
     90     except ImportError:
     91         if raise_on_missing:
---> 92             raise ImportError(msg) from None
     93         else:
     94             return None

ImportError: Missing optional dependency 'openpyxl'.  Use pip or conda to install openpyxl.

pandas version: pd.__version__
1.0.1 Also one other thing I noticed despite having openpyxl 3.0.7 I can’t import it in jupyter notebook but in pycharm

import openpyxl
print(openpyxl.__version__)

shows 3.0.7

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 18 (8 by maintainers)

Most upvoted comments

This is the pd.show_versions() from pycharm:

INSTALLED VERSIONS
------------------
commit           : b5958ee1999e9aead1938c0bba2b674378807b3d
python           : 3.6.13.final.0
python-bits      : 64
OS               : Windows
OS-release       : 10
Version          : 10.0.17763
machine          : AMD64
processor        : Intel64 Family 6 Model 142 Stepping 10, GenuineIntel
byteorder        : little
LC_ALL           : None
LANG             : None
LOCALE           : None.None

pandas           : 1.1.5
numpy            : 1.19.2
pytz             : 2021.1
dateutil         : 2.8.1
pip              : 21.1.1
setuptools       : 52.0.0.post20210125
Cython           : None
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : None
IPython          : None
pandas_datareader: None
bs4              : None
bottleneck       : None
fsspec           : None
fastparquet      : None
gcsfs            : None
matplotlib       : None
numexpr          : None
odfpy            : None
openpyxl         : 3.0.7
pandas_gbq       : None
pyarrow          : None
pytables         : None
pyxlsb           : None
s3fs             : None
scipy            : None
sqlalchemy       : None
tables           : None
tabulate         : None
xarray           : None
xlrd             : 1.0.0
xlwt             : None
numba            : None
None

It has openpyxl version 3.0.7 which is None in notebook(I just checked openpyxl package only)

Looks like you are using different environments