pandas: BUG: read_excel with multi-indexed column ignores index_col=None

From SO: http://stackoverflow.com/questions/34020061/excel-to-pandas-dataframe-using-first-column-as-index

@chris-b1 another one on the multi-index excel issues … 😃

Small test case. Content of the Excel file:

A	A	B	B
key	val	key	val
1	2	3	4
1	2	3	4

gives:

In [2]: pd.read_excel("test_excel_index_col.xlsx", header=[0,1], index_col=None)
Out[2]:
A     A    B
key val key  val
1     2    3   4
1     2    3   4

It’s not super clear in the formatting of the dataframe, but the [1, 1] is the index and [A, key] are seen as the level names of the multi-indexed columns.
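To make the difference easier to see, here is a minimal sketch that builds both frames by hand rather than reading the Excel file. The names `expected` and `observed` are illustrative only, and the `observed` frame is reconstructed from the printed output above, so treat it as an approximation of what read_excel returned at the time.

```python
import pandas as pd

rows = [[1, 2, 3, 4], [1, 2, 3, 4]]

# What index_col=None would be expected to give: four data columns,
# a default integer index, and no names on the column levels.
expected = pd.DataFrame(
    rows,
    columns=pd.MultiIndex.from_tuples(
        [("A", "key"), ("A", "val"), ("B", "key"), ("B", "val")]
    ),
)

# Reconstruction of the reported result: the first column is consumed as the
# index and its two header cells become the column level names.
observed = pd.DataFrame(
    [row[1:] for row in rows],
    index=[row[0] for row in rows],
    columns=pd.MultiIndex.from_tuples(
        [("A", "val"), ("B", "key"), ("B", "val")], names=["A", "key"]
    ),
)

print(expected)
print(observed)
print(list(observed.index), list(observed.columns.names))
```

Printing `observed.index` and `observed.columns.names` separately shows the two things the default formatting blends together.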

About this issue

State: closed
Created 9 years ago
Reactions: 5
Comments: 18 (9 by maintainers)

Commits related to this issue

BUG: Don't extract header names if none specified Closes gh-11733. — committed to forking-repos/pandas by gfyoung 6 years ago
BUG: Don't extract header names if none specified (#23703) Closes gh-11733. — committed to pandas-dev/pandas by gfyoung 6 years ago
BUG: Don't extract header names if none specified (#23703) Closes gh-11733. — committed to tm9k1/pandas by gfyoung 6 years ago
BUG: Don't extract header names if none specified (#23703) Closes gh-11733. — committed to Pingviinituutti/pandas by gfyoung 6 years ago

Most upvoted comments

Vote for index_col=False to fix this

the-rccg on Jul 10, 2018

As of now I still see the same issue. When using multiple header rows with read_excel, pandas always assigns the first column as the index.

araespahan on Dec 15, 2017

Per the review comment from @jreback, here is the proposed API for read_excel. It is the same as read_csv.

index_col : int or sequence or False, default None

Column (0-indexed) to use as the row labels of the DataFrame. If a sequence is given, those columns will be combined into a MultiIndex. If None (default), pandas will use the first column as the index. If False, force pandas to not use the first column as the index (row names).
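For reference, this is roughly how the existing read_csv treats `index_col=False`, which is the behavior the proposal mirrors. It is only a sketch of read_csv on a small made-up CSV string with trailing delimiters (the variable names are illustrative) and says nothing about how read_excel ended up being implemented.

```python
from io import StringIO

import pandas as pd

# Each data row has one more field than the header (note the trailing commas).
data = "a,b,c\n4,apple,bat,\n8,orange,cow,"

# Default (index_col=None): read_csv infers that the first column is the index.
implicit = pd.read_csv(StringIO(data))

# index_col=False: the first column stays a regular data column.
forced = pd.read_csv(StringIO(data), index_col=False)

print(implicit)
print(forced)
```

With `index_col=False` the values 4 and 8 remain in column `a` and the frame gets a default integer index, which is the kind of opt-out being requested here for read_excel.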

Updates are here:

https://github.com/stephenrauch/pandas/commit/9b37ff94643296d489498138c79bb0244aaa3f79

So if this proposed API looks ok, I will do the PR.

stephenrauch on Mar 14, 2017