pandas: ENH/BUG: Rename of MultiIndex DataFrames does not work

xref #14139 for empty MI

Hi everybody,

in the current version, renaming of MultiIndex DataFrames does not work. Let's take the following example:

import datetime as DT
import pandas as pd

df = pd.DataFrame({
    'Branch': 'A A A A A B'.split(),
    'Buyer': 'Carl Mark Carl Joe Mark Carl'.split(),
    'Quantity': [1, 3, 5, 8, 9, 3],
    'Date': [
        DT.datetime(2013, 9, 1, 13, 0),
        DT.datetime(2013, 9, 1, 13, 5),
        DT.datetime(2013, 10, 1, 20, 0),
        DT.datetime(2013, 10, 3, 10, 0),
        DT.datetime(2013, 12, 2, 12, 0),
        DT.datetime(2013, 12, 2, 14, 0),
    ],
})

and the following query:

test_df = df[df['Buyer'].isin(['Carl', 'Mark'])].set_index('Buyer', append=True)[['Date']].unstack(['Buyer'])

Now, the following renaming does not work:

test_df.rename(columns={('Date', 'Carl'): 'Carl'}, inplace=True)

Thanks in advance

Andy

About this issue

  • State: open
  • Created 11 years ago
  • Comments: 25 (16 by maintainers)

Most upvoted comments

Just sharing a workaround: renaming of tuples works for flattened indices.

df = pd.DataFrame([[1,2,3],[3,4,5],[5,6,7], [7,8,9]])
df.columns = pd.MultiIndex.from_tuples([('i','a'),('i','b'),('ii','a')])

# Alternative 1
df.columns = df.columns.to_flat_index()
df = df.rename(columns={('i','b'):('i','c')})
df.columns = pd.MultiIndex.from_tuples(df.columns)

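# Alternative 2 (starting again from the original MultiIndex columns)
i = df.columns.get_loc(('i','b'))
cols = df.columns.to_flat_index().tolist()  # an Index is immutable, so edit a list copy
cols[i] = ('i','c')
df.columns = pd.MultiIndex.from_tuples(cols)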
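
The flatten-and-rebuild idea above can be wrapped in a small helper. This is only a sketch: `rename_column_tuples` is a hypothetical name, not a pandas function, and the level names `outer` and `inner` are made up for the demo. It passes `names=` explicitly because rebuilding with `MultiIndex.from_tuples` alone would drop the level names.

```python
import pandas as pd


def rename_column_tuples(frame, mapping):
    """Return a copy of `frame` whose MultiIndex column tuples are renamed.

    Hypothetical helper: `mapping` maps full column tuples to new tuples of
    the same length; tuples missing from `mapping` are left untouched.
    """
    # Iterating a MultiIndex yields one tuple per column
    new_tuples = [mapping.get(col, col) for col in frame.columns]
    out = frame.copy()
    out.columns = pd.MultiIndex.from_tuples(new_tuples, names=frame.columns.names)
    return out


demo = pd.DataFrame([[1, 2, 3], [3, 4, 5]])
demo.columns = pd.MultiIndex.from_tuples(
    [('i', 'a'), ('i', 'b'), ('ii', 'a')], names=['outer', 'inner']
)
renamed = rename_column_tuples(demo, {('i', 'b'): ('i', 'c')})
print(list(renamed.columns))
```

The helper returns a new frame rather than modifying its argument, so the original columns stay available for comparison.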

A multi-index rename does not work at all at the moment. The issue is how you rename only part of a level,

e.g. red -> orange, or abspx -> foo. How should I do this?

df.rename(columns={'red' : 'orange' }, level=1)
df.rename(columns={'abspx' : 'foo'},level=0)

There is no way to do the following (well, it doesn’t work), although it does make sense:

df.rename(columns={('abspx','red') : ('foo','orange')})
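
For reference, a minimal sketch of the per-level form, assuming a pandas version in which `DataFrame.rename` accepts a `level` argument. The frame and the extra labels (`other`, `blue`) are made up for illustration. Note that this renames every matching label on the chosen level, so it cannot target the single pair `('abspx', 'red')`.

```python
import pandas as pd

# Hypothetical frame with two column levels
frame = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    columns=pd.MultiIndex.from_tuples(
        [('abspx', 'red'), ('abspx', 'blue'), ('other', 'red')]
    ),
)

# Rename a label on the inner level only
inner = frame.rename(columns={'red': 'orange'}, level=1)
# Rename a label on the outer level only
outer = frame.rename(columns={'abspx': 'foo'}, level=0)

print(list(inner.columns))
print(list(outer.columns))
```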

Just stumbled over this problem. The delicate part is that the function will always return successfully even if errors="raise" is passed.

Thanks to normanius for the workaround. If a fix is not that easy, maybe the docstring could be extended with a warning?

Is there any news on this? This bug is a constant annoyance when trying to do complex aggregations. Since the ‘agg’ method doesn’t have a way to assign new names to the aggregate columns, the recommended method (AFAIK) is to first aggregate and then rename. But something like

df.groupby('A').agg({'B': ['mean', 'median'], 'C': ['min', 'max']})

returns a multi-index which I then cannot rename. I actually had to write my own wrapper function for aggregations as a workaround. (Happy to share the code if there’s interest.)
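
One way to sidestep the rename, sketched under the assumption of pandas 0.25 or later, is named aggregation, where each output column gets its name directly in `agg`. The sample data and the output names (`B_mean` and so on) are invented for illustration. Joining the tuple parts is an alternative when the nested-dict form is kept.

```python
import pandas as pd

data = pd.DataFrame({'A': ['x', 'x', 'y'], 'B': [1.0, 3.0, 5.0], 'C': [2, 4, 6]})

# Named aggregation: keyword = (source column, aggregation function)
named = data.groupby('A').agg(
    B_mean=('B', 'mean'),
    B_median=('B', 'median'),
    C_min=('C', 'min'),
    C_max=('C', 'max'),
)

# Alternative: keep the dict form and flatten the resulting MultiIndex
flat = data.groupby('A').agg({'B': ['mean', 'median'], 'C': ['min', 'max']})
flat.columns = ['_'.join(col) for col in flat.columns]

print(list(named.columns))
print(list(flat.columns))
```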