pandas: pivot_table not displaying values columns in expected order

Code Sample, a copy-pastable example if possible

# Your code here
column_order = ['Year', 'Month', 'Currency', 'Total Net', 'Fees']
df_fills = df_fills.reindex_axis(column_order, axis = 1)

pd.pivot_table(df_fills, values = ['Total Net', 'Fees'], index = ['Year', 'Month'], 
               columns = ['Currency'], aggfunc = np.sum, margins = True, fill_value = 0)

Problem description

When creating the dataframe, Fees column comes first (it’s from an external data set), while Total Net comes second (it’s calculated). I reordered them using reindex_axis and when asking Python to show the dataframe, I get the expected order. However, when creating a pivot table, Fees always comes first, no matter what.

Expected Output

pivot_table should display columns of values in the order entered in the function.

Output of pd.show_versions()

# Paste the output here pd.show_versions() here INSTALLED VERSIONS ------------------ commit: None python: 2.7.13.final.0 python-bits: 64 OS: Darwin OS-release: 16.7.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: None.None

pandas: 0.20.2 pytest: 3.0.7 pip: 9.0.1 setuptools: 27.2.0 Cython: 0.25.2 numpy: 1.12.1 scipy: 0.19.0 xarray: None IPython: 5.3.0 sphinx: 1.5.6 patsy: 0.4.1 dateutil: 2.6.0 pytz: 2017.2 blosc: None bottleneck: 1.2.1 tables: 3.3.0 numexpr: 2.6.2 feather: None matplotlib: 2.0.2 openpyxl: 2.4.7 xlrd: 1.0.0 xlwt: 1.2.0 xlsxwriter: 0.9.6 lxml: 3.7.3 bs4: 4.6.0 html5lib: 0.999 sqlalchemy: 1.1.9 pymysql: None psycopg2: None jinja2: 2.9.6 s3fs: None pandas_gbq: None pandas_datareader: None

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Comments: 25 (8 by maintainers)

Most upvoted comments

import pandas as pd

tuples_list = [(“John”, “Foo”, 47, 173), (“Michael”, “Bar”, 33, 182)] df = pd.DataFrame(tuples_list, columns=[“firstname”, “lastname”, “age”, “height”]) print(df.pivot_table(index=[“firstname”, “lastname”], values=[“age”, “height”])) print(df.pivot_table(index=[“firstname”, “lastname”], values=[“height”, “age”]))

Create a re-index for your pivot data. See my example below.

import pandas as pd

tuples_list = [("John", "Foo", 47, 173), ("Michael", "Bar", 33, 182)]
df = pd.DataFrame(tuples_list, columns=["firstname", "lastname", "age", "height"])
my_report = df.pivot_table(index=["firstname", "lastname"], values=["age", "height"])

new_order= ["height", "age"]
my_report = my_report.reindex(new_order, axis=1)
my_report

Is someone working on this? I’m not convinced that alphabetical ordering is desirable compared to just taking the order of the values supplied.

After further investigation with other column names, I have determined that the default behavior of pivot_table() is to display the contents of the value parameter in alphabetical order. I will look into making a code contribution to allow for user-defined sorting.

It seems that pivot_table explicitly sorts the index: https://github.com/pandas-dev/pandas/blob/d49a2446e0709f5f3a0c41589a95fccd81ced8cc/pandas/core/reshape/pivot.py#L219-L220

Since v1.3.0, a new parameter sort=True is added to pivot_table. Should we disable index sorting when sort=False is explicitly passed?

I agree with the general sentiment in this discussion. In my case, I have a report where I would like to have the output dataframe showing QTY (quantity), followed by AMT (amount). When I pass {‘QTY’: ‘sum’, ‘AMT’: ‘sum’} to pd.pivot_table, I expected the output dataframe to maintain this order. Currently, pd.pivot_table sorts the output columns alphabetically, putting AMT before QTY, which is not how usually such variables are presented.

This is really not expected behavior. Many students on my courses are asking the same question. It is very confusing behavior. Sorting columns in order - as they are specified in pivot-table - would be much more reasonable & intuitive.

I wanted to add, that not only columns - are currently sorted alphabetically (in output), but also applied functions, if they are specified in list like here: df.pivot_table(index='Region', aggfunc={'OrderAmount': ['sum', 'count', 'mean']})

Final columns will be in different order - 'count', 'mean', 'sum' (i.e. alphabetically again, not in specified order)

the solution from princeinzion is the way, I just want to add that depending on df you are working with , yo could need to add the level when reindexing:

my_report = df.pivot_table(index=[ “ticker”], values=[“c_within_l1”, “c_beyond_l1”, “c_beyond_l2”, “c_beyond_l3”], aggfunc=[np.mean]) new_order=[“c_within_l1”, “c_beyond_l1”, “c_beyond_l2”, “c_beyond_l3”] my_report = my_report.reindex([ new_order, axis=1, level=1) my_report

in this case if you don’t add the level, it throws the tuple error

princeinzion … thanks a lot . reindex resolved the issue

I am having the same issue, here’s a simple example code:

import pandas as pd

tuples_list = [("John", "Foo", 47, 173), ("Michael", "Bar", 33, 182)]
df = pd.DataFrame(tuples_list, columns=["firstname", "lastname", "age", "height"])
print(df.pivot_table(index=["firstname", "lastname"], values=["age", "height"]))
print(df.pivot_table(index=["firstname", "lastname"], values=["height", "age"]))

Both pivot_tables return the same output, however I’d expect the second one to have the height and age columns swapped. If I change the order in ‘index=’ field, it will be reflected in the resulting pivot_table