ggpy: ggplot doesn't work with latest pandas dataframe

Here’s the error I get. It is caused by a change in the pandas DataFrame API:

'DataFrame' object has no attribute 'sort'

About this issue

  • State: open
  • Created 7 years ago
  • Reactions: 9
  • Comments: 24

Most upvoted comments

It is October already and pandas advanced to version 0.20.3. This problem should be fixed as it renders ggplot unusable for a lot of use-cases.

This is actually already fixed in the code (https://github.com/yhat/ggpy/commit/9d00182343eccca6486beabd256e8c89fb0c59e8). If you run

$ pip uninstall ggplot
$ pip install git+git://github.com/yhat/ggpy.git@9d00182343eccca6486beabd256e8c89fb0c59e8 --no-cache

it should work.

@chethanjjj Though it’s an awesome plotting library, for me it’s the Grammar in R’s ggplot2 that matters, so Seaborn won’t suffice. If you’re in my camp, try Plotnine, which I found from this tutorial from @pteehan.

On line 602 of ggplot/ggplot.py, I changed the line from

fill_levels = self.data[[fillcol_raw, fillcol]].sort(fillcol_raw)[fillcol].unique()

to

fill_levels = self.data[[fillcol_raw, fillcol]].sort_values(by=fillcol_raw)[fillcol].unique()

and it worked for my use case. Hope that helps…
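
For anyone who wants to check the replacement call outside of ggplot, here is a minimal sketch of the same expression on a toy frame. The column names `raw` and `label` are made up for illustration and stand in for `fillcol_raw` and `fillcol`; `DataFrame.sort` is gone as of pandas 0.20, and `sort_values(by=...)` is the method that replaces it.

```python
import pandas as pd

# Toy data; "raw" and "label" are hypothetical stand-ins for fillcol_raw / fillcol.
df = pd.DataFrame({"raw": [3, 1, 2, 1], "label": ["c", "a", "b", "a"]})

# Old call (fails on pandas >= 0.20): df[["raw", "label"]].sort("raw")
# Replacement: sort by the raw column, then take the labels in that order.
fill_levels = df[["raw", "label"]].sort_values(by="raw")["label"].unique()

print(list(fill_levels))
```

The labels come out ordered by the raw column, which appears to be what the original line relied on.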

In case it helps anyone, I received a similar error when trying to import ggplot:

Traceback (most recent call last):
  File "scratch.py", line 3, in <module>
    from ggplot import *
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ggplot/__init__.py", line 19, in <module>
    from .geoms import geom_area, geom_blank, geom_boxplot, geom_line, geom_point, geom_jitter, geom_histogram, geom_density, geom_hline, geom_vline, geom_bar, geom_abline, geom_tile, geom_rect, geom_bin2d, geom_step, geom_text, geom_path, geom_ribbon, geom_now_its_art, geom_violin, geom_errorbar, geom_polygon
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ggplot/geoms/__init__.py", line 1, in <module>
    from .geom_abline import geom_abline
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ggplot/geoms/geom_abline.py", line 1, in <module>
    from .geom import geom
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ggplot/geoms/geom.py", line 3, in <module>
    from ..ggplot import ggplot
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ggplot/ggplot.py", line 13, in <module>
    from .aes import aes
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ggplot/aes.py", line 11, in <module>
    from . import utils
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ggplot/utils.py", line 81, in <module>
    pd.tslib.Timestamp,
AttributeError: 'module' object has no attribute 'tslib'

Here are my package versions:

➜  sandbox-py pip show ggplot
Name: ggplot
Version: 0.11.5
Summary: ggplot for python
Home-page: https://github.com/yhat/ggplot/
Author: Greg Lamp
Author-email: greg@yhathq.com
License: BSD
Location: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages
Requires: six, statsmodels, brewer2mpl, matplotlib, scipy, patsy, pandas, cycler, numpy
Required-by: 
You are using pip version 10.0.1, however version 19.0.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
➜  sandbox-py pip show pandas
Name: pandas
Version: 0.24.1
Summary: Powerful data structures for data analysis, time series, and statistics
Home-page: http://pandas.pydata.org
Author: None
Author-email: None
License: BSD
Location: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages
Requires: python-dateutil, pytz, numpy
Required-by: statsmodels, plotnine, mizani, ggplot, geopandas
You are using pip version 10.0.1, however version 19.0.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

I first tried this, but it did not resolve my error:

$ pip uninstall ggplot
$ pip install git+git://github.com/yhat/ggpy.git@9d00182343eccca6486beabd256e8c89fb0c59e8 --no-cache

As @asbhat put it, “There hasn’t been an update to master since Nov 2016. At this point I think it’s safe to say this package isn’t supported anymore”, so I switched over to plotnine, which (as far as I can tell) has the same API as ggplot. Life is good again.
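
For those who would rather patch the installed package than switch, the `tslib` AttributeError in the traceback above comes from ggplot/utils.py referring to `pd.tslib.Timestamp`, a module path that pandas 0.24.1 no longer exposes. The same class is available under the public name `pd.Timestamp`. The sketch below only shows that the public name works for a type check; what utils.py actually does with the value around line 81 is not visible in the traceback, so the `date_types` tuple is a hypothetical stand-in.

```python
import pandas as pd

# Hypothetical stand-in for whatever utils.py builds around line 81.
# pd.Timestamp is the public name; pd.tslib.Timestamp is the path that fails.
date_types = (pd.Timestamp,)

stamp = pd.Timestamp("2019-03-01")
is_date = isinstance(stamp, date_types)
print(is_date)
```

If this holds for your install, replacing `pd.tslib.Timestamp` with `pd.Timestamp` in utils.py may get the import past that line, though other removed APIs could surface afterwards.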

Any news regarding this issue?

If people are interested in playing whack-a-mole through the Python source, in addition to jmcarpenter2's comment (https://github.com/yhat/ggpy/issues/612#issuecomment-344025852), there is another instance of sort in ggplot/stats/stat_smooth.py.

Change

smoothed_data = smoothed_data.sort('x')

to

smoothed_data = smoothed_data.sort_values('x')
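
To make the whack-a-mole less tedious, something like the following sketch can list the remaining `.sort(` calls in a source tree. It assumes Python 3 (for `pathlib`), the function name `find_sort_calls` is made up here, and the match is purely textual, so it will also flag legitimate `list.sort()` calls that need no change. Each hit still has to be checked by hand.

```python
import re
from pathlib import Path

# Matches ".sort(" but not ".sort_values(" or ".sort_index(".
SORT_CALL = re.compile(r"\.sort\(")


def find_sort_calls(root):
    """Return (path, line number, stripped line) for every textual .sort( call under root."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if SORT_CALL.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling `find_sort_calls(...)` on the directory where ggplot is installed (the site-packages/ggplot path from `pip show ggplot`) should report the two lines mentioned in this thread along with any others.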