pandas: TypeError: unhashable type: 'dict' when using apply/transform?

Hello!

I am quite puzzled by some inconsistencies when using apply. Consider this simple example:

import pandas as pd

idx=[pd.to_datetime('2012-02-01 14:00:00'),
     pd.to_datetime('2012-02-01 14:01:00'),
     pd.to_datetime('2012-03-05 14:04:00'),
     pd.to_datetime('2012-03-05 14:01:00'),
     pd.to_datetime('2012-03-10 14:02:00'),
     pd.to_datetime('2012-03-11 14:07:50')
     ]

test=pd.DataFrame({'value1':[1,2,3,4,5,6],
                   'value2':[10,20,30,40,50,60],
                   'groups' : ['A','A','A','B','B','B']},
    index=idx)

test
Out[22]: 
                    groups  value1  value2
2012-02-01 14:00:00      A       1      10
2012-02-01 14:01:00      A       2      20
2012-03-05 14:04:00      A       3      30
2012-03-05 14:01:00      B       4      40
2012-03-10 14:02:00      B       5      50
2012-03-11 14:07:50      B       6      60

Now, this WORKS

test.groupby('groups').apply(lambda x: x.resample('1 T', label='left', closed='left').apply(
        {'value1' : 'mean',
         'value2' : 'mean'}))

but this FAILS

test.groupby('groups').apply(
        {'value1' : 'mean',
         'value2' : 'mean'})

Traceback (most recent call last):

  File "<ipython-input-24-741304ecf105>", line 3, in <module>
    'value2' : 'mean'})

  File "C:\Users\\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\groupby.py", line 696, in apply
    func = self._is_builtin_func(func)

  File "C:\Users\\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\base.py", line 730, in _is_builtin_func
    return self._builtin_table.get(arg, arg)

TypeError: unhashable type: 'dict'

This worked in earlier versions of pandas. What is the new syntax? A very useful variant of the code above that I used to rely on was:

test.groupby('groups').apply(
        {'newname1' : {'value1' : 'mean'},
         'newname2' : {'value2' : 'mean'}})

to rename the new variables on the fly. Is this still possible now? Is this a bug?

Many thanks!
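For reference, here is a minimal sketch of how the dict-based aggregation in the question can be expressed with agg rather than apply. It assumes pandas 0.25 or later, where named aggregation (keyword = (column, function)) is available for renaming output columns; the frame is rebuilt from the example above so the snippet runs on its own.

```python
import pandas as pd

# Rebuild the example frame from the question.
idx = pd.to_datetime([
    '2012-02-01 14:00:00', '2012-02-01 14:01:00', '2012-03-05 14:04:00',
    '2012-03-05 14:01:00', '2012-03-10 14:02:00', '2012-03-11 14:07:50',
])
test = pd.DataFrame({'value1': [1, 2, 3, 4, 5, 6],
                     'value2': [10, 20, 30, 40, 50, 60],
                     'groups': ['A', 'A', 'A', 'B', 'B', 'B']},
                    index=idx)

# A dict mapping column -> function is accepted by agg on a GroupBy object.
means = test.groupby('groups').agg({'value1': 'mean', 'value2': 'mean'})

# Named aggregation (assumes pandas >= 0.25): rename the outputs on the fly.
renamed = test.groupby('groups').agg(
    newname1=('value1', 'mean'),
    newname2=('value2', 'mean'),
)

print(means)
print(renamed)
```

Both calls return one row per group, so they cover the aggregation case only; they do not broadcast results back to the original index the way transform does.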

About this issue

  • State: open
  • Created 7 years ago
  • Reactions: 2
  • Comments: 29 (11 by maintainers)

Most upvoted comments

@WillAyd Just so I’m clear, you’re suggesting something like:

test.groupby('groups').transform({'value1': [np.mean, max], 'value2': max})

which should return something like:

                    value1     value2    
                      mean max   max
2012-02-01 14:00:00      2   3    30
2012-02-01 14:01:00      2   3    30
2012-03-05 14:04:00      2   3    30
2012-03-05 14:01:00      5   6    60
2012-03-10 14:02:00      5   6    60
2012-03-11 14:07:50      5   6    60

Is there any reason the documentation says that transform takes a dictionary, when it doesn’t?

Maybe pandas could provide a more helpful error message (with a link to the groupby.transform/apply docs), and perhaps raise NotImplementedError in the short term.

Can this then be taken as a feature request, so that the same kind of apply/transform usage can be used on both DataFrame and GroupBy objects?

@zeromh The referenced documentation, where transform accepts lists and dictionaries, is for the DataFrame transform method, not its groupby counterpart. The docstring for the groupby version correctly states that it accepts a function:

Signature: gb.transform(func, *args, **kwargs)
Docstring:
Call function producing a like-indexed DataFrame on each group and
return a DataFrame having the same indexes as the original object
filled with the transformed values

Parameters
----------
f : function
    Function to apply to each subframe
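To make the distinction concrete, the sketch below contrasts the two methods on the frame from the question: DataFrame.transform called with a dict of per-column functions, and the groupby version called with a single function. The lambdas are illustrative choices, not taken from the thread.

```python
import pandas as pd

idx = pd.to_datetime([
    '2012-02-01 14:00:00', '2012-02-01 14:01:00', '2012-03-05 14:04:00',
    '2012-03-05 14:01:00', '2012-03-10 14:02:00', '2012-03-11 14:07:50',
])
test = pd.DataFrame({'value1': [1, 2, 3, 4, 5, 6],
                     'value2': [10, 20, 30, 40, 50, 60],
                     'groups': ['A', 'A', 'A', 'B', 'B', 'B']},
                    index=idx)

# DataFrame.transform: a dict of column -> function is accepted.
frame_result = test[['value1', 'value2']].transform(
    {'value1': lambda x: x * 2, 'value2': lambda x: x + 1}
)

# GroupBy transform: a single function, applied within each group.
centered = test.groupby('groups')[['value1', 'value2']].transform(
    lambda x: x - x.mean()
)

print(frame_result)
print(centered)
```

Both results are indexed like the original frame; only the DataFrame method is given a dict here.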