cudf: [BUG] Calling sum on a boolean column returns a boolean

import cudf
import numpy as np
s = cudf.Series(np.arange(100000))
(s > 1).sum()
True

I believe this should return something like 99998 (the number of True values), and that we should have separate all and any methods that behave closer to what sum does currently.
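For comparison, here is a minimal sketch of the behavior the report seems to be asking for, using NumPy on the CPU rather than cudf. The variable names are illustrative only, and the sketch assumes cudf would aim to match NumPy here:

```python
import numpy as np

# Same data as in the report, but as a NumPy array instead of a cudf.Series.
values = np.arange(100000)
mask = values > 1            # boolean array: False for 0 and 1, True otherwise

# sum() on a boolean array counts the True entries and returns an integer.
count = mask.sum()

# any()/all() are the reductions that return a single boolean.
has_any = mask.any()         # at least one element is True
all_true = mask.all()        # every element is True

print(count, has_any, all_true)
```

In NumPy the count is 99998, since only the values 0 and 1 fail the comparison; any is True and all is False.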

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 16 (14 by maintainers)

Most upvoted comments

I think the NumPy approach of promoting the result to the platform integer size will cause the least surprise (and the fewest overflows) for users. Do you agree, @kkraus14?
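To illustrate the NumPy behavior referred to in this comment, here is a small sketch. It only shows what NumPy does on the CPU and is not a statement about how cudf implements or will implement its reductions:

```python
import numpy as np

# 300 ones stored as int8, whose maximum representable value is 127.
small = np.ones(300, dtype=np.int8)

# By default, NumPy accumulates narrow integer (and boolean) inputs in the
# default platform integer, so the total does not overflow.
total = small.sum()

# Forcing the accumulator to stay int8 wraps around silently.
wrapped = small.sum(dtype=np.int8)

# Boolean input is promoted the same way, so the sum is a count.
flags = np.array([True, False, True, True])
flag_count = flags.sum()

print(total, total.dtype)
print(wrapped, wrapped.dtype)
print(flag_count, flag_count.dtype)
```

The default sum gives 300 in a wider integer type, while the int8 accumulator wraps to 44 (300 modulo 256). That silent wraparound is the kind of surprise the promotion rule avoids.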