cudf: [BUG] test_reductions.py test case failures
Describe the bug
Several test cases in test_reductions.py fail in a cuDF build at commit 2cf8a535831dc555020f9668f3b8fd1dc8fb4dcb.
Steps/Code to reproduce bug
site-packages/cudf/tests$ pytest -v test_reductions.py
FAILED tests:
test_reductions.py::test_sum[int8-2] FAILED
test_reductions.py::test_sum[int8-200] FAILED [ 18%]
test_reductions.py::test_sum[int8-10000] FAILED [ 19%]
test_reductions.py::test_min[int8-2] FAILED [ 74%]
test_reductions.py::test_min[int8-3] FAILED [ 74%]
test_reductions.py::test_min[int8-127] FAILED [ 75%]
test_reductions.py::test_min[int8-128] FAILED [ 75%]
test_reductions.py::test_min[int8-129] FAILED [ 76%]
test_reductions.py::test_min[int8-200] FAILED [ 76%]
test_reductions.py::test_min[int8-10000] FAILED
Expected behavior
All test cases in test_reductions.py should pass.
Environment details (please complete the following information):
cuDF built at commit 2cf8a535831dc555020f9668f3b8fd1dc8fb4dcb, using the GCC 7.3.0 conda toolchain on Ubuntu 18.04 (ppc64le).
About this issue
- State: closed
- Created 5 years ago
- Comments: 15 (15 by maintainers)
Aha! Good detective work @pradghos. Feel free to open a PR with your fix and link this issue.
I think I have found the fix for the problem.
File: reduce.pyx. The function apply_reduce basically handles all the reduce operations. get_scalar_value() returns the GDF_INT8 scalar data, which is just a char in the ctypedef union gdf_data. On POWER, char is interpreted as unsigned char. The fixes below worked fine -
I saw a similar problem in libcudf as well -
@jrhemstad @harrism @kkraus14 : Pls let me know if I can create PR for the same.
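To illustrate the diagnosis above (this is not the fix from the PR), here is a minimal Python sketch of what happens when an int8 reduction result is read back through a plain C char. The helper read_scalar_as_char is hypothetical and only mimics the platform difference with ctypes; it is not part of cuDF or libcudf.

```python
import ctypes


def read_scalar_as_char(value: int, char_is_signed: bool) -> int:
    """Hypothetical helper: mimic reading an 8-bit result through a C `char`.

    char_is_signed=True models x86 (plain char is signed);
    char_is_signed=False models POWER (plain char is unsigned).
    """
    ctype = ctypes.c_byte if char_is_signed else ctypes.c_ubyte
    # ctypes integer types wrap silently, like a C cast would.
    return ctype(value).value


data = [-3, 5, 7]          # an int8 column with a negative minimum
expected = min(data)       # what the test compares against

on_x86 = read_scalar_as_char(expected, char_is_signed=True)
on_power = read_scalar_as_char(expected, char_is_signed=False)

print(expected, on_x86, on_power)  # -3 -3 253
```

Under this assumption, a negative min or sum comes back as a large positive value on POWER, so the comparison with the expected value fails.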
I see that all the failures are on the numpy.int8 type. That sounds suspiciously similar to the narrowing warnings we saw in https://github.com/rapidsai/cudf/issues/1544 caused by the difference in how x86 vs POWER interpret char.
I’d double check what gdf_dtype is being passed into libcudf for the numpy.int8 column.
This looks to me like the “expected” result is being interpreted as a signed integer, and the “actual” result is being interpreted as unsigned, because they have the same binary representation.
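A small sketch of that last point, using only the Python standard library: the same 8-bit pattern gives two different numbers depending on whether it is read as signed or unsigned. The function names as_signed and as_unsigned are made up for this example.

```python
def as_signed(byte: int) -> int:
    """Interpret an 8-bit pattern as a signed (two's complement) value."""
    return int.from_bytes(bytes([byte]), "little", signed=True)


def as_unsigned(byte: int) -> int:
    """Interpret the same 8-bit pattern as an unsigned value."""
    return int.from_bytes(bytes([byte]), "little", signed=False)


pattern = 0xFE  # bit pattern of the int8 value -2

print(as_signed(pattern))    # -2
print(as_unsigned(pattern))  # 254
```

Values from 0 to 127 look identical in both readings, so only results with the top bit set (negative int8 values) would show the mismatch.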
I’m looking into the issue.