cudf: [BUG] cuDF 10.0 RMM_FREE: __global__ function call is not configured
I conda installed rapids 0.10, but the kernel dies when I try to read a parquet file with either `cudf.read_parquet` or `dask_cudf.read_parquet`.
conda install -c rapidsai -c nvidia -c conda-forge rapids=0.10 rapids-xgboost dask python=3.7 cudatoolkit=10.0 ipykernel boto3 boto s3fs idna=2.7 PyYAML=3.13 urllib3=1.24.3
Jupyter log error:
terminate called after throwing an instance of 'thrust::system::system_error'
what(): rmm_allocator::deallocate(): RMM_FREE: __global__ function call is not configured
About this issue
- State: closed
- Created 5 years ago
- Reactions: 1
- Comments: 17 (14 by maintainers)
I believe this error is triggered by an exception that gets thrown in a destructor. If there's another thrust allocator that throws in its `deallocate` method, then it would have the same throws-within-destructor issue.

If you're willing to build cudf from source, one way to narrow down where the error is occurring is to instrument the `RMM_TRY` and `CUDA_TRY` macros to log `__FILE__` and `__LINE__` to stderr just before the throw. Hopefully that will show where the original error occurs before it gets obscured by the thrust system error, and that may shed light on what the real problem is.

Speaking of errors being thrown while cleaning up from an error: there are many places in the code that throw when a CUDA error occurs without clearing that error. As the stack unwinds and destructors are invoked, any destructor that also checks and throws on a CUDA error is going to trigger this type of issue. Is there a reason to leave the CUDA error pending if the exception being thrown already contains the detail of the CUDA error? A minimal sketch of both ideas is below. cc: @harrism @jrhemstad
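For illustration only, here is a minimal sketch of what such an instrumented error-check macro could look like. The macro name `CHECK_CUDA_DBG` and the logging format are hypothetical, not cudf's actual `CUDA_TRY`/`RMM_TRY` definitions; the point is simply to print `__FILE__`/`__LINE__` to stderr and to clear the pending error with `cudaGetLastError()` before throwing, so destructors that run during stack unwinding don't see a stale CUDA error and throw a second time.

```cpp
#include <cuda_runtime_api.h>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical instrumented version of a CUDA_TRY-style macro.
// Logs the failing file/line to stderr, clears the sticky error state
// with cudaGetLastError(), then throws with the error detail included.
#define CHECK_CUDA_DBG(call)                                                  \
  do {                                                                        \
    cudaError_t const status = (call);                                        \
    if (status != cudaSuccess) {                                              \
      std::fprintf(stderr, "CUDA error '%s' at %s:%d\n",                      \
                   cudaGetErrorString(status), __FILE__, __LINE__);           \
      cudaGetLastError(); /* clear pending error so later checks don't rethrow */ \
      throw std::runtime_error(std::string("CUDA error: ") +                  \
                               cudaGetErrorString(status));                   \
    }                                                                         \
  } while (0)

int main() {
  void* ptr = nullptr;
  try {
    // Intentionally oversized allocation to demonstrate the macro's output.
    CHECK_CUDA_DBG(cudaMalloc(&ptr, ~std::size_t{0}));
  } catch (std::exception const& e) {
    std::fprintf(stderr, "caught: %s\n", e.what());
  }
  return 0;
}
```

With this kind of instrumentation in place of the real macros, the stderr line should point at the file and line where the first CUDA error is detected, rather than only the later `thrust::system::system_error` raised from `rmm_allocator::deallocate()`.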
RMM_TRYandCUDA_TRYmacros to log the__FILE__and__LINE__to stderr just before the throw call. Then hopefully it will show where the original error occurs that gets obscured by the thrust system error, and that may shed light onto what the real problem is.Speaking of errors being thrown while cleaning up from an error, there are many places in the code that throw when a CUDA error occurs without clearing the error. As the stack gets unrolled and destructors invoked, any destructor that also checks and throws on a CUDA error is going to trigger this type of issue. Is there a reason to leave the CUDA error pending if the exception being thrown contains the detail of the CUDA error? cc: @harrism @jrhemstad