initialization-actions: Could not use cudf or cuml when rapids-runtime = DASK

I am trying to set up a Dataproc cluster with a GPU attached so that I can use cuml and cudf. I followed the instructions at https://github.com/GoogleCloudDataproc/initialization-actions/blob/master/rapids/README.md and was able to set up the cluster, with the NVIDIA driver installed successfully. But when I try

import cudf

it throws the following error:

TypeError: C function cuda.ccudart.cudaStreamSynchronize has wrong signature (expected __pyx_t_4cuda_7ccudart_cudaError_t (__pyx_t_4cuda_7ccudart_cudaStream_t), got cudaError_t (cudaStream_t)

I followed the instructions here: https://docs.rapids.ai/notices/rsn0020/ But after downgrading, another error shows up when importing cudf:

No module named 'pandas.core.arrays._arrow_utils'

The dask rapids installation version in rapids.sh is 22.04
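
Both errors look like version mismatches between cudf and the packages it depends on, so it may help to list what is actually installed on the node. The following is only a minimal sketch using the standard library: the distribution names in `PACKAGES` are assumptions and may not match how the packages are registered in the cluster's conda environment.

```python
from importlib import metadata

# Distribution names are assumptions; adjust them to match your environment.
PACKAGES = ("cudf", "cuml", "cuda-python", "pandas", "pyarrow", "dask")


def installed_versions(names=PACKAGES):
    """Return {distribution name: version string, or None if not found}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not registered in this Python environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not found'}")
```

Run it with the same Python interpreter that fails on `import cudf`, then compare the reported versions against the requirements of the RAPIDS release that rapids.sh installs.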

About this issue

  • State: open
  • Created 2 years ago
  • Comments: 26 (19 by maintainers)

Most upvoted comments

Hi @cjac , could you please update to work with latest dask-rapids v22.12?

I’m about to go on vacation, and I’m trying to put projects down. Can you open a new issue or better yet a GCP support case so I don’t lose track of the work item, please?

This issue is about the action not working. I think it’s working now, but not patched up to latest release. A separate issue would be appropriate.

Please remember to read the README I referenced. You are violating the guidance by using

--initialization-actions gs://goog-dataproc-initialization-actions-${REGION}/gpu/install_gpu_driver.sh,gs://goog-dataproc-initialization-actions-${REGION}/rapids/rapids.sh \