distributed: HLG serialization bug

What happened: Packing a high-level graph fails with TypeError: can not serialize 'numpy.uint16' object when using the master branches of dask and distributed.

What you expected to happen: Serialization should succeed.

Minimal Complete Verifiable Example:

import dask.array as da
import numpy as np
from distributed.protocol.highlevelgraph import highlevelgraph_pack

def fn(x, dt):
    return x.astype(dt)

arr = da.blockwise(fn, "x", da.ones(1000), "x", np.uint16(0), None, dtype=np.uint16)

highlevelgraph_pack(arr.__dask_graph__(), None, None)

Anything else we need to know?: This happens for a number of NumPy data types, not just uint16.
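A possible stopgap on affected versions, offered only as a sketch and not as the fix that eventually landed: keep NumPy scalars and dtypes out of the blockwise arguments by passing a plain Python stand-in such as the dtype name. The helper name as_plain_argument is hypothetical and is not part of dask or distributed. The snippet only shows that fn behaves the same with a string dtype. It is an assumption that this is enough to get through highlevelgraph_pack.

```python
import numpy as np


def as_plain_argument(value):
    """Hypothetical helper: swap a NumPy dtype or scalar for a built-in Python object."""
    if isinstance(value, np.dtype):
        return value.name      # e.g. dtype('uint16') -> "uint16"
    if isinstance(value, np.generic):
        return value.item()    # e.g. np.uint16(3) -> 3 (a plain int)
    return value


def fn(x, dt):
    # astype accepts a dtype name as well as a NumPy type
    return x.astype(dt)


# NumPy integer scalars are not subclasses of the built-in int, which is
# presumably why a serializer that only knows built-in types rejects them.
numpy_scalar_is_int = isinstance(np.uint16(0), int)

dt = as_plain_argument(np.dtype(np.uint16))
result = fn(np.ones(4), dt)
```

With this approach the blockwise call from the example would receive "uint16" in place of np.uint16(0), so the graph arguments contain only a built-in str.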

Environment:

  • Dask version: 2021.02.0+3.ga14e47f
  • Distributed version: 2021.02.0+1.g7d2a22f
  • Python version: 3.7.6
  • Operating System: Pop!_OS 18.04 LTS
  • Install method (conda, pip, source): pip

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 31 (17 by maintainers)

Most upvoted comments

This seems to have resolved the problem! Thanks @madsbk! I will be doing further testing over the coming week and will let you know if I encounter any further issues.

Awesome, thanks @JSKenyon

@jakirkham Will do! If not today, then tomorrow.

Yeah, NumPy 1.20 made some significant changes to dtypes. It might be worth retrying with 1.19 to see whether this is related to those changes.

Sure! Will let you know what happens.

I agree, this is a regression, and if it is something that is useful for people, we should support it 😃