distributed: BlockwiseIO is not always msgpack serializable

Multiple TypeErrors are raised during (de-)serialization in CI on master:

TypeError: can not serialize 'CreateArraySubgraph' object

E.g. https://travis-ci.org/github/dask/distributed/jobs/750189428

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 17 (13 by maintainers)

Most upvoted comments

This was reverted in https://github.com/dask/dask/pull/6995, though I'm leaving this issue open to discuss reintegrating the original change with fixes.

I think it would be fine to revert dask/dask#6931 for the time being. A “Step 1” fix may be easy and “Step 2” perhaps less so, but either way reverting would give us the space to talk through it.

Short term, I wonder if it makes sense to temporarily revert https://github.com/dask/dask/pull/6931? That way CI for distributed would be unblocked until we have a fix like https://github.com/dask/distributed/issues/4374#issuecomment-748217714 implemented. I’d be happy to learn whether we think that fix is tractable and could be implemented soon-ish, but with folks beginning to take off for the holidays it would be nice to leave distributed’s test suite in a runnable state.

Does that seem reasonable to you @rjzamora? 🙂

@madsbk and I had a chat about this earlier today. It looks like we should be able to use dumps_task to serialize io_deps (after materializing the subgraphs). From there, it is probably just a matter of accounting for stringified key names during the injection of IO tasks in the resulting Blockwise graph.
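
To make the idea above concrete, here is a rough, standard-library-only sketch of the approach: materialize the lazy subgraph into a plain dict, serialize each task to bytes, and key the result by stringified names. Everything in it is an assumption for illustration: LazySubgraph is a hypothetical stand-in for something like CreateArraySubgraph, pickle stands in for dumps_task, and stringify_key only approximates how keys get stringified. It is not the actual dask or distributed code.

```python
import math
import pickle
from collections.abc import Mapping


class LazySubgraph(Mapping):
    """Hypothetical stand-in for an IO subgraph (e.g. CreateArraySubgraph).

    Tasks are generated on demand instead of being stored, so the object
    itself is not something a plain msgpack pass knows how to encode.
    """

    def __init__(self, name, func, nblocks):
        self.name = name
        self.func = func
        self.nblocks = nblocks

    def __getitem__(self, key):
        name, i = key
        if name != self.name or not 0 <= i < self.nblocks:
            raise KeyError(key)
        # A task in the usual (callable, *args) tuple form.
        return (self.func, i)

    def __iter__(self):
        return ((self.name, i) for i in range(self.nblocks))

    def __len__(self):
        return self.nblocks


def stringify_key(key):
    # Assumption: approximates key stringification; strings pass through.
    return key if isinstance(key, str) else str(key)


def pack_io_subgraph(subgraph):
    """Materialize the subgraph, then serialize every task to bytes.

    pickle.dumps is only a stand-in for dumps_task here. The result maps
    str -> bytes, which a msgpack-style encoder can handle.
    """
    materialized = dict(subgraph)
    return {stringify_key(k): pickle.dumps(t) for k, t in materialized.items()}


def unpack_io_subgraph(packed):
    """Inverse of pack_io_subgraph; keys stay stringified."""
    return {k: pickle.loads(v) for k, v in packed.items()}


packed = pack_io_subgraph(LazySubgraph("x", math.sqrt, 3))
restored = unpack_io_subgraph(packed)
```

Note that the restored graph is keyed by strings such as "('x', 0)" rather than by the original tuples, which is exactly the part that the injection of IO tasks into the Blockwise graph would need to account for.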