gsutil: TypeError: cannot pickle '_io.TextIOWrapper' object

gsutil -m -h "Cache-Control: public, max-age=31536000" cp -r test/** gs://some-bucket
Traceback (most recent call last):
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gsutil", line 21, in <module>
    gsutil.RunMain()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gsutil.py", line 124, in RunMain
    sys.exit(gslib.__main__.main())
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 424, in main
    return _RunNamedCommandAndHandleExceptions(
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 762, in _RunNamedCommandAndHandleExceptions
    _HandleUnknownFailure(e)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 620, in _RunNamedCommandAndHandleExceptions
    return command_runner.RunNamedCommand(command_name,
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command_runner.py", line 411, in RunNamedCommand
    return_code = command_inst.RunCommand()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 1201, in RunCommand
    self.Apply(_CopyFuncWrapper,
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1499, in Apply
    self._ParallelApply(
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1719, in _ParallelApply
    self._CreateNewConsumerPool(process_count, thread_count,
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1384, in _CreateNewConsumerPool
    p.start()
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
    return Popen(process_obj)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle '_io.TextIOWrapper' object

gsutil version: 4.47

About this issue

  • State: open
  • Created 4 years ago
  • Reactions: 63
  • Comments: 19

Most upvoted comments

gsutil does not work with Python 3.8. Force it to use Python 3.7 with something like:

export CLOUDSDK_PYTHON=/usr/bin/python3     # on mac
export CLOUDSDK_PYTHON=/usr/bin/python3.7   # on linux

Another workaround on macOS is to install Python 3.7 with Homebrew and point the Cloud SDK at it:

brew install python@3.7
export CLOUDSDK_PYTHON=/usr/local/opt/python@3.7/bin/python3
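Since the right path differs between machines, it may help to confirm which Python version a candidate path actually runs before exporting it as CLOUDSDK_PYTHON. The sketch below is only an illustration: `interpreter_version` is a hypothetical helper, not part of gsutil or the Cloud SDK, and it assumes the script itself runs on Python 3.7 or later.

```python
import subprocess
import sys


def interpreter_version(path):
    """Ask the interpreter at `path` for its (major, minor) version.

    Hypothetical helper for illustration; not part of gsutil.
    """
    result = subprocess.run(
        # The %-format form prints the same text on Python 2 and Python 3.
        [path, "-c", "import sys; print('%d %d' % sys.version_info[:2])"],
        capture_output=True,
        text=True,
        check=True,
    )
    major, minor = result.stdout.split()
    return int(major), int(minor)


if __name__ == "__main__":
    # Replace sys.executable with a candidate such as /usr/bin/python3.
    print(sys.executable, interpreter_version(sys.executable))
```

If the reported version is 3.8 or later on macOS, that path would be expected to hit the pickle error when -m is used with the gsutil versions reported in this thread.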

@aleb I’m not sure if this is specific to macOS Mojave, but the path for python3 for me was /usr/local/bin/python3. I couldn’t get it to work with python3 anyway, but forcing it to use 2.7 worked like a charm.

export CLOUDSDK_PYTHON=/usr/local/bin/python3      # did not work
export CLOUDSDK_PYTHON=/usr/bin/python2.7          # worked

From the link @caizixian provided:

Python 3 is preferred over Python 2. Note that gcloud requires Python version 2.7.x or 3.5 and up. Other Python tools shipped in the Cloud SDK do not support Python 3 and require Python 2.7.x,

The issue is still present with Cloud SDK 302.0.0 (gsutil 4.52) on macOS 10.15.6, with Python 3.8.5 installed from Homebrew.

It still exists now, but only with the multiprocessing flag; it runs fine without -m:

Google Cloud SDK 286.0.0
bq 2.0.55
core 2020.03.24
gsutil 4.48

Sorry for the delay. We are aware of this bug and are working on releasing this workaround soon: https://github.com/GoogleCloudPlatform/gsutil/pull/1107

export CLOUDSDK_PYTHON=/usr/bin/python2.7 will work! Setting export CLOUDSDK_PYTHON=/usr/bin/python3 or export CLOUDSDK_PYTHON=path/for/python3.7 will solve the current issue, but will then run into a module ‘sys’ has no attribute ‘maxint’ error.

One workaround is to use the Python 3 interpreter shipped with macOS, /usr/bin/python3, by setting the Cloud SDK interpreter path: https://cloud.google.com/sdk/gcloud/reference/topic/startup

Tracking this down, this error comes from a change in Python 3.8 in the multiprocessing library:

Changed in version 3.8: On macOS, the spawn start method is now the default. The fork start method should be considered unsafe as it can lead to crashes of the subprocess. See bpo-33725.

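On macOS with Python 3.8+, spawn is therefore used by default, since gsutil does not explicitly set a start method through either get_context or set_start_method. Under spawn, the process object is pickled before the child starts (the reduction.dump(process_obj, fp) frame in the traceback), and that fails because it references an open text stream.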
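To illustrate the failure mode, here is a minimal sketch that reproduces the same kind of TypeError with the standard library alone. `Worker` is a hypothetical stand-in for whatever gsutil object holds the open stream; it is not gsutil code, and the exact wording of the error message may vary between Python versions.

```python
import multiprocessing
import os
import pickle


class Worker:
    """Hypothetical stand-in for an object that holds an open text stream."""

    def __init__(self, stream):
        self.stream = stream  # an _io.TextIOWrapper, e.g. an open log file


def can_pickle(obj):
    """Return (True, None) if obj pickles, else (False, error message)."""
    try:
        pickle.dumps(obj)
        return True, None
    except TypeError as exc:
        return False, str(exc)


# Open a text-mode stream; os.devnull avoids creating a file on disk.
with open(os.devnull, "w") as stream:
    ok, message = can_pickle(Worker(stream))

# Start methods this platform supports; one can be requested explicitly
# with multiprocessing.get_context(name) instead of relying on the default.
available_methods = multiprocessing.get_all_start_methods()

if __name__ == "__main__":
    print(ok, message)
    print(available_methods)
```

With the fork start method the child inherits the parent's memory, so nothing is pickled and the open stream causes no error. That would explain why the same command works on Python 3.7 and on Linux.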

Updating gsutil solved the issue with Python 3.8.

I ran into the same issue. If you use another interpreter (Python 3.7, for instance), all is well. This is a problem specifically with Python 3.8.

Google Cloud SDK 281.0.0
beta 2019.05.17
bq 2.0.53
cloud-firestore-emulator 1.10.4
core 2020.02.14
gsutil 4.47

Another workaround would be to disable multiprocessing altogether when using Python 3.8. This can be done either by setting parallel_process_count=1 in the boto config file or by passing the option on the command line like this:

gsutil -o "GSUtil:parallel_process_count=1" -m cp .....

This will be relatively slow because it uses a single process; however, multithreading will still be on.
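If gsutil is called from a script, the option could be added only where it is needed. The following is a rough sketch, not an official recipe: `gsutil_command` is a hypothetical helper, and it assumes the interpreter running the script is the same one gsutil will use, which is not guaranteed when CLOUDSDK_PYTHON is set.

```python
import sys


def gsutil_command(cp_args, version_info=sys.version_info, platform=sys.platform):
    """Build an argv list for `gsutil -m cp`, limiting gsutil to one
    process on macOS with Python 3.8+ (hypothetical helper)."""
    args = ["gsutil"]
    if platform == "darwin" and tuple(version_info[:2]) >= (3, 8):
        # Same option as the command line above: one process, threads still on.
        args += ["-o", "GSUtil:parallel_process_count=1"]
    args += ["-m", "cp"] + list(cp_args)
    return args


if __name__ == "__main__":
    # The resulting list can be passed to subprocess.run(...).
    print(gsutil_command(["-r", "test", "gs://some-bucket"]))
```

Building the command as a list rather than a single string avoids shell quoting issues with the -o value.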

With such a strange “pickle” error, I didn’t expect to find my resolution so quickly. Thank you, @dinvlad!!

@dinvlad Thank you! Works perfectly!

@dinvlad That worked for me! Thank you so much