gsutil: gsutil fails sporadically on macOS

On my Mac (MacBookPro14,3; macOS High Sierra), gsutil (v4.34) fails sporadically with the following trace:

Traceback (most recent call last):
  File "/Users/<me>/google-cloud-sdk/platform/gsutil/gsutil", line 22, in <module>
    gsutil.RunMain()
  File "/Users/<me>/google-cloud-sdk/platform/gsutil/gsutil.py", line 117, in RunMain
    sys.exit(gslib.__main__.main())
  File "/Users/<me>/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 250, in main
    command_runner = CommandRunner()
  File "/Users/<me>/google-cloud-sdk/platform/gsutil/gslib/command_runner.py", line 146, in __init__
    self.command_map = self._LoadCommandMap()
  File "/Users/<me>/google-cloud-sdk/platform/gsutil/gslib/command_runner.py", line 152, in _LoadCommandMap
    __import__('gslib.commands.%s' % module_name)
  File "/Users/<me>/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 43, in <module>
    from gslib.utils import copy_helper
  File "/Users/<me>/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 168, in <module>
    if CheckMultiprocessingAvailableAndInit().is_available else None))
  File "/Users/<me>/google-cloud-sdk/platform/gsutil/gslib/utils/parallelism_framework_util.py", line 84, in __init__
    self.lock = manager.Lock()
  File "/Users/<me>/.pyenv/versions/2.7.15/lib/python2.7/multiprocessing/managers.py", line 670, in temp
    authkey=self._authkey, exposed=exp
  File "/Users/<me>/.pyenv/versions/2.7.15/lib/python2.7/multiprocessing/managers.py", line 733, in __init__
    self._incref()
  File "/Users/<me>/.pyenv/versions/2.7.15/lib/python2.7/multiprocessing/managers.py", line 783, in _incref
    conn = self._Client(self._token.address, authkey=self._authkey)
  File "/Users/<me>/.pyenv/versions/2.7.15/lib/python2.7/multiprocessing/connection.py", line 175, in Client
    answer_challenge(c, authkey)
  File "/Users/<me>/.pyenv/versions/2.7.15/lib/python2.7/multiprocessing/connection.py", line 437, in answer_challenge
    response = connection.recv_bytes(256)        # reject large message
IOError: bad message length

When I modify gslib/utils/parallelism_framework_util.py to disable multiprocessing, as is already done for Windows, gsutil works fine. I see there is a similar modification for Alpine Linux (https://github.com/GoogleCloudPlatform/gsutil/commit/d7493e711e5b1c1937d0e57a25544bfba2641eb4). Does something similar need to be done for macOS? I also raised the maximum number of open files, which seemed to help somewhat but did not entirely fix the problem.

To reproduce, run a gsutil command from the command line several times (even gsutil -v will break for me, but ls and cp seem to fail more often). At some point you should see the trace. I only see this problem on macOS; Ubuntu 18.04, for example, is fine.
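The reproduction steps above can be automated with a small script. This is only a sketch, not part of gsutil: the helper name count_failures and the default of 20 attempts are made up for illustration, and it assumes the failure can be recognized by the "bad message length" text from the trace appearing on stderr.

```python
import subprocess

# Text of the IOError at the bottom of the trace above.
ERROR_MARKER = "bad message length"


def count_failures(argv, attempts=20):
    """Run the command `attempts` times and count runs that hit the error.

    `argv` is the command as a list, e.g. ["gsutil", "-v"].
    (Hypothetical helper for illustration; not part of gsutil.)
    """
    failures = 0
    for _ in range(attempts):
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # decode output as text
        )
        # Count only non-zero exits whose stderr contains the marker,
        # so unrelated failures (e.g. network errors) are not counted.
        if proc.returncode != 0 and ERROR_MARKER in proc.stderr:
            failures += 1
    return failures
```

For example, count_failures(["gsutil", "-v"]) or count_failures(["gsutil", "ls"]) returns how many of the 20 runs ended with the trace. Because the failure is sporadic, a result of 0 from a short run does not prove the problem is gone.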

About this issue

  • State: open
  • Created 6 years ago
  • Reactions: 8
  • Comments: 28

Most upvoted comments

The above workaround occasionally failed for me, so I ended up editing the file below:

google-cloud-sdk/platform/gsutil/gslib/utils/parallelism_framework_util.py

In the definition of CheckMultiprocessingAvailableAndInit, I set:

multiprocessing_is_available = False

Now it works 100% of the time.

export CLOUDSDK_PYTHON=python3 in ~/.zshrc helped me.

I see the same issue on macOS 10.14.3.

  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/managers.py", line 240, in serve_client
    request = recv()
  File "/Users/z0033qh/Desktop/y/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 43, in <module>
    from gslib.utils import copy_helper
  File "google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 267, in <module>
    if CheckMultiprocessingAvailableAndInit().is_available else None))
  File "google-cloud-sdk/platform/gsutil/gslib/utils/parallelism_framework_util.py", line 85, in __init__
    self.dict = manager.dict()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/managers.py", line 667, in temp
    token, exp = self._create(typeid, *args, **kwds)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/managers.py", line 565, in _create
    conn = self._Client(self._address, authkey=self._authkey)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/connection.py", line 175, in Client
    answer_challenge(c, authkey)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/connection.py", line 437, in answer_challenge
    response = connection.recv_bytes(256)        # reject large message
IOError: bad message length

Setting multiprocessing_is_available = False works.

Thanks for the confirmation, @sahidurrahman. From everything I’ve seen, my best guess is that multiple Python installations and/or packages from multiple installations are trying to communicate with each other (i.e. the worker processes that we start up in our multiprocessing setup are somehow using different modules than the parent process). Either that, or multiprocessing is just buggy on macOS for Py2.7. On that note, we’re making good progress on the Python 3 compatibility project, so we should be able to get a Py3 version out relatively soon and see if this still happens in your environments when running on Py3 😃

I’ve been unable to reproduce this on my MBP 😦 I also only have one installation of Python set up on that machine (2.7.13), so that might be why.

Would anyone in this thread be willing to download the non-Cloud-SDK version of gsutil and try the same parallelized commands? I’m actually curious about two installation scenarios:

  1. Use pip to install gsutil in a new, isolated virtualenv running Python 2.7.X (this should prevent any potential mix-ups of Python versions or modules, if that’s somehow happening).
  2. Install gsutil to your system outside of a virtualenv, either via pip install --user or by downloading it from the tarball – while this may not prevent the mix-ups mentioned in #1, it will help me determine whether or not the potential issue below is happening:

I ask this because all of the errors I’ve seen for this thus far have been Cloud SDK installations, and a couple folks have mentioned that fiddling with python* aliases seems to fix it. I wonder if, somehow, the logic in the gcloud launcher script (at <cloud-sdk-root>/bin/gsutil) is picking the “wrong” Python version based on existing aliases? Alternatively, this might just happen if there’s something odd about the python path in your environment, resulting in a mix of either Python versions or module versions being loaded and trying to communicate with each other.
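One way to check for the kind of mix-up described above is to ask each candidate interpreter where it lives and which multiprocessing module it loads. The following is a rough diagnostic sketch, not something gsutil or the Cloud SDK provides; the function name describe_python_environment is invented for illustration. It only reads standard attributes and the CLOUDSDK_PYTHON environment variable mentioned earlier in this thread, and it avoids Python 3-only calls so that it can be run under both 2.7 and 3.

```python
import multiprocessing
import os
import sys


def describe_python_environment():
    """Collect facts that show which Python installation is running.

    (Hypothetical helper for illustration; not part of gsutil.)
    """
    return {
        # Path of the interpreter executing this script.
        "executable": sys.executable,
        # Short version string, e.g. "2.7.15".
        "version": sys.version.split()[0],
        # Installation prefix of this interpreter.
        "prefix": sys.prefix,
        # File the multiprocessing package was loaded from.
        "multiprocessing_module": multiprocessing.__file__,
        # Interpreter override read from the environment, or None if unset.
        "CLOUDSDK_PYTHON": os.environ.get("CLOUDSDK_PYTHON"),
    }


if __name__ == "__main__":
    for key, value in sorted(describe_python_environment().items()):
        print("%s: %s" % (key, value))
```

Running the script once with each interpreter on the machine (system Python, pyenv, Homebrew, and whatever CLOUDSDK_PYTHON points to) and comparing the executable and multiprocessing_module paths against the paths in the trace should show which installation gsutil actually used.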

If this still happens in both scenarios 1 and 2 above, I’m inclined to say this is a problem with the multiprocessing module on macOS that isn’t present on other systems, and will just wait and see if it’s been fixed in Python 3 (we’re currently working on PY3 support for gsutil). But if it still happens at that point, we can invest more time looking into this.

But regardless, in the meantime, a good workaround to get parallelism without multiple processes is to use multiple threads instead, i.e. by setting parallel_process_count and parallel_thread_count in your .boto file, or inline, e.g.:

gsutil -o 'GSUtil:parallel_process_count=1' -o 'GSUtil:parallel_thread_count=16' -m cp <src> <dst>
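For the .boto variant of the same workaround, the two settings go in the [GSUtil] section, matching the GSUtil: prefix used with -o above. The sketch below only renders that fragment as text so it can be pasted in by hand; the helper name render_boto_fragment is made up for illustration, and the values 1 and 16 are simply the ones from the inline command. It deliberately does not modify ~/.boto.

```python
import configparser
import io


def render_boto_fragment(process_count=1, thread_count=16):
    """Return INI text for the [GSUtil] parallelism settings.

    (Hypothetical helper for illustration; not part of gsutil.)
    """
    config = configparser.ConfigParser()
    config["GSUtil"] = {
        # One process: avoids the multiprocessing code path.
        "parallel_process_count": str(process_count),
        # Parallelism then comes from threads inside that process.
        "parallel_thread_count": str(thread_count),
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_boto_fragment())
```

If the .boto file already has a [GSUtil] section, add the two option lines to that section rather than creating a second one.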

I had to upgrade the asn1crypto library: pip install --upgrade asn1crypto

Not sure if this helps, but I just installed SDK version 257.0.0 and noticed that the boto setup failed during gcloud init. The error message was: "Error creating a default .boto configuration file. Please run [gsutil config -n] if you would like to create this file."

When I tried running that command, it failed with an error similar to the one reported here:

Traceback (most recent call last):
  File "/Users/timtrentham/google-cloud-sdk/platform/gsutil/gsutil", line 21, in <module>
    gsutil.RunMain()
  File "/Users/timtrentham/google-cloud-sdk/platform/gsutil/gsutil.py", line 124, in RunMain
    sys.exit(gslib.__main__.main())
  File "/Users/timtrentham/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 224, in main
    gslib.command.InitializeMultiprocessingVariables()
  File "/Users/timtrentham/google-cloud-sdk/platform/gsutil/gslib/command.py", line 349, in InitializeMultiprocessingVariables
    total_tasks = AtomicDict(manager=manager)
  File "/Users/timtrentham/google-cloud-sdk/platform/gsutil/gslib/utils/parallelism_framework_util.py", line 87, in __init__
    self.dict = manager.dict()
  File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/managers.py", line 667, in temp
    token, exp = self._create(typeid, *args, **kwds)
  File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/managers.py", line 565, in _create
    conn = self._Client(self._address, authkey=self._authkey)
  File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/connection.py", line 175, in Client
    answer_challenge(c, authkey)
  File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/connection.py", line 437, in answer_challenge
    response = connection.recv_bytes(256)        # reject large message
IOError: bad message length

After making sure to run pyenv shell system prior to running gsutil, it succeeds.