gsutil: gsutil breaks after updating to SDK 298 on OS X

SDK 297 was fine

+ gcloud version
Google Cloud SDK 298.0.0
beta 2020.06.19
bq 2.0.58
core 2020.06.19
gsutil 4.51
kubectl 2020.05.01

+ gsutil -m rsync -r -c -x '^\.|.*\.js\.map$' . gs://croquet.io/

WARNING: You have requested checksumming but your crcmod installation isn't
using the module's C extension, so checksumming will run very slowly. For help
installing the extension, please see "gsutil help crcmod".

Building synchronization state...
Starting synchronization...
module 'sys' has no attribute 'maxint'
CommandException: 1 files/objects could not be copied/removed.
+ echo 'Fixing metadata...'
Fixing metadata...
+ gsutil -m -q setmeta -h Content-Type:text/html -h 'Cache-Control:public, max-age=60' 'gs://croquet.io/**.html'
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/managers.py", line 749, in _callmethod
    conn = self._tls.connection
AttributeError: 'ForkAwareLocal' object has no attribute 'connection'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/local/google-cloud-sdk/platform/gsutil/gslib/command.py", line 2348, in run
    cls = copy.copy(class_map[caller_id])
  File "<string>", line 2, in __getitem__
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/managers.py", line 753, in _callmethod
    self._connect()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/managers.py", line 740, in _connect
    conn = self._Client(self._token.address, authkey=self._authkey)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/connection.py", line 487, in Client
    c = SocketClient(address)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/connection.py", line 614, in SocketClient
    s.connect(address)
ConnectionRefusedError: [Errno 61] Connection refused

This is on macOS Catalina 10.15.5:

$ gsutil version -l
gsutil version: 4.51
checksum: a4c57d9b2479f11efe1b0ffb6470c0c5 (OK)
boto version: 2.49.0
python version: 3.6.5 (v3.6.5:f59c0932b4, Mar 28 2018, 03:03:55) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
OS: Darwin 19.5.0
multiprocessing available: True
using cloud sdk: True
pass cloud sdk credentials to gsutil: True
config path(s): /Users/vanessa/.boto
gsutil path: /usr/local/google-cloud-sdk/bin/gsutil
compiled crcmod: False
installed via package manager: False
editable install: False

The same command works fine again after reverting to 297 that I had installed previously.

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 28
  • Comments: 33

Most upvoted comments

Still broken in Cloud SDK 300.0.0 on macOS 10.15.5 (Python 3.8.3).

ok, thanks. For anyone else running into the AttributeError: module 'gslib' has no attribute 'USER_AGENT' issue, rolling back to a previous version will fix the problem until 302.0.0 is released: gcloud components update --version 297.0.1

You might not need to go back that far, but I can verify that 291.0.1 works (at least for me).

@martindufort, because I had the same problem: you’ll need to update crcmod as stated before. On my machine™️, upgrading it globally with pip3 install -U crcmod helped. Might help the next soul who arrives here from Google…

@otter-in-a-suit Thanks for the information!

Regarding AttributeError: module 'gslib' has no attribute 'USER_AGENT' this is a known bug that we fixed after gsutil v4.51 was released in https://github.com/GoogleCloudPlatform/gsutil/commit/f8f00d01e8fb10d1d31cb15c4050536d1e900401 . The fix has been merged and it will be made available in the gcloud sdk binary in the next gsutil release.

@codefrau The module 'sys' has no attribute 'maxint' error is raised from the crcmod-osx module. This looks like a bug in https://github.com/gsutil-mirrors/crcmod-osx, which calls sys.maxint, and maxint doesn’t exist in Python 3. As a quick fix, you can try installing crcmod by following the steps here: https://cloud.google.com/storage/docs/gsutil/addlhelp/CRC32CandInstallingcrcmod#macos
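To illustrate the failure mode described above (a sketch only, not the actual crcmod-osx code): Python 3 removed sys.maxint, so any module that still reads it fails with this AttributeError. sys.maxsize is the usual Python 3 stand-in. The helper name max_native_int is made up for this example.

```python
import sys


def max_native_int():
    """Return sys.maxint on Python 2, falling back to sys.maxsize on Python 3."""
    # sys.maxint was removed in Python 3; getattr avoids the AttributeError.
    return getattr(sys, "maxint", sys.maxsize)


if __name__ == "__main__":
    print(hasattr(sys, "maxint"))  # False on Python 3
    print(max_native_int())
```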

This AttributeError: module 'gslib' has no attribute 'USER_AGENT' only happens when using the -m flag. It’s caused by a missing attribute (USER_AGENT in gslib/__init__.py).

I was able to fix it by manually merging this commit: https://github.com/GoogleCloudPlatform/gsutil/commit/f8f00d01e8fb10d1d31cb15c4050536d1e900401

Which simply adds the USER_AGENT variable back. Not sure why that is not in the official binary/archive.
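The shape of this failure can be reproduced without gsutil. The sketch below uses an empty stand-in module named gslib (not the real package) to show the error text and what a tolerant lookup would look like. The function names and the "apitools" base string are invented for illustration; the actual fix in the linked commit simply restores the variable.

```python
import types

# Hypothetical stand-in for a gslib package that lacks USER_AGENT.
fake_gslib = types.ModuleType("gslib")


def describe_missing_attribute(module):
    """Return the AttributeError text for module.USER_AGENT, or None if it exists."""
    try:
        module.USER_AGENT
    except AttributeError as err:
        return str(err)
    return None


def build_user_agent(module, base="apitools"):
    """Append module.USER_AGENT to base, tolerating a missing attribute."""
    return base + getattr(module, "USER_AGENT", "")


if __name__ == "__main__":
    print(describe_missing_attribute(fake_gslib))
    print(build_user_agent(fake_gslib))
```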

Another issue is specific to macOS and Python 3.8, which of course is a wonderful combination (I’m not bitter, you are!): https://bugs.python.org/issue33725 and https://github.com/GoogleCloudPlatform/gsutil/issues/961 give some hints. This can be resolved by upgrading Python or by just gluing it together and hoping for the best: https://github.com/python/cpython/pull/13603/commits/bc366964d2dabcf14427604a2322fa6644023132
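For context, bpo-33725 is the issue under which Python 3.8 switched the default multiprocessing start method on macOS from fork to spawn. A quick way to see which method a given interpreter uses is sketched below. It is diagnostic only, the function name is made up, and it does not change gsutil's behavior.

```python
import multiprocessing
import sys


def start_method_report():
    """Describe the interpreter and its default multiprocessing start method."""
    return {
        "python": "%d.%d.%d" % sys.version_info[:3],
        "platform": sys.platform,
        "start_method": multiprocessing.get_start_method(),
        "available": multiprocessing.get_all_start_methods(),
    }


if __name__ == "__main__":
    for key, value in start_method_report().items():
        print(key, "=", value)
```

Running it with the interpreter that gsutil uses shows whether that interpreter defaults to spawn.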

I still got TypeError: cannot pickle '_io.TextIOWrapper', and this one is really funny.

It hits the multiprocessing library's reduction.py->dump() method, which is passed both a gsutil.cp process and a dict that starts with {'log_to_stderr': False, 'authkey'.... Apparently gsutil somehow tries to start a dict as a process.

I hence “fixed” this by adding:

def dump(obj, file, protocol=None):
    '''Replacement for pickle.dump() using ForkingPickler.'''
    # Hack: silently skip the stray dict instead of trying to pickle it.
    if isinstance(obj, dict):
        return
    ForkingPickler(file, protocol).dump(obj)

in multiprocessing.reduction.dump(), which is more a joke than a fix. But it does tell me that this funky dict is generated somewhere. I’ll just downgrade, but maybe one of the Google folks can look at that. It looks like a dict of what I assume are environment variables somehow makes its way into the process pool.
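The error class itself is easy to reproduce outside gsutil: pickle refuses any object graph that contains an open text file, and multiprocessing has to pickle whatever state it sends to a spawned child. A minimal sketch follows. The dict is invented and only mimics the shape described above, and the exact wording of the message varies between Python versions.

```python
import os
import pickle


def try_pickle(obj):
    """Return None if obj pickles cleanly, else the TypeError message."""
    try:
        pickle.dumps(obj)
    except TypeError as err:
        return str(err)
    return None


if __name__ == "__main__":
    with open(os.devnull, "w") as stream:
        # Hypothetical state dict that holds an open text stream.
        state = {"log_to_stderr": False, "stream": stream}
        print(try_pickle(state))
    # A dict of plain values pickles without complaint.
    print(try_pickle({"log_to_stderr": False}))
```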

I also have an issue with gcloud 298’s gsutil on OS X. My error occurs when I run a cp operation and works fine again after downgrading to 297.

I’ve anonymized my output but it looks like:

/Users/secretuser/google-cloud-sdk/bin/gsutil -q cp -n testfile gs://bucket/hidden/testfile

File "/Users/secretuser/google-cloud-sdk/platform/gsutil/gsutil", line 21, in <module>
    gsutil.RunMain()
  File "/Users/secretuser/google-cloud-sdk/platform/gsutil/gsutil.py", line 123, in RunMain
    sys.exit(gslib.__main__.main())
  File "/Users/secretuser/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 429, in main
    return _RunNamedCommandAndHandleExceptions(
  File "/Users/secretuser/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 767, in _RunNamedCommandAndHandleExceptions
    _HandleUnknownFailure(e)
  File "/Users/secretuser/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 625, in _RunNamedCommandAndHandleExceptions
    return command_runner.RunNamedCommand(command_name,
  File "/Users/secretuser/google-cloud-sdk/platform/gsutil/gslib/command_runner.py", line 411, in RunNamedCommand
    return_code = command_inst.RunCommand()
  File "/Users/secretuser/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 1205, in RunCommand
    self.Apply(_CopyFuncWrapper,
  File "/Users/secretuser/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1485, in Apply
    caller_id = self._SetUpPerCallerState()
  File "/Users/secretuser/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1360, in _SetUpPerCallerState
    class_map[caller_id] = cls
  File "<string>", line 2, in __setitem__
  File "/Users/secretuser/.pyenv/versions/3.8.3/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/managers.py", line 850, in _callmethod
    raise convert_to_error(kind, result)
multiprocessing.managers.RemoteError:
---------------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/secretuser/.pyenv/versions/3.8.3/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/managers.py", line 243, in serve_client
    request = recv()
  File "/Users/secretuser/.pyenv/versions/3.8.3/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
  File "/Users/secretuser/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 30, in <module>
    from gslib.command import Command
  File "/Users/secretuser/google-cloud-sdk/platform/gsutil/gslib/command.py", line 50, in <module>
    from gslib.cloud_api_delegator import CloudApiDelegator
  File "/Users/secretuser/google-cloud-sdk/platform/gsutil/gslib/cloud_api_delegator.py", line 26, in <module>
    from gslib.cs_api_map import ApiMapConstants
  File "/Users/secretuser/google-cloud-sdk/platform/gsutil/gslib/cs_api_map.py", line 23, in <module>
    from gslib.gcs_json_api import GcsJsonApi
  File "/Users/secretuser/google-cloud-sdk/platform/gsutil/gslib/gcs_json_api.py", line 72, in <module>
    from gslib.third_party.storage_apitools import storage_v1_client as apitools_client
  File "/Users/secretuser/google-cloud-sdk/platform/gsutil/gslib/third_party/storage_apitools/storage_v1_client.py", line 26, in <module>
    class StorageV1(base_api.BaseApiClient):
  File "/Users/secretuser/google-cloud-sdk/platform/gsutil/gslib/third_party/storage_apitools/storage_v1_client.py", line 38, in StorageV1
    _USER_AGENT += gslib.USER_AGENT
AttributeError: module 'gslib' has no attribute 'USER_AGENT'

I fixed the module 'sys' has no attribute 'maxint' error with the following steps:

  1. Run gcloud info
  2. Note down the Python Location as <python_location>
  3. Run <python_location> -m pip install crcmod

I’m using Homebrew Python, which is currently at v3.10.7, but when I ran gcloud info, I saw that the Python version was 3.9.14 (different from brew’s current Python version). Directly running pip3 install -U crcmod did not work because it installed crcmod for Python 3.10, which isn’t the Python used by gsutil. Hope this helps others who experience the same problem!
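To confirm that crcmod landed in the right interpreter, a small check script can be run with that same interpreter, e.g. <python_location> check_crcmod.py (the filename is hypothetical). This is a sketch under assumptions: it relies on crcmod's private _usingExtension flag, which may not be present in every build, and the status strings are made up.

```python
import importlib
import sys


def crcmod_status():
    """Report how crcmod looks from the interpreter running this script."""
    try:
        module = importlib.import_module("crcmod.crcmod")
    except (ImportError, AttributeError):
        # AttributeError is caught in case a build fails while being imported.
        return "missing or broken"
    # _usingExtension is a private flag; its presence is an assumption.
    if getattr(module, "_usingExtension", False):
        return "compiled extension"
    return "pure Python"


if __name__ == "__main__":
    print(sys.executable, "->", crcmod_status())
```

If this prints a different interpreter path than the one gcloud info reports, pip installed crcmod for the wrong Python.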

I will close this based on https://github.com/GoogleCloudPlatform/gsutil/issues/1123.

@martindufort The maxint error is caused by crcmod; it is a separate issue, unrelated to the multiprocessing issue discussed here. Please install crcmod directly to fix it. You can refer to https://github.com/GoogleCloudPlatform/gsutil/issues/1123 to learn more about the crcmod issue. The compiled crcmod library shipped with gsutil is broken for Python 3, hence we recommend installing crcmod directly. Feel free to file a separate issue if that does not work for you.

Thanks!

In 303.0.0 now getting TypeError: cannot pickle '_io.TextIOWrapper' object

Getting this error with this Cloud SDK version:

Google Cloud SDK 325.0.0
beta 2021.01.22
bq 2.0.64
cloud-datastore-emulator 2.1.0
core 2021.01.22
gcloud 
gsutil 4.58

when trying to synchronize.

Building synchronization state...
Starting synchronization...
module 'sys' has no attribute 'maxint'

: python --version
Python 3.7.1

21st July, if nothing blocks the release

Unfortunately, the fix for AttributeError: module 'gslib' has no attribute 'USER_AGENT' was not rolled out in the 301.0.0 release. It will be part of 302.0.0. Sorry for the delay.

@dilipped yes I can reproduce. As soon as I update, it breaks:


$ gsutil -m rsync -r -c -x '^\.|.*\.js\.map$' . gs://croquet.io/
Building synchronization state...
Starting synchronization...

$ gcloud version
Google Cloud SDK 297.0.0
beta 2019.05.17
bq 2.0.58
core 2020.06.12
gsutil 4.51
kubectl 2020.05.01
Updates are available for some Cloud SDK components.  To install them,
please run:
  $ gcloud components update

$ sudo gcloud components update 


Your current Cloud SDK version is: 297.0.0
You will be upgraded to version: 298.0.0

┌─────────────────────────────────────────────────────────────────────────────┐
│                      These components will be updated.                      │
├─────────────────────────────────────────────────────┬────────────┬──────────┤
│                         Name                        │  Version   │   Size   │
├─────────────────────────────────────────────────────┼────────────┼──────────┤
│ BigQuery Command Line Tool (Platform Specific)      │     2.0.58 │  < 1 MiB │
│ Cloud SDK Core Libraries                            │ 2020.06.19 │ 15.0 MiB │
│ Cloud SDK Core Libraries (Platform Specific)        │ 2020.06.19 │  < 1 MiB │
│ Cloud Storage Command Line Tool (Platform Specific) │       4.51 │  < 1 MiB │
│ gcloud cli dependencies                             │ 2020.06.19 │  3.4 MiB │
└─────────────────────────────────────────────────────┴────────────┴──────────┘

...

Update done!

To revert your SDK to the previously installed version, you may run:
  $ gcloud components update --version 297.0.0

$ gsutil -m rsync -r -c -x '^\.|.*\.js\.map$' . gs://croquet.io/

WARNING: You have requested checksumming but your crcmod installation isn't
using the module's C extension, so checksumming will run very slowly. For help
installing the extension, please see "gsutil help crcmod".

Building synchronization state...
Starting synchronization...
module 'sys' has no attribute 'maxint'
CommandException: 1 files/objects could not be copied/removed.
$ 

I can confirm that manually installing crcmod is a valid workaround.

The only tricky thing is that you have to identify which python gsutil is using, and hence the corresponding pip. If you have configured the CLOUDSDK_PYTHON environment variable, the path is easy to identify. If not, check the python version via gsutil version -l. 😉
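The lookup described above can be sketched as follows. The function names are made up, and falling back to sys.executable is only a placeholder assumption: when CLOUDSDK_PYTHON is unset, the real path should be read from gcloud info or gsutil version -l.

```python
import os
import sys


def gsutil_python(environ=None, fallback=sys.executable):
    """Pick the interpreter: CLOUDSDK_PYTHON if set, else the fallback."""
    environ = os.environ if environ is None else environ
    return environ.get("CLOUDSDK_PYTHON") or fallback


def crcmod_install_command(python_path):
    """Build the pip invocation bound to one specific interpreter."""
    return [python_path, "-m", "pip", "install", "-U", "crcmod"]


if __name__ == "__main__":
    # Print the command rather than running it, so it can be reviewed first.
    print(" ".join(crcmod_install_command(gsutil_python())))
```

Invoking pip as <python> -m pip ties the install to that interpreter, which avoids the pip3 mismatch between Python versions.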

Can you update us when this is fixed? It’s still an issue for me (after updating gcloud components).

#1107 is not deployed yet. We are working on the release and it should be out by next week or the week after. The PR does not address the crcmod issue. For crcmod-related errors, installing the library directly should resolve the issue: https://cloud.google.com/storage/docs/gsutil/addlhelp/CRC32CandInstallingcrcmod

@Amzd which python version are you using? You can check that by doing gsutil ver -l. Make sure you are installing crcmod for the correct python version. If you have multiple Python binaries available on your system, it is possible that gcloud is running on one python version but the crcmod is getting installed for a different python version.

You can check your python path by running gcloud info. Then you can run <your python path> -m pip install crcmod to install crcmod for that particular python version.

In 303.0.0 now getting TypeError: cannot pickle '_io.TextIOWrapper' object

Please see https://github.com/GoogleCloudPlatform/gsutil/issues/961#issuecomment-663565856. It solved the problem for me.

I just wanted to point out that 303.0.0 only fixes the AttributeError: module 'gslib' has no attribute 'USER_AGENT' issue. The other two issues have not been fixed yet. For the maxint issue, the workaround is to install the crcmod library directly instead of relying on the one shipped with gsutil for macOS. Instructions can be found here: https://cloud.google.com/storage/docs/gsutil/addlhelp/CRC32CandInstallingcrcmod.