google-cloud-ruby: Broken pipe when uploading files (cloud storage)
We use GCS to store backups and for the last two days multiple servers have thrown a ‘Broken pipe’ exception when performing their backups.
I’ve tried uploading a similarly sized file from my local machine and it throws the same exception, so I don’t think it’s a network-related issue.
Strangely, I can upload a 100MB file but trying to upload a 1.29GB file or larger causes the broken pipe.
Here’s a test script I’m using locally:
#!/usr/bin/ruby
require 'logger'
require 'google/cloud/storage'
logger = Logger.new($stderr)
logger.level = Logger::DEBUG
Google::Apis.logger = logger
storage = Google::Cloud::Storage.new(
  timeout: 600,
  project: 'nialto-services',
  keyfile: File.expand_path('~/gcs/nialto-services-a6dbccf5fe51.json')
)
bucket = storage.bucket('nialto-testing')
local_path = File.expand_path('~/gcs/20181010080000_example.com.1.dar')
remote_path = File.join('files', 'example.com', File.basename(local_path))
p bucket.create_file(local_path, remote_path)
And here’s the output of uploading a 1.29GB file:
D, [2018-10-11T11:45:46.610028 #8992] DEBUG -- : Sending HTTP get https://www.googleapis.com/storage/v1/b/bucket-name?
D, [2018-10-11T11:45:46.984530 #8992] DEBUG -- : 200
D, [2018-10-11T11:45:46.984900 #8992] DEBUG -- : #<HTTP::Message:0x00007fb8122e0190 @http_header=#<HTTP::Message::Headers:0x00007fb8122e0168 @http_version="1.1", @body_size=0, @chunked=false, @request_method="GET", @request_uri=#<Addressable::URI:0x3fdc09170ce4 URI:https://www.googleapis.com/storage/v1/b/bucket-name?>, @request_query=nil, @request_absolute_uri=nil, @status_code=200, @reason_phrase="OK", @body_type=nil, @body_charset=nil, @body_date=nil, @body_encoding=#<Encoding:UTF-8>, @is_request=false, @header_item=[["X-GUploader-UploadID", "xxxx"], ["ETag", "CAI="], ["Vary", "Origin"], ["Vary", "X-Origin"], ["Content-Type", "application/json; charset=UTF-8"], ["Expires", "Thu, 11 Oct 2018 10:45:46 GMT"], ["Date", "Thu, 11 Oct 2018 10:45:46 GMT"], ["Cache-Control", "private, max-age=0, must-revalidate, no-transform"], ["Content-Length", "370"], ["Server", "UploadServer"], ["Alt-Svc", "quic=\":443\"; ma=2592000; v=\"44,43,39,35\""]], @dumped=false>, @peer_cert=#<OpenSSL::X509::Certificate: subject=#<OpenSSL::X509::Name CN=*.googleapis.com,O=Google LLC,L=Mountain View,ST=California,C=US>, issuer=#<OpenSSL::X509::Name CN=Google Internet Authority G3,O=Google Trust Services,C=US>, serial=#<OpenSSL::BN:0x00007fb810a92e30>, not_before=2018-09-18 12:34:00 UTC, not_after=2018-12-11 12:34:00 UTC>, @http_body=#<HTTP::Message::Body:0x00007fb81233fcd0 @body="{\n \"kind\": \"storage#bucket\",\n \"id\": \"bucket-name\",\n \"selfLink\": \"https://www.googleapis.com/storage/v1/b/bucket-name\",\n \"projectNumber\": \"000000000000\",\n \"name\": \"bucket-name\",\n \"timeCreated\": \"2018-10-11T10:01:50.330Z\",\n \"updated\": \"2018-10-11T10:05:04.537Z\",\n \"metageneration\": \"2\",\n \"location\": \"EUROPE-WEST1\",\n \"storageClass\": \"REGIONAL\",\n \"etag\": \"CAI=\"\n}\n", @size=0, @positions=nil, @chunk_size=nil>, @previous=nil>
D, [2018-10-11T11:45:46.986936 #8992] DEBUG -- : Success - #<Google::Apis::StorageV1::Bucket:0x00007fb810a926b0
@etag="CAI=",
@id="bucket-name",
@kind="storage#bucket",
@location="EUROPE-WEST1",
@metageneration=2,
@name="bucket-name",
@project_number=000000000000,
@self_link="https://www.googleapis.com/storage/v1/b/bucket-name",
@storage_class="REGIONAL",
@time_created=
#<DateTime: 2018-10-11T10:01:50+00:00 ((2458403j,36110s,330000000n),+0s,2299161j)>,
@updated=
#<DateTime: 2018-10-11T10:05:04+00:00 ((2458403j,36304s,537000000n),+0s,2299161j)>>
D, [2018-10-11T11:45:47.006798 #8992] DEBUG -- : Sending upload start command to https://www.googleapis.com/upload/storage/v1/b/bucket-name/o?name=files%2Fexample.com%2F20181010080000_example.com.1.dar
D, [2018-10-11T11:45:47.089064 #8992] DEBUG -- : Upload status active
D, [2018-10-11T11:45:47.089145 #8992] DEBUG -- : Sending upload command to https://www.googleapis.com/upload/storage/v1/b/bucket-name/o?name=files%2Fexample.com%2F20181010080000_example.com.1.dar&upload_id=xxxx&upload_protocol=resumable
D, [2018-10-11T11:45:52.167122 #8992] DEBUG -- : Error - #<HTTPClient::KeepAliveDisconnected: HTTPClient::KeepAliveDisconnected: Broken pipe>
Traceback (most recent call last):
28: from Untitled.rb:22:in `<main>'
27: from /usr/local/lib/ruby/gems/2.5.0/gems/google-cloud-storage-1.15.0/lib/google/cloud/storage/bucket.rb:1142:in `create_file'
26: from /usr/local/lib/ruby/gems/2.5.0/gems/google-cloud-storage-1.15.0/lib/google/cloud/storage/service.rb:309:in `insert_file'
25: from /usr/local/lib/ruby/gems/2.5.0/gems/google-cloud-storage-1.15.0/lib/google/cloud/storage/service.rb:568:in `execute'
24: from /usr/local/lib/ruby/gems/2.5.0/gems/google-cloud-storage-1.15.0/lib/google/cloud/storage/service.rb:310:in `block in insert_file'
23: from /usr/local/lib/ruby/gems/2.5.0/gems/google-api-client-0.24.3/generated/google/apis/storage_v1/service.rb:1898:in `insert_object'
22: from /usr/local/lib/ruby/gems/2.5.0/gems/google-api-client-0.24.3/lib/google/apis/core/base_service.rb:360:in `execute_or_queue_command'
21: from /usr/local/lib/ruby/gems/2.5.0/gems/google-api-client-0.24.3/lib/google/apis/core/http_command.rb:93:in `execute'
20: from /usr/local/lib/ruby/gems/2.5.0/gems/retriable-3.1.2/lib/retriable.rb:56:in `retriable'
19: from /usr/local/lib/ruby/gems/2.5.0/gems/retriable-3.1.2/lib/retriable.rb:56:in `times'
18: from /usr/local/lib/ruby/gems/2.5.0/gems/retriable-3.1.2/lib/retriable.rb:61:in `block in retriable'
17: from /usr/local/lib/ruby/gems/2.5.0/gems/google-api-client-0.24.3/lib/google/apis/core/http_command.rb:101:in `block in execute'
16: from /usr/local/lib/ruby/gems/2.5.0/gems/retriable-3.1.2/lib/retriable.rb:56:in `retriable'
15: from /usr/local/lib/ruby/gems/2.5.0/gems/retriable-3.1.2/lib/retriable.rb:56:in `times'
14: from /usr/local/lib/ruby/gems/2.5.0/gems/retriable-3.1.2/lib/retriable.rb:61:in `block in retriable'
13: from /usr/local/lib/ruby/gems/2.5.0/gems/google-api-client-0.24.3/lib/google/apis/core/http_command.rb:104:in `block (2 levels) in execute'
12: from /usr/local/lib/ruby/gems/2.5.0/gems/google-api-client-0.24.3/lib/google/apis/core/upload.rb:254:in `execute_once'
11: from /usr/local/lib/ruby/gems/2.5.0/gems/google-api-client-0.24.3/lib/google/apis/core/upload.rb:228:in `send_upload_command'
10: from /usr/local/lib/ruby/gems/2.5.0/gems/httpclient-2.8.3/lib/httpclient.rb:765:in `post'
9: from /usr/local/lib/ruby/gems/2.5.0/gems/httpclient-2.8.3/lib/httpclient.rb:854:in `request'
8: from /usr/local/lib/ruby/gems/2.5.0/gems/httpclient-2.8.3/lib/httpclient.rb:1104:in `follow_redirect'
7: from /usr/local/lib/ruby/gems/2.5.0/gems/httpclient-2.8.3/lib/httpclient.rb:1014:in `do_request'
6: from /usr/local/lib/ruby/gems/2.5.0/gems/httpclient-2.8.3/lib/httpclient.rb:1131:in `protect_keep_alive_disconnected'
5: from /usr/local/lib/ruby/gems/2.5.0/gems/httpclient-2.8.3/lib/httpclient.rb:1138:in `rescue in protect_keep_alive_disconnected'
4: from /usr/local/lib/ruby/gems/2.5.0/gems/httpclient-2.8.3/lib/httpclient.rb:1019:in `block in do_request'
3: from /usr/local/lib/ruby/gems/2.5.0/gems/httpclient-2.8.3/lib/httpclient.rb:1242:in `do_get_block'
2: from /usr/local/lib/ruby/gems/2.5.0/gems/httpclient-2.8.3/lib/httpclient/session.rb:177:in `query'
1: from /usr/local/lib/ruby/gems/2.5.0/gems/httpclient-2.8.3/lib/httpclient/session.rb:514:in `query'
/usr/local/lib/ruby/gems/2.5.0/gems/httpclient-2.8.3/lib/httpclient/session.rb:524:in `rescue in query': HTTPClient::KeepAliveDisconnected: Broken pipe (HTTPClient::KeepAliveDisconnected)
Any ideas what’s happening here?
Thanks 😀
About this issue
- State: closed
- Created 6 years ago
- Comments: 18 (10 by maintainers)
Commits related to this issue
- Smaller shards for sharded tensors (#20) Summary: google cloud storage seems to prefer files closer to 100 MB than 1 GB and has less errors like https://github.com/googleapis/google-cloud-ruby/issues... — committed to pytorch/torchsnapshot by rllin 2 years ago
We heard from the GCS team last night. They identified a possible culprit and deployed a fix. My repro script that was raising is now working as well.
We also received confirmation that large uploads of a known size should be sent all at once, and not in chunks. So the current behavior is correct.
It looks like this has been fixed so I will close this issue now. A huge thanks to everyone who participated. Let us know if you have any questions about this.
Here is my repro script using google-api-client instead of google-cloud-storage:
Apologies for being so quiet. I’ve been actively looking at this but don’t have much to share just yet. I have been able to reproduce it, however. The last time I actively worked on uploads we successfully uploaded large files, 5GB+, so this does seem to be a recent change in behavior. I don’t know whether it is intentional.
The upload implementation has changed a couple of times previously, and we are now wholly reliant on google-api-client for performing uploads. That means my understanding of how uploads are to be managed is out of date. Here is what I can say so far: I am seeing GCS return the X-Goog-Upload-Chunk-Granularity header now, and I don’t remember it doing that before. It is possible it’s been doing that for a long time, however, as I’ve not been working on uploads for a while. I don’t know what this header is supposed to indicate to the client, if anything at all, but I think it means that the client should chunk the upload. I don’t see chunking implemented for Resumable Uploads in the google-api-client code. I’ve got a spike that chunks uploads by calculating the multiple of the provided chunk granularity closest to 1GB. I’ve been able to upload a 1.9GB file using those changes. So that seems to be a viable option to fix this, depending on what we hear from folks at Google.
Thanks for your patience. Hopefully we will get this resolved soon.
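For anyone curious, the chunk-size calculation in that spike can be sketched roughly as below. This is not the spike code and not part of google-api-client: chunk_size_for and chunk_ranges are hypothetical helper names, "1GB" is assumed to mean 1024**3 bytes, and the granularity is assumed to arrive as a byte count parsed from the X-Goog-Upload-Chunk-Granularity header.

```ruby
# Hypothetical helpers for illustration only; neither exists in
# google-api-client or google-cloud-storage.

GIGABYTE = 1024**3 # assumption: "1GB" in the comment above means 1 GiB

# Returns the multiple of `granularity` (in bytes) that is closest to
# `target` bytes, and never less than one granularity unit.
def chunk_size_for(granularity, target = GIGABYTE)
  raise ArgumentError, 'granularity must be positive' unless granularity.positive?

  multiples = (target.to_f / granularity).round
  [multiples, 1].max * granularity
end

# Splits an upload of `file_size` bytes into [offset, length] pairs,
# each at most `chunk_size` bytes long. The last pair may be shorter.
def chunk_ranges(file_size, chunk_size)
  (0...file_size).step(chunk_size).map do |offset|
    [offset, [chunk_size, file_size - offset].min]
  end
end
```

With an example granularity of 262144 bytes (256 KiB, an assumed value rather than one taken from the logs above), chunk_size_for returns exactly 1 GiB, since 1 GiB is 4096 multiples of 256 KiB. Each [offset, length] pair from chunk_ranges would then correspond to one upload request.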