google-cloud-go: [storage] "stream error: stream ID x; INTERNAL_ERROR"
We are reading tens of millions of objects from GCS and seem to be hitting an issue where the error "stream error: stream ID 4163; INTERNAL_ERROR" is returned after processing files for a while.
It’s pretty hard to debug, as it takes several hours before the issue occurs, but we’ve now hit it twice in a row.
We are using version eaddaf6dd7ee35fd3c2420c8d27478db176b0485 of the storage package.
Here’s the pseudo code of what we are doing:
cs, err := storage.NewClient(ctx)
if err != nil {
	// handle err...
}
defer cs.Close()

b := cs.Bucket(...)
q := &storage.Query{Prefix: ...}
it := b.Objects(ctx, q)
for {
	a, err := it.Next()
	if err == iterator.Done {
		break // no more objects to list
	}
	if err != nil {
		// handle err...
	}
	handleObject(...) // has its own retry logic
}
We have retry logic built into the handleObject function, but even retrying doesn’t help. Also, once the error shows up, it doesn’t go away: every subsequent read, of any line or file, returns the same error.
We’re thinking of building some retry logic around the client itself, closing it and opening a new one to see if that works, and we’re still digging deeper, but I wanted to report this nonetheless, in case anyone else has also run into this.
About this issue
- State: closed
- Created 7 years ago
- Comments: 27 (9 by maintainers)
Commits related to this issue
- storage: retry reads If we get a retryable error when reading an object, issue another request for the object's data, using a range that begins where we stopped reading. For now, the only retryable ... — committed to googleapis/google-cloud-go by jba 6 years ago
- Merge #24880 #24896 24880: storageccl: pull in fix for reading GCS files r=mjibson a=mjibson The linked issue lists a commit attempting to fix occasional problems we see during large GCS reads. The... — committed to cockroachdb/cockroach by deleted user 6 years ago
- Merge #84975 #85017 #85024 #85069 #85100 #85146 #85234 #85325 #85327 #85329 84975: storage: add `MVCCRangeKeyStack` for range keys r=nicktrav,jbowens a=erikgrinaker **storage: add `MVCCRangeKeyStack... — committed to cockroachdb/cockroach by deleted user 2 years ago
- cloud: add stream INTERNAL_ERROR to resumable HTTP errors Currently, errors like `stream error: stream ID <x>; INTERNAL_ERROR; received from peer` are not being retried. Retry these errors assuggeste... — committed to rhu713/cockroach by rhu713 2 years ago
- release-22.1: vendor: bump cloud.google.com/go/storage from v18.2.0 to v1.21.0 This commit bumps the `cloud.google.com/go/storage` vendor to include the ability to inject custom retry functions when ... — committed to adityamaru/cockroach by adityamaru 2 years ago
- cloud/gcp: add custom retryer for gcs storage, retry on stream INTERNAL_ERROR Currently, errors like `stream error: stream ID <x>; INTERNAL_ERROR; received from peer` are not being retried. Create a ... — committed to rhu713/cockroach by rhu713 2 years ago
Thanks for the update, @mjibson. Closing this (woot).
@rayrutjes, if you’re still having problems writing, please open another issue.
We have not seen this error again since the patch (2 weeks). But it occurred rarely enough that I’m still going to give it another 2 weeks before being convinced.
We’ve pulled in that commit to cockroach. We run some large nightly tests that were failing a few times per week from this bug. I’ll report back new results in a while.
@jba, related to a discussion I had with @mikeyu83 the other day: the Go Cloud Storage library should probably split large or failed (or just failed) transfers into multiple HTTP requests, and give the user a single io.Reader stitched together from the responses to multiple HTTP Range requests.
We are occasionally getting this error with Go 1.20. We are reading binary audio (<10MB/req) using http.Server and ran into this error a couple of times.
Edit: Downgrading to Go 1.19 did not fix the issue, and bumping http.Server.ReadTimeout had no effect either.
We got a lot of these errors with Go SDK 1.20/1.20.1, and did not have them with Go 1.95. Is anyone experiencing the same issue?
I see this issue when downloading files from the IPFS web client (ipfs.io) with the Go HTTP client. IPFS is p2p software written in Go. Download speed changes frequently because it’s a p2p network, and I think that’s a cause of this problem.
BTW, the fix suggests a change only on the reading side. In our case we see similar issues from time to time on the writing side:
Post https://www.googleapis.com/upload/storage/v1/b/***/o?alt=json&projection=full&uploadType=multipart: stream error: stream ID 17; INTERNAL_ERROR
Is this somehow related or should I open a new issue?
You may need larger files? This reproduced for us again last night. We have a test that reads some large (12G) files from GCS (a single goroutine reading, not in parallel with any other reads) and we got this error again.