aws-sdk-java-v2: S3: Data read has a different checksum than expected
This might be fixed by 472f5bfbce08038653b641b070e3ac09ae846313, but I'm not sure. I'm also seeing some errors where `response.contentLength()` differs from the actual size again, so there might be a race condition again. This is using 2.1.4 with Netty.
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Data read has a different checksum than expected.
at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:97)
at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:39)
at software.amazon.awssdk.services.s3.internal.handlers.AsyncChecksumValidationInterceptor.validatePutObjectChecksum(AsyncChecksumValidationInterceptor.java:105)
at software.amazon.awssdk.services.s3.internal.handlers.AsyncChecksumValidationInterceptor.afterUnmarshalling(AsyncChecksumValidationInterceptor.java:91)
at software.amazon.awssdk.core.interceptor.ExecutionInterceptorChain.lambda$afterUnmarshalling$9(ExecutionInterceptorChain.java:152)
at software.amazon.awssdk.core.interceptor.ExecutionInterceptorChain.reverseForEach(ExecutionInterceptorChain.java:210)
at software.amazon.awssdk.core.interceptor.ExecutionInterceptorChain.afterUnmarshalling(ExecutionInterceptorChain.java:152)
at software.amazon.awssdk.core.client.handler.BaseClientHandler.runAfterUnmarshallingInterceptors(BaseClientHandler.java:120)
at software.amazon.awssdk.core.client.handler.BaseClientHandler.lambda$interceptorCalling$2(BaseClientHandler.java:133)
at software.amazon.awssdk.core.client.handler.AttachHttpMetadataResponseHandler.handle(AttachHttpMetadataResponseHandler.java:40)
at software.amazon.awssdk.core.client.handler.AttachHttpMetadataResponseHandler.handle(AttachHttpMetadataResponseHandler.java:28)
at software.amazon.awssdk.core.internal.http.async.SyncResponseHandlerAdapter.lambda$prepare$0(SyncResponseHandlerAdapter.java:85)
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Reactions: 2
- Comments: 55 (24 by maintainers)
Commits related to this issue
- use the multipart upload for all the files, since the normal one doesn't work: https://github.com/aws/aws-sdk-java-v2/issues/953 — committed to albumprinter/junit-synnefo by deleted user 5 years ago
- use the sync s3 client because of https://github.com/aws/aws-sdk-java-v2/issues/953 — committed to albumprinter/junit-synnefo by deleted user 5 years ago
- Use async apis (#10) - Use async apis instead of sync apis for everything but the S3 download jobs because of aws/aws-sdk-java-v2#953 - Actually make the `threads` setting work: now we can limit t... — committed to albumprinter/junit-synnefo by derwasp 5 years ago
- VTKU-127: Retry letter publications. If an error occurs, we retry up to five times with exponential backoff. See the discussion of the other problem type on GitHub: https://github.c... — committed to Opetushallitus/viestintapalvelu by timorantalaiho 5 years ago
- Fixed an issue where streaming writes could be misordered. This fixes a cause of "S3: Data read has a different checksum than expected": https://github.com/aws/aws-sdk-java-v2/issues/953 — committed to aws/aws-sdk-java-v2 by millems 4 years ago
- Merge pull request #953 from aws/staging/a7058983-8371-4514-a451-f9d527850d2c Pull request: release <- staging/a7058983-8371-4514-a451-f9d527850d2c — committed to aws/aws-sdk-java-v2 by aws-sdk-java-automation 4 years ago
- Fix SDK behavior when request content-length does not match the data length returned by the publisher. This fixes two potential bugs: 1. A source of "checksum mismatch" exceptions (#953) when the pub... — committed to aws/aws-sdk-java-v2 by millems 3 years ago
- Fix SDK behavior when request content-length does not match the data length returned by the publisher. (#2788) This fixes two potential bugs: 1. A source of "checksum mismatch" exceptions (#953) whe... — committed to aws/aws-sdk-java-v2 by millems 3 years ago
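Several of the workaround commits above retry the failing call with exponential backoff. This is a minimal, hedged sketch of that pattern in plain Java; the names are illustrative and not part of the SDK:

```java
import java.util.concurrent.Callable;

public class RetryWithBackoff {
    // Retries the given action up to maxAttempts times, doubling the
    // delay between attempts (exponential backoff), as described in the
    // workaround commits above.
    public static <T> T retry(Callable<T> action, int maxAttempts, long initialDelayMillis)
            throws Exception {
        long delay = initialDelayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2; // exponential backoff
                }
            }
        }
        throw last != null ? last : new IllegalArgumentException("maxAttempts must be >= 1");
    }

    public static void main(String[] args) throws Exception {
        // Simulate an upload that fails twice with a transient error and
        // then succeeds on the third attempt.
        int[] calls = {0};
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("transient checksum error");
            }
            return "uploaded";
        }, 5, 1);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Since the files in this issue turned out to be uploaded correctly despite the exception, a blind retry like this masks the symptom rather than fixing it, which matches how these commits describe themselves.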
Hi. Unfortunately, I am still able to reproduce the issue on SDK version 2.15.11. I prepared a demo project to show the issue.
The project has 3 tests. The first uploads a ~100 kB array with checksum validation; this is the only green test in the set. The other two tests upload a ~1 MB array, with and without checksum validation. In the test with checksum validation, I get the aforementioned "Data read has a different checksum than expected" error. Without checksum validation, the bytes are uploaded to S3, but they differ from the original byte array.
Looks like the packets still get reordered, as @marc-christian-schulze mentioned.
FWIW, we've encountered this issue too when calling `getObject`, and it was due to `S3AsyncClient.clientConfiguration` having a duplicated set of default interceptors. Apparently there can't be more than one `AsyncChecksumValidationInterceptor` in the chain, as the first one removes the checksum part and breaks the second check, which is why all those exceptions were expecting 0.
In `DefaultS3BaseClientBuilder`, `finalizeServiceConfiguration` is called when building the client: `ClasspathInterceptorChainFactory.getInterceptors` uses its classloader's `getResources` to fetch the list of `ExecutionInterceptor` implementations, and for some reason it returned a duplicated set of interceptors.
This was verified on our side by calling
Which prints out the same line twice.
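The duplication described above, and the deduplication fix it suggests, can be sketched with plain JDK collections. This is a hedged illustration; `dedup` is a hypothetical helper, not the SDK factory's actual code:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class InterceptorDedup {
    // Stand-in for the interceptor class names the SDK factory loads; in
    // the real ClasspathInterceptorChainFactory the entries come from
    // ClassLoader.getResources(...), which in the broken ROOT-webapp
    // deployment resolved the same resource twice.
    public static List<String> dedup(List<String> interceptors) {
        // LinkedHashSet keeps only the first occurrence of each entry
        // while preserving order, which is what an interceptor chain needs.
        return new ArrayList<>(new LinkedHashSet<>(interceptors));
    }

    public static void main(String[] args) {
        List<String> loaded = List.of(
                "AsyncChecksumValidationInterceptor",
                "EnableChunkedEncodingInterceptor",
                "AsyncChecksumValidationInterceptor"); // duplicate entry
        System.out.println(dedup(loaded));
    }
}
```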
Looks like it's a not-so-common bug: we had our servlet running as the `ROOT` webapp in Tomcat, and renaming `ROOT` to something else fixed it. However, I would suggest changing the behavior of `ClasspathInterceptorChainFactory.createExecutionInterceptorsFromClasspath` to check for duplicates before returning. @millems
Encountering this with 2.10.24. I'm doing `PutObject` with the async client, very similar to the code posted by PyvesB and tim-fdc.
I’m able to reproduce this consistently by looping over the async put code 10 times with a 5MB file. The first few succeed, but then around 3-5 will fail with the “Data read has a different checksum than expected.” exception. However, all 10 files appear to upload correctly without corruption.
I tried with a much smaller file (10KB), and was able to do 100 loops without encountering the issue.
A bit of my test code:
Log output attached. Despite the exceptions, all files do upload successfully and do not appear to be corrupted (my test files are images, and I’m able to download and view them).
FWIW, once this exception occurs, all subsequent puts in the loop fail with the same message.
issue-953.log
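The shape of the repro described above, ten async puts of a ~5 MB body, can be sketched with plain `CompletableFuture`s. Here `uploadAsync` is a hypothetical stand-in for `S3AsyncClient.putObject`, not SDK API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class AsyncPutLoop {
    // Hypothetical stand-in for an async S3 put; returns the number of
    // bytes "uploaded". In the real repro this is where the checksum
    // exception surfaced on roughly attempts 3 through 5.
    static CompletableFuture<Integer> uploadAsync(int i, byte[] body) {
        return CompletableFuture.supplyAsync(() -> body.length);
    }

    public static void main(String[] args) {
        byte[] body = new byte[5 * 1024 * 1024]; // ~5 MB, like the failing case
        List<CompletableFuture<Integer>> futures = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            futures.add(uploadAsync(i, body));
        }
        // Joining each future is where the reporter's loop would have
        // observed the SdkClientException.
        int total = futures.stream().mapToInt(CompletableFuture::join).sum();
        System.out.println("uploaded " + total + " bytes across 10 puts");
    }
}
```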
Still have this issue on “software.amazon.awssdk:s3:2.5.10” version:
Code example:
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Data read has a different checksum than expected. Was 1229635159, but expected 0
Any suggestions on how to fix?
Added a fix for the repro from me above with #2788. Will keep looking into the repro from burnout. Theirs is a complicated repro that involves proxying requests to a mock S3…
I’ve created the following reproducer: https://github.com/marc-christian-schulze/kotlinx-reproducer-2109
It contains 3 test cases. Two of them pass and show that, functionally speaking, the code seems to be correct. The third test case, however, exhibits the race condition.
@PyvesB Awesome, thanks! There’s one more change in the pipe, but it only applies to buckets with server-side encryption enabled, and no server-side encryption parameters in the request.
If you don't have buckets with server-side encryption enabled, you should theoretically not see any errors, unless there are more edge cases lurking.
Thanks to @mar-kolya we found one cause of this for `PutObject`. If the upload request is retried by the SDK, the second attempt will falsely detect an invalid checksum, even though the request actually succeeded. This is because the SDK was calculating the checksum twice and comparing against the service, which only calculated it once.
This would also explain why the reports say the files are okay: they are.
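The retry bug described above can be illustrated with the JDK's `MessageDigest` alone. This is a hedged sketch of the failure mode, not the SDK's actual code: if the digest is not reset between the first attempt and the retry, the second attempt hashes the payload twice and no longer matches what the service computed once.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Arrays;

public class Md5ResetDemo {
    public static void main(String[] args) throws Exception {
        byte[] payload = "hello s3".getBytes(StandardCharsets.UTF_8);

        // What the service computes: the MD5 of the payload, once.
        MessageDigest service = MessageDigest.getInstance("MD5");
        service.update(payload);
        byte[] expected = service.digest();

        // Simulated client retry that forgets to reset: the stream is
        // fed through the same digest twice, then digested once.
        MessageDigest buggyClient = MessageDigest.getInstance("MD5");
        buggyClient.update(payload); // first attempt
        buggyClient.update(payload); // retried attempt, no reset in between
        byte[] actual = buggyClient.digest();

        System.out.println("match=" + Arrays.equals(expected, actual)); // prints match=false
    }
}
```

Resetting (or recreating) the digest at the start of every attempt, which is what the linked regression tests in #1552 verify, makes the two values agree again.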
I’ve continued @mar-kolya’s work in https://github.com/aws/aws-sdk-java-v2/pull/1550 by adding regression tests to make sure that we’re always resetting the MD5 calculation on retries: https://github.com/aws/aws-sdk-java-v2/pull/1552
Hopefully we can get this change out in the next few days.
We’re getting hit by this at Datadog as well.
Thanks for the updated reports. We’ll look back into this issue as soon as we can.
Hi, also getting this error for an async client only.
Using version `software.amazon.awssdk:aws-sdk-java:2.5.15`.
The error happens consistently using the following client (Kotlin code):
Writing an object as follows:
Using the Kotlin Coroutines extension (`await()`) to transform the Java CompletableFuture into a coroutine. This works for listings. I am also using the session token; it seems to work without a session token.
With the non-async client it works.
Here is the exception:
Any ideas?
Thanks! Tim