aws-sdk-java-v2: S3: Data read has a different checksum than expected

This might already be fixed by 472f5bfbce08038653b641b070e3ac09ae846313, but I’m not sure. I’m also seeing some errors where response.contentLength() differs from the actual size again, so the race condition may be back. This is using 2.1.4 with Netty.

Caused by: software.amazon.awssdk.core.exception.SdkClientException: Data read has a different checksum than expected.
        at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:97)
        at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:39)
        at software.amazon.awssdk.services.s3.internal.handlers.AsyncChecksumValidationInterceptor.validatePutObjectChecksum(AsyncChecksumValidationInterceptor.java:105)
        at software.amazon.awssdk.services.s3.internal.handlers.AsyncChecksumValidationInterceptor.afterUnmarshalling(AsyncChecksumValidationInterceptor.java:91)
        at software.amazon.awssdk.core.interceptor.ExecutionInterceptorChain.lambda$afterUnmarshalling$9(ExecutionInterceptorChain.java:152)
        at software.amazon.awssdk.core.interceptor.ExecutionInterceptorChain.reverseForEach(ExecutionInterceptorChain.java:210)
        at software.amazon.awssdk.core.interceptor.ExecutionInterceptorChain.afterUnmarshalling(ExecutionInterceptorChain.java:152)
        at software.amazon.awssdk.core.client.handler.BaseClientHandler.runAfterUnmarshallingInterceptors(BaseClientHandler.java:120)
        at software.amazon.awssdk.core.client.handler.BaseClientHandler.lambda$interceptorCalling$2(BaseClientHandler.java:133)
        at software.amazon.awssdk.core.client.handler.AttachHttpMetadataResponseHandler.handle(AttachHttpMetadataResponseHandler.java:40)
        at software.amazon.awssdk.core.client.handler.AttachHttpMetadataResponseHandler.handle(AttachHttpMetadataResponseHandler.java:28)
        at software.amazon.awssdk.core.internal.http.async.SyncResponseHandlerAdapter.lambda$prepare$0(SyncResponseHandlerAdapter.java:85)

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 2
  • Comments: 55 (24 by maintainers)

Most upvoted comments

Hi. Unfortunately, I am still able to reproduce the issue on the SDK version 2.15.11. I prepared a demo project to show the issue.

The project has 3 tests. The first uploads a ~100 kB array with checksum validation; it is the only green test. The other two upload a ~1 MB array, one with and one without checksum validation. With checksum validation, I get the “Data read has a different checksum than expected” error mentioned above. Without checksum validation, the bytes are uploaded to S3 but differ from the original byte array.

It looks like the packets are still being reordered, as @marc-christian-schulze mentioned.

FWIW, we’ve encountered this issue too when calling getObject, which was due to S3AsyncClient.clientConfiguration having a duplicated set of default interceptors.

Apparently there can’t be more than one AsyncChecksumValidationInterceptor in the chain, as it will remove the checksum part and break the second check, which is why all those exceptions were expecting 0.

In DefaultS3BaseClientBuilder, it calls finalizeServiceConfiguration when building the client:

ClasspathInterceptorChainFactory interceptorFactory = new ClasspathInterceptorChainFactory();
List<ExecutionInterceptor> interceptors = interceptorFactory
        .getInterceptors("software/amazon/awssdk/services/s3/execution.interceptors");
interceptors = CollectionUtils.mergeLists(interceptors, config.option(SdkClientOption.EXECUTION_INTERCEPTORS));
return config.toBuilder().option(SdkClientOption.EXECUTION_INTERCEPTORS, interceptors).build();

ClasspathInterceptorChainFactory.getInterceptors uses its classloader’s getResources to fetch a list of ExecutionInterceptor, and for some reason it returned a duplicated set of interceptors.

We verified this on our side by calling

Collections.list(
    new ClasspathInterceptorChainFactory().getClass().getClassLoader()
          .getResources("software/amazon/awssdk/services/s3/execution.interceptors")
).forEach(System.out::println);

which prints the same line twice.

It looks like a not-so-common bug: we had our servlet running as the ROOT webapp in Tomcat, and renaming ROOT to something else fixed it.

However, I would suggest changing ClasspathInterceptorChainFactory.createExecutionInterceptorsFromClasspath to check for duplicates before returning. @millems
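
To illustrate the suggestion above, here is a minimal sketch of what such a de-duplication step could look like. It is not the SDK’s code: dedupeByClass and the two stand-in interceptor classes are hypothetical names made up for this example, and it assumes that keeping only the first instance of each concrete class is acceptable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class InterceptorDedupSketch {

    // Hypothetical helper: keeps the first instance of each concrete class
    // and preserves the order in which the instances were loaded.
    static <T> List<T> dedupeByClass(List<T> items) {
        Map<Class<?>, T> firstOfEachClass = new LinkedHashMap<>();
        for (T item : items) {
            firstOfEachClass.putIfAbsent(item.getClass(), item);
        }
        return new ArrayList<>(firstOfEachClass.values());
    }

    // Stand-ins for interceptors loaded from the classpath; these are not SDK classes.
    static final class ChecksumInterceptor {}
    static final class LoggingInterceptor {}

    public static void main(String[] args) {
        // Simulates the same resource file being returned twice by getResources.
        List<Object> loaded = Arrays.asList(
                new ChecksumInterceptor(), new LoggingInterceptor(),
                new ChecksumInterceptor(), new LoggingInterceptor());
        System.out.println("loaded=" + loaded.size()
                + ", after dedupe=" + dedupeByClass(loaded).size());
    }
}
```

One trade-off of keying on the class: if someone deliberately registers two differently configured instances of the same interceptor class, the second one would be dropped too. De-duplicating the resource URLs before loading might be a safer variant.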

I’m encountering this with 2.10.24. I’m doing PutObject with the async client, very similar to the code posted by PyvesB and tim-fdc.

I’m able to reproduce this consistently by looping over the async put code 10 times with a 5MB file. The first few succeed, but then around 3-5 will fail with the “Data read has a different checksum than expected.” exception. However, all 10 files appear to upload correctly without corruption.

I tried with a much smaller file (10KB), and was able to do 100 loops without encountering the issue.

A bit of my test code:

S3AsyncClient s3Client = S3AsyncClient.builder().region(config.getRegion()).build();

String s3KeyPrefix = "test/";

AtomicInteger success = new AtomicInteger();
AtomicInteger failure = new AtomicInteger();

// Load this however
byte[] imageBytes = ...;

// We detect this
String contentType = ...;

int puts = 10;
List<CompletableFuture<?>> futures = new ArrayList<>();
for(int i = 0; i < puts; i++) {
    String s3Key = s3KeyPrefix + UUID.randomUUID();
    logger.debug("{}: Start", s3Key);
    PutObjectRequest request = PutObjectRequest.builder()
        .bucket(config.getBucket()) // Bucket name has only letters and dashes
        .key(s3Key)
        .contentType(contentType)
        .build();
    futures.add(s3Client.putObject(request, AsyncRequestBody.fromBytes(imageBytes)).handle((r, t) -> {
        if(t != null) {
            failure.incrementAndGet();
            logger.error("{}: Something went wrong", s3Key, t);
        } else {
            success.incrementAndGet();
            logger.debug("{}: {}", s3Key, r);
        }
        return null;
    }));
}

logger.debug("waiting");

CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();

logger.debug("success={},failure={}", success, failure);
logger.debug("done");

Log output attached. Despite the exceptions, all files do upload successfully and do not appear to be corrupted (my test files are images, and I’m able to download and view them).
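
For a stricter check than opening the images, one option is to compare the MD5 of the local bytes with the ETag that S3 reports for the object. The sketch below assumes a single-part upload without SSE-KMS or SSE-C, since only in that case is the ETag the hex MD5 of the object data. Md5HexSketch, md5Hex and matchesETag are hypothetical names for this example, and the ETag string would have to come from a PutObject or HeadObject response.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5HexSketch {

    // Returns the lowercase hex MD5 of the given bytes.
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // ETags are returned wrapped in double quotes, so strip them before comparing.
    // Assumption: single-part upload, no SSE-KMS/SSE-C (otherwise the ETag is not an MD5).
    static boolean matchesETag(byte[] localBytes, String eTag) {
        String unquoted = eTag.replace("\"", "");
        return md5Hex(localBytes).equalsIgnoreCase(unquoted);
    }

    public static void main(String[] args) {
        byte[] local = "test".getBytes(StandardCharsets.UTF_8);
        System.out.println(md5Hex(local));
    }
}
```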

FWIW, once this exception occurs, all subsequent puts in the loop fail with the same message.

issue-953.log

I still have this issue on version “software.amazon.awssdk:s3:2.5.10”.

Code example:

... this.client = S3Client.create();

        client.getObject(
                GetObjectRequest.builder().bucket(bucket).key(key).build(),
                ResponseTransformer.toFile(download)
        );

Caused by: software.amazon.awssdk.core.exception.SdkClientException: Data read has a different checksum than expected. Was 1229635159, but expected 0

Any suggestions on how to fix?

Added a fix for the repro from me above with #2788. Will keep looking into the repro from burnout. Theirs is a complicated repro that involves proxying requests to a mock S3…

I’ve created the following reproducer: https://github.com/marc-christian-schulze/kotlinx-reproducer-2109

It contains 3 test cases. 2 of them do pass and show that functionally speaking the code seems to be correct. The third test case however, shows the race-condition.

We’ve just updated to the newly released version of the SDK (2.10.39). We’ll let you know if we see any more of these checksum failures.

Using 2.10.39 and seeing:

Exception: Unable to unmarshall response (Data read has a different checksum than expected. Was 0x61dbb5e075a0623141a1598ed637a3fb, but expected 0x8c390228c2e2bfe6efc2de5db433bf17). Response Code: 200, Response Text: OK
software.amazon.awssdk.core.exception.SdkClientException: Unable to unmarshall response (Data read has a different checksum than expected. Was 0x61dbb5e075a0623141a1598ed637a3fb, but expected 0x8c390228c2e2bfe6efc2de5db433bf17). Response Code: 200, Response Text: OK

@PyvesB Awesome, thanks! There’s one more change in the pipe, but it only applies to buckets with server-side encryption enabled, and no server-side encryption parameters in the request.

If you don’t have buckets with server-side encryption enabled, you should theoretically not see any errors, unless there’s more edge cases lurking.

Thanks to @mar-kolya we found one cause of this for PutObject. If the upload request is retried by the SDK, the second attempt will falsely detect an invalid checksum, even though the request actually succeeded. This is because it was calculating the checksum twice, and comparing against the service which only calculated it once.

This would also explain why the reports say the files are okay: they are.

I’ve continued @mar-kolya’s work in https://github.com/aws/aws-sdk-java-v2/pull/1550 by adding regression tests to make sure that we’re always resetting the MD5 calculation on retries: https://github.com/aws/aws-sdk-java-v2/pull/1552
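
For anyone wondering why a retry breaks the comparison, the effect can be reproduced with plain java.security.MessageDigest. This is only a sketch of the mechanism described above, not the SDK’s interceptor code; RetryChecksumSketch, md5 and digestAfterAttempts are hypothetical names, and the “server side” value is simply an MD5 computed once over the body.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

final class RetryChecksumSketch {

    static MessageDigest newMd5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // What the service effectively computes: one pass over the body.
    static byte[] md5(byte[] body) {
        return newMd5().digest(body);
    }

    // Feeds the body into one shared digest once per attempt. Without a reset,
    // the digest still holds the bytes of the earlier attempt.
    static byte[] digestAfterAttempts(byte[] body, int attempts, boolean resetBetweenAttempts) {
        MessageDigest shared = newMd5();
        for (int attempt = 0; attempt < attempts; attempt++) {
            if (resetBetweenAttempts) {
                shared.reset();
            }
            shared.update(body);
        }
        return shared.digest();
    }

    public static void main(String[] args) {
        byte[] body = "test".getBytes(StandardCharsets.UTF_8);
        byte[] serverSide = md5(body);
        System.out.println("no reset, matches: "
                + Arrays.equals(serverSide, digestAfterAttempts(body, 2, false)));
        System.out.println("with reset, matches: "
                + Arrays.equals(serverSide, digestAfterAttempts(body, 2, true)));
    }
}
```

Without the reset, the client-side digest covers the body twice, so it can never equal the value computed once on the other side, even though the uploaded bytes are fine.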

Hopefully we can get this change out in the next few days.

We’re getting hit by this at Datadog as well.

Thanks for the updated reports. We’ll look back into this issue as soon as we can.

Hi, I’m also getting this error, but only with the async client.

Using version software.amazon.awssdk:aws-sdk-java:2.5.15.

The error happens consistently with the following client (Kotlin code):

class S3CredentialsProvider : AwsCredentialsProvider {
    override fun resolveCredentials(): AwsCredentials {
        val awsCredentialsProcessBuilder = AwsSessionCredentials.create(
                "<SECRET>",
                "<SECRET>",
                "<SECRET>")

        return awsCredentialsProcessBuilder
    }
}
 
   val s3Client = S3AsyncClient.builder()
            .credentialsProvider(S3CredentialsProvider())
            .region(Region.EU_WEST_1)
            .build()

Writing an object as follows:

    val toByteArray = "test".toByteArray()
    val asyncRequestBody = AsyncRequestBody.fromBytes(toByteArray)
    s3Client.putObject(
            PutObjectRequest.builder()
                    .bucket(config.bucket)
                    .key("test")
                    .build(),
            asyncRequestBody).await()

I’m using the Kotlin Coroutines extension (await()) to turn the Java CompletableFuture into a coroutine. This works for listings. I am also using a sessionToken; it seems to work without one.

With the non-async client

S3Client.builder()
            .region(Region.EU_WEST_1)
            .credentialsProvider(S3CredentialsProvider())
            .build()

it works.

Here is the exception:

Exception in thread "main" software.amazon.awssdk.core.exception.SdkClientException: Data read has a different checksum than expected.
	at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:97)
	at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:39)
	at software.amazon.awssdk.services.s3.checksums.ChecksumsEnabledValidator.validatePutObjectChecksum(ChecksumsEnabledValidator.java:134)
	at software.amazon.awssdk.services.s3.internal.handlers.AsyncChecksumValidationInterceptor.afterUnmarshalling(AsyncChecksumValidationInterceptor.java:86)
	at software.amazon.awssdk.core.interceptor.ExecutionInterceptorChain.lambda$afterUnmarshalling$9(ExecutionInterceptorChain.java:152)
	at software.amazon.awssdk.core.interceptor.ExecutionInterceptorChain.reverseForEach(ExecutionInterceptorChain.java:210)
	at software.amazon.awssdk.core.interceptor.ExecutionInterceptorChain.afterUnmarshalling(ExecutionInterceptorChain.java:152)
	at software.amazon.awssdk.core.client.handler.BaseClientHandler.runAfterUnmarshallingInterceptors(BaseClientHandler.java:138)
	at software.amazon.awssdk.core.client.handler.BaseClientHandler.lambda$interceptorCalling$2(BaseClientHandler.java:151)
	at software.amazon.awssdk.core.client.handler.AttachHttpMetadataResponseHandler.handle(AttachHttpMetadataResponseHandler.java:40)
	at software.amazon.awssdk.core.client.handler.AttachHttpMetadataResponseHandler.handle(AttachHttpMetadataResponseHandler.java:28)
	at software.amazon.awssdk.core.internal.http.async.AsyncResponseHandler.lambda$prepare$0(AsyncResponseHandler.java:88)
	at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
	at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
	at software.amazon.awssdk.core.internal.http.async.AsyncResponseHandler$BaosSubscriber.onComplete(AsyncResponseHandler.java:129)
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler$FullResponseContentPublisher$1.request(ResponseHandler.java:369)
	at software.amazon.awssdk.core.internal.http.async.AsyncResponseHandler$BaosSubscriber.onSubscribe(AsyncResponseHandler.java:108)
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler$FullResponseContentPublisher.subscribe(ResponseHandler.java:360)
	at software.amazon.awssdk.core.internal.http.async.AsyncResponseHandler.onStream(AsyncResponseHandler.java:71)
	at software.amazon.awssdk.core.internal.http.async.AsyncAfterTransmissionInterceptorCallingResponseHandler.onStream(AsyncAfterTransmissionInterceptorCallingResponseHandler.java:86)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage$ResponseHandler.onStream(MakeAsyncHttpRequestStage.java:249)
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.channelRead0(ResponseHandler.java:112)
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.channelRead0(ResponseHandler.java:65)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at com.typesafe.netty.http.HttpStreamsHandler.channelRead(HttpStreamsHandler.java:129)
	at com.typesafe.netty.http.HttpStreamsClientHandler.channelRead(HttpStreamsClientHandler.java:148)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at software.amazon.awssdk.http.nio.netty.internal.FutureCancelHandler.channelRead0(FutureCancelHandler.java:42)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:438)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:297)
	at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:253)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1436)
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1203)
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1247)
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:502)
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:441)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1408)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:930)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:677)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:612)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:529)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:491)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:905)
	at java.lang.Thread.run(Thread.java:748)

Any ideas?

Thanks! Tim