aws-sdk-java-v2: Rapid S3 downloads causes failure

I have code using the Java SDK to download a few hundred photos from an S3 bucket. The first bunch work but things start to fail around photo number 100. It does not always fail on the same file. It seems very similar to #452 though that issue is closed. Note that I’m using S3AsyncClient but downloading the file sequentially, never downloading more than one concurrently.

The photos are about 500KB each.

Expected Behavior

I should be able to download all the objects in my bucket

Current Behavior

Am able to download many files but after about 100 requests star to fail. The exception is:

java.util.concurrent.CompletionException: software.amazon.awssdk.core.exception.SdkClientException
	at software.amazon.awssdk.utils.CompletableFutureUtils.errorAsCompletionException(CompletableFutureUtils.java:61)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncExecutionFailureExceptionReportingStage.lambda$execute$0(AsyncExecutionFailureExceptionReportingStage.java:50)
	at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
	at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryExecutor.retryErrorIfNeeded(AsyncRetryableStage.java:167)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryExecutor.retryIfNeeded(AsyncRetryableStage.java:118)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryExecutor.lambda$execute$0(AsyncRetryableStage.java:103)
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage$ResponseHandler.onError(MakeAsyncHttpRequestStage.java:241)
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.lambda$notifyIfResponseNotCompleted$3(ResponseHandler.java:398)
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.runAndLogError(ResponseHandler.java:179)
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.notifyIfResponseNotCompleted(ResponseHandler.java:398)
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.channelInactive(ResponseHandler.java:148)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
	at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
	at io.netty.handler.logging.LoggingHandler.channelInactive(LoggingHandler.java:167)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
	at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
	at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
	at io.netty.handler.timeout.IdleStateHandler.channelInactive(IdleStateHandler.java:277)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
	at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:420)
	at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:390)
	at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:355)
	at io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:282)
	at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:223)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
	at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
	at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:390)
	at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:355)
	at io.netty.handler.ssl.SslHandler.channelInactive(SslHandler.java:1054)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
	at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
	at io.netty.handler.timeout.IdleStateHandler.channelInactive(IdleStateHandler.java:277)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1403)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:912)
	at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:826)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:495)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:905)
	at java.lang.Thread.run(Thread.java:748)
Caused by: software.amazon.awssdk.core.exception.SdkClientException: null
	at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:97)
	at software.amazon.awssdk.core.internal.util.ThrowableUtils.asSdkException(ThrowableUtils.java:98)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryExecutor.retryIfNeeded(AsyncRetryableStage.java:117)
	... 63 common frames omitted
Caused by: java.io.IOException: Server failed to send complete response
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.notifyIfResponseNotCompleted(ResponseHandler.java:397)
	... 54 common frames omitted

With full debug logs enabled there’s a ton of output but the response from the Netty network logs from right before things crash is:

11:16:18.311 [aws-java-sdk-NettyEventLoop-1-3] DEBUG i.n.handler.logging.LoggingHandler - [id: 0xffce7082, L:/192.168.0.165:59528 - R:com-revl-revlx.s3.us-west-2.amazonaws.com/52.218.224.248:443] READ: DefaultLastHttpContent(data: PooledSlicedByteBuf(ridx: 0, widx: 254, cap: 254/254, unwrapped: PooledUnsafeDirectByteBuf(ridx: 8446, widx: 8446, cap: 8475)), decoderResult: success), 254B
         +-------------------------------------------------+
         |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
+--------+-------------------------------------------------+----------------+
|00000000| 5d 52 e1 13 91 3f b5 10 51 7b 34 f6 79 a0 39 aa |]R...?..Q{4.y.9.|
|00000010| 3a 4d fc ce 3c 68 3b 15 b5 ad 02 63 d4 f7 a0 78 |:M..<h;....c...x|
|00000020| 92 29 39 d5 39 01 a9 44 b9 c6 33 2a 8d 7e 91 f3 |.)9.9..D..3*.~..|
|00000030| 28 ea 71 58 a6 09 25 58 2f a8 4e 94 f1 24 02 65 |(.qX..%X/.N..$.e|
|00000040| a7 b5 48 11 7c 6a 95 05 d2 a8 b0 93 a1 02 d8 57 |..H.|j.........W|
|00000050| 11 04 21 ad 32 b4 2d 94 16 99 af 0e f4 00 41 2a |..!.2.-.......A*|
|00000060| 80 04 55 04 32 94 41 38 93 40 01 11 25 02 94 89 |..U.2.A8.@..%...|
|00000070| 1a 39 2a 2d 84 b0 04 12 d7 2a cd 22 60 a7 34 27 |.9*-.....*."`.4'|
|00000080| 40 94 0f 0d 0d 53 0d 85 38 5d bb 24 ea af 94 32 |@....S..8].$...2|
|00000090| 84 80 24 ad 3a 63 0c e8 f8 77 85 44 8a a0 14 1a |..$.:c...w.D....|
|000000a0| 02 74 27 51 ab e6 25 02 c6 4e a8 ed 4e 64 68 4a |.t'Q..%..N..NdhJ|
|000000b0| 4e 84 09 cf 8c 7a 0e f4 65 4c f8 2b 00 28 14 f1 |N....z..eL.+.(..|
|000000c0| 98 f0 54 0a a2 a0 24 9b 48 42 2e ed 20 ad ae 53 |..T...$.HB.. ..S|
|000000d0| a8 f0 12 a9 80 9a 05 11 a3 53 2a 72 4f 3d 70 ae |.........S*rO=p.|
|000000e0| 07 9a 61 20 27 e7 8c 27 27 2d 6a a3 ff d9 f2 d3 |..a '..''-j.....|
|000000f0| 22 9e 33 37 d2 2e 06 03 e0 06 a6 fc 72 02       |".37........r.  |
+--------+-------------------------------------------------+----------------+
11:16:18.312 [aws-java-sdk-NettyEventLoop-1-3] DEBUG i.n.handler.logging.LoggingHandler - [id: 0xffce7082, L:/192.168.0.165:59528 ! R:com-revl-revlx.s3.us-west-2.amazonaws.com/52.218.224.248:443] READ COMPLETE
11:16:18.312 [aws-java-sdk-NettyEventLoop-1-3] DEBUG i.n.handler.logging.LoggingHandler - [id: 0xffce7082, L:/192.168.0.165:59528 ! R:com-revl-revlx.s3.us-west-2.amazonaws.com/52.218.224.248:443] INACTIVE
11:16:18.316 [aws-java-sdk-NettyEventLoop-1-3] DEBUG i.n.handler.logging.LoggingHandler - [id: 0xffce7082, L:/192.168.0.165:59528 ! R:com-revl-revlx.s3.us-west-2.amazonaws.com/52.218.224.248:443] CLOSE
11:16:18.316 [aws-java-sdk-NettyEventLoop-1-3] DEBUG i.n.handler.logging.LoggingHandler - [id: 0xffce7082, L:/192.168.0.165:59528 ! R:com-revl-revlx.s3.us-west-2.amazonaws.com/52.218.224.248:443] UNREGISTERED

Steps to Reproduce (for bugs)

Here is a slightly modified version of the Kotlin code that produces the above (modified to remove company sensitive stuff):

object Main {
    val log = getLogger { }

    val s3AsyncClient = S3AsyncClient.builder()
        .credentialsProvider(DefaultCredentialsProvider.create())
        .region(Region.US_WEST_2)
        // retries disable here because we can't retry in our situation - the real use case is to stream
        // data from a server to a customer. Note that you get the same failures even with retries -- it
        // just takes longer
        .overrideConfiguration { config ->
            config.retryPolicy(RetryPolicy.none()).build()
        }
        .build()

    val daBucket = "com-revl-revlx"

    fun doDownload(key: String) {
        val outFile = Paths.get("/tmp/experiment/").resolve(key)
        outFile.parent.toFile().mkdirs()
        if (outFile.toFile().exists()) {
            outFile.toFile().delete()
        }
        log.info("Downloading via toFile: {}", key)
        val request = GetObjectRequest.builder()
            .bucket(daBucket)
            .key(key)
            .build()

        val future = s3AsyncClient.getObject(request, AsyncResponseTransformer.toFile(outFile))

        try {
            future.join()
            log.info("Download of {} succeeded", key)
        } catch (t: Throwable) {
            log.error("Download of {} failed with:", key, t)
            throw t
        }
    }

    @JvmStatic
    fun main(argv: Array<String>) {
        ThreadUtils.setLoggingDefaultExceptionHandler()
        val keys = (1..240).toList()

        var numDone = 1
        for (i in keys) {
            log.info("Process file number {}", numDone)
            ++numDone
            val key = String.format("path/to/photo_%05d.jpg", i)
            doDownload(key)
        }
    }
}

Context

We provide photos to our customers and are trying to provide a “download all” function. To do this we need to stream the data from S3 through some code that incrementally adds it to a zip file which is provided to the customer. Since there’s so much data we can’t hold it all in RAM so we need to do this async-streaming processing. We have our own AsyncResponseTransformer which does this but wanted to provide an example with as little of our code as possible. AsyncResponseTransformer.toFile and our code experience the same error.

Your Environment

  • AWS Java SDK version used: 2.5.4
  • JDK version used: Kotlin 1.2.71 running on Java 8 JVM
  • Operating System and version: have reproduced on Debian Linux Stretch and OSX

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 50 (24 by maintainers)

Commits related to this issue

Most upvoted comments

Hi @jaylynstoesz, I see you’re using SQS so can you please open a new github issue, with a sample code we can use to repro? Setting connectionMaxIdleTime to 5 seconds is a suggestion tailored for S3 clients.

I’m seeing this issue in the latest version of the SDK. I’m reading from SQS via Alpakka using the async client. I tried setting the connectionMaxIdleTime to 5 seconds and I’m still getting the same error. Kotlin: 1.3.41 Netty & SQS Java SDK: 2.9.15

java.util.concurrent.CompletionException: software.amazon.awssdk.core.exception.SdkClientException
	at software.amazon.awssdk.utils.CompletableFutureUtils.errorAsCompletionException(CompletableFutureUtils.java:61)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncExecutionFailureExceptionReportingStage.lambda$execute$0(AsyncExecutionFailureExceptionReportingStage.java:51)
	at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
	at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
	at software.amazon.awssdk.utils.CompletableFutureUtils.lambda$forwardExceptionTo$0(CompletableFutureUtils.java:75)
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryExecutor.retryErrorIfNeeded(AsyncRetryableStage.java:175)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryExecutor.retryIfNeeded(AsyncRetryableStage.java:126)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryExecutor.lambda$execute$0(AsyncRetryableStage.java:107)
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage$ResponseHandler.onError(MakeAsyncHttpRequestStage.java:249)
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.lambda$notifyIfResponseNotCompleted$2(ResponseHandler.java:397)
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.runAndLogError(ResponseHandler.java:180)
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.notifyIfResponseNotCompleted(ResponseHandler.java:397)
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.channelInactive(ResponseHandler.java:148)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:236)
	at io.netty.handler.logging.LoggingHandler.channelInactive(LoggingHandler.java:167)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:236)
	at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:81)	
at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:223)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:236)
	at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:390)
	at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:355)
	at io.netty.handler.ssl.SslHandler.channelInactive(SslHandler.java:1078)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:236)
	at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:81)
	at io.netty.handler.timeout.IdleStateHandler.channelInactive(IdleStateHandler.java:278)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:236)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1403)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243)
	at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:912)
	at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:827)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:405)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:906)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at java.lang.Thread.run(Thread.java:748)
Caused by: software.amazon.awssdk.core.exception.SdkClientException
	at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:97)
	at software.amazon.awssdk.core.internal.util.ThrowableUtils.asSdkException(ThrowableUtils.java:98)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryExecutor.retryIfNeeded(AsyncRetryableStage.java:125)
	... 52 more
Caused by: java.io.IOException: Server failed to send complete response
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.notifyIfResponseNotCompleted(ResponseHandler.java:396)
	... 43 more

Update: unfortunately, it will take longer time than expected to fix this issue.

UPDATE: Please note that using HTTP is not as secure as HTTPS and is vulnerable to security attacks https://en.wikipedia.org/wiki/HTTPS#Difference_from_HTTP. We do not recommend using it.

Hi @pkgonan. For error messages that include Channel was closed before it could be written to specifically, it is likely related to a pending fix that is currently being worked on in https://github.com/aws/aws-sdk-java-v2/pull/2883 (and I believe this is unrelated to the issue that this thread was originally created for). Please follow that PR and let us know if it resolves your issue once it is released.

We are experiencing a similar issue, but with DynamoDB:

  • we execute concurrently about 1000 writes & reads /second to a dynamo table, per instance of the application.

The application is a worker, consuming from kinesis, using KCL, and reading and writing to dynamo.

  • Note: we tried with KCL v1 and v2, and we don’t know why, but the bug happens more quickly when we use the KCL v2. The application is written in kotlin 1.3.21 and uses coroutines 1.1.1. We adapt the reactive-streams from aws-sdk-java-v2 using the default kotlin-coroutines adapter.

The stacktrace when using aws-sdk-java-v2 2.5.11: Note: when we tried the version aws sdk 2.5.25, the exception is not thrown, but the code is stuck forever trying to read/write from dynamo (we call .await() in a GlobalScope.async {}, waiting for dynamo complete).

software.amazon.awssdk.core.exception.SdkClientException: null
    at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:97)
    at software.amazon.awssdk.core.internal.util.ThrowableUtils.asSdkException(ThrowableUtils.java:98)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryExecutor.retryIfNeeded(AsyncRetryableStage.java:123)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryExecutor.lambda$execute$0(AsyncRetryableStage.java:105)
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage$ResponseHandler.onError(MakeAsyncHttpRequestStage.java:241)
    at software.amazon.awssdk.http.nio.netty.internal.NettyRequestExecutor.handleFailure(NettyRequestExecutor.java:237)
    at software.amazon.awssdk.http.nio.netty.internal.NettyRequestExecutor.lambda$writeRequest$6(NettyRequestExecutor.java:191)
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:511)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:504)
    at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:483)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:424)
    at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:121)
    at io.netty.util.internal.PromiseNotificationUtil.tryFailure(PromiseNotificationUtil.java:64)
    at io.netty.channel.DelegatingChannelPromiseNotifier.operationComplete(DelegatingChannelPromiseNotifier.java:57)
    at io.netty.channel.DelegatingChannelPromiseNotifier.operationComplete(DelegatingChannelPromiseNotifier.java:31)
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:511)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:504)
    at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:483)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:424)
    at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:121)
    at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:991)
    at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:873)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1365)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:816)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
    at io.netty.handler.timeout.WriteTimeoutHandler.write(WriteTimeoutHandler.java:113)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:816)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
    at io.netty.handler.ssl.SslHandler.finishWrap(SslHandler.java:894)
    at io.netty.handler.ssl.SslHandler.wrap(SslHandler.java:880)
    at io.netty.handler.ssl.SslHandler.wrapAndFlush(SslHandler.java:797)
    at io.netty.handler.ssl.SslHandler.flush(SslHandler.java:778)
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
    at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.flush(CombinedChannelDuplexHandler.java:533)
    at io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115)
    at io.netty.channel.CombinedChannelDuplexHandler.flush(CombinedChannelDuplexHandler.java:358)
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
    at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
    at io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:117)
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
    at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
    at io.netty.handler.logging.LoggingHandler.flush(LoggingHandler.java:265)
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:802)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:814)
    at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:794)
    at com.typesafe.netty.http.HttpStreamsHandler.completeBody(HttpStreamsHandler.java:298)
    at com.typesafe.netty.http.HttpStreamsHandler.access$400(HttpStreamsHandler.java:14)
    at com.typesafe.netty.http.HttpStreamsHandler$3$1.run(HttpStreamsHandler.java:276)
    at com.typesafe.netty.http.HttpStreamsHandler.executeInEventLoop(HttpStreamsHandler.java:342)
    at com.typesafe.netty.http.HttpStreamsHandler.access$300(HttpStreamsHandler.java:14)
    at com.typesafe.netty.http.HttpStreamsHandler$3.complete(HttpStreamsHandler.java:273)
    at com.typesafe.netty.HandlerSubscriber$3.operationComplete(HandlerSubscriber.java:244)
    at com.typesafe.netty.HandlerSubscriber$3.operationComplete(HandlerSubscriber.java:241)
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:511)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:504)
    at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:483)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:424)
    at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:121)
    at io.netty.util.internal.PromiseNotificationUtil.tryFailure(PromiseNotificationUtil.java:64)
    at io.netty.channel.DelegatingChannelPromiseNotifier.operationComplete(DelegatingChannelPromiseNotifier.java:57)
    at io.netty.channel.DelegatingChannelPromiseNotifier.operationComplete(DelegatingChannelPromiseNotifier.java:31)
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:511)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:504)
    at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:483)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:424)
    at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:121)
    at io.netty.util.internal.PromiseNotificationUtil.tryFailure(PromiseNotificationUtil.java:64)
    at io.netty.channel.DelegatingChannelPromiseNotifier.operationComplete(DelegatingChannelPromiseNotifier.java:57)
    at io.netty.channel.DelegatingChannelPromiseNotifier.operationComplete(DelegatingChannelPromiseNotifier.java:31)
    at io.netty.channel.AbstractCoalescingBufferQueue.releaseAndCompleteAll(AbstractCoalescingBufferQueue.java:340)
    at io.netty.channel.AbstractCoalescingBufferQueue.releaseAndFailAll(AbstractCoalescingBufferQueue.java:207)
    at io.netty.handler.ssl.SslHandler.releaseAndFailAll(SslHandler.java:1591)
    at io.netty.handler.ssl.SslHandler.access$900(SslHandler.java:166)
    at io.netty.handler.ssl.SslHandler$5.run(SslHandler.java:1785)
    at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
    at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:127)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:495)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:905)
    at java.lang.Thread.run(Thread.java:748)
java.nio.channels.ClosedChannelException: null
    at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)

Are these issues related? I’m thinking in use this lib: https://github.com/bradynpoulsen/aws-sdk-kotlin-coroutines/, that is a wrapper for the aws-sdk-java v1, but with support to coroutines out of the box. It should be less efficient, but it may work.

I tried to use http as the protocol in dynamo endpoint and also set connectionMaxIdleTime to 5 seconds, without success. The problem happens after some minutes that the application is running when using with KCL v2, and after a greater period when using KCL v1.

I think we’ve experienced this same behavior before, with s3, but we change to use s3 from aws v1, and the issue has gone away. Now, we are experiencing the issue with dynamo.

Environment:

  • Java 8, kotlin 1.3.21, coroutines 1.1.1.
  • Linux debian (from openjdk:8u162-jdk-slim docker image)
  • aws-sdk-java-v2: 2.5.11 & 2.5.25 (the later without the exception, but the code is stuck forever).