aws-sdk-java-v2: Rapid S3 downloads causes failure
I have code using the Java SDK to download a few hundred photos from an S3 bucket. The first bunch work but things start to fail around photo number 100. It does not always fail on the same file. It seems very similar to #452 though that issue is closed. Note that I’m using S3AsyncClient
but downloading the file sequentially, never downloading more than one concurrently.
The photos are about 500KB each.
Expected Behavior
I should be able to download all the objects in my bucket
Current Behavior
Am able to download many files but after about 100 requests star to fail. The exception is:
java.util.concurrent.CompletionException: software.amazon.awssdk.core.exception.SdkClientException
at software.amazon.awssdk.utils.CompletableFutureUtils.errorAsCompletionException(CompletableFutureUtils.java:61)
at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncExecutionFailureExceptionReportingStage.lambda$execute$0(AsyncExecutionFailureExceptionReportingStage.java:50)
at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryExecutor.retryErrorIfNeeded(AsyncRetryableStage.java:167)
at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryExecutor.retryIfNeeded(AsyncRetryableStage.java:118)
at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryExecutor.lambda$execute$0(AsyncRetryableStage.java:103)
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage$ResponseHandler.onError(MakeAsyncHttpRequestStage.java:241)
at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.lambda$notifyIfResponseNotCompleted$3(ResponseHandler.java:398)
at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.runAndLogError(ResponseHandler.java:179)
at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.notifyIfResponseNotCompleted(ResponseHandler.java:398)
at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.channelInactive(ResponseHandler.java:148)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.handler.logging.LoggingHandler.channelInactive(LoggingHandler.java:167)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
at io.netty.handler.timeout.IdleStateHandler.channelInactive(IdleStateHandler.java:277)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:420)
at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:390)
at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:355)
at io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:282)
at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:223)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:390)
at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:355)
at io.netty.handler.ssl.SslHandler.channelInactive(SslHandler.java:1054)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
at io.netty.handler.timeout.IdleStateHandler.channelInactive(IdleStateHandler.java:277)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1403)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:912)
at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:826)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:495)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:905)
at java.lang.Thread.run(Thread.java:748)
Caused by: software.amazon.awssdk.core.exception.SdkClientException: null
at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:97)
at software.amazon.awssdk.core.internal.util.ThrowableUtils.asSdkException(ThrowableUtils.java:98)
at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryExecutor.retryIfNeeded(AsyncRetryableStage.java:117)
... 63 common frames omitted
Caused by: java.io.IOException: Server failed to send complete response
at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.notifyIfResponseNotCompleted(ResponseHandler.java:397)
... 54 common frames omitted
With full debug logs enabled there’s a ton of output but the response from the Netty network logs from right before things crash is:
11:16:18.311 [aws-java-sdk-NettyEventLoop-1-3] DEBUG i.n.handler.logging.LoggingHandler - [id: 0xffce7082, L:/192.168.0.165:59528 - R:com-revl-revlx.s3.us-west-2.amazonaws.com/52.218.224.248:443] READ: DefaultLastHttpContent(data: PooledSlicedByteBuf(ridx: 0, widx: 254, cap: 254/254, unwrapped: PooledUnsafeDirectByteBuf(ridx: 8446, widx: 8446, cap: 8475)), decoderResult: success), 254B
+-------------------------------------------------+
| 0 1 2 3 4 5 6 7 8 9 a b c d e f |
+--------+-------------------------------------------------+----------------+
|00000000| 5d 52 e1 13 91 3f b5 10 51 7b 34 f6 79 a0 39 aa |]R...?..Q{4.y.9.|
|00000010| 3a 4d fc ce 3c 68 3b 15 b5 ad 02 63 d4 f7 a0 78 |:M..<h;....c...x|
|00000020| 92 29 39 d5 39 01 a9 44 b9 c6 33 2a 8d 7e 91 f3 |.)9.9..D..3*.~..|
|00000030| 28 ea 71 58 a6 09 25 58 2f a8 4e 94 f1 24 02 65 |(.qX..%X/.N..$.e|
|00000040| a7 b5 48 11 7c 6a 95 05 d2 a8 b0 93 a1 02 d8 57 |..H.|j.........W|
|00000050| 11 04 21 ad 32 b4 2d 94 16 99 af 0e f4 00 41 2a |..!.2.-.......A*|
|00000060| 80 04 55 04 32 94 41 38 93 40 01 11 25 02 94 89 |..U.2.A8.@..%...|
|00000070| 1a 39 2a 2d 84 b0 04 12 d7 2a cd 22 60 a7 34 27 |.9*-.....*."`.4'|
|00000080| 40 94 0f 0d 0d 53 0d 85 38 5d bb 24 ea af 94 32 |@....S..8].$...2|
|00000090| 84 80 24 ad 3a 63 0c e8 f8 77 85 44 8a a0 14 1a |..$.:c...w.D....|
|000000a0| 02 74 27 51 ab e6 25 02 c6 4e a8 ed 4e 64 68 4a |.t'Q..%..N..NdhJ|
|000000b0| 4e 84 09 cf 8c 7a 0e f4 65 4c f8 2b 00 28 14 f1 |N....z..eL.+.(..|
|000000c0| 98 f0 54 0a a2 a0 24 9b 48 42 2e ed 20 ad ae 53 |..T...$.HB.. ..S|
|000000d0| a8 f0 12 a9 80 9a 05 11 a3 53 2a 72 4f 3d 70 ae |.........S*rO=p.|
|000000e0| 07 9a 61 20 27 e7 8c 27 27 2d 6a a3 ff d9 f2 d3 |..a '..''-j.....|
|000000f0| 22 9e 33 37 d2 2e 06 03 e0 06 a6 fc 72 02 |".37........r. |
+--------+-------------------------------------------------+----------------+
11:16:18.312 [aws-java-sdk-NettyEventLoop-1-3] DEBUG i.n.handler.logging.LoggingHandler - [id: 0xffce7082, L:/192.168.0.165:59528 ! R:com-revl-revlx.s3.us-west-2.amazonaws.com/52.218.224.248:443] READ COMPLETE
11:16:18.312 [aws-java-sdk-NettyEventLoop-1-3] DEBUG i.n.handler.logging.LoggingHandler - [id: 0xffce7082, L:/192.168.0.165:59528 ! R:com-revl-revlx.s3.us-west-2.amazonaws.com/52.218.224.248:443] INACTIVE
11:16:18.316 [aws-java-sdk-NettyEventLoop-1-3] DEBUG i.n.handler.logging.LoggingHandler - [id: 0xffce7082, L:/192.168.0.165:59528 ! R:com-revl-revlx.s3.us-west-2.amazonaws.com/52.218.224.248:443] CLOSE
11:16:18.316 [aws-java-sdk-NettyEventLoop-1-3] DEBUG i.n.handler.logging.LoggingHandler - [id: 0xffce7082, L:/192.168.0.165:59528 ! R:com-revl-revlx.s3.us-west-2.amazonaws.com/52.218.224.248:443] UNREGISTERED
Steps to Reproduce (for bugs)
Here is a slightly modified version of the Kotlin code that produces the above (modified to remove company sensitive stuff):
object Main {
val log = getLogger { }
val s3AsyncClient = S3AsyncClient.builder()
.credentialsProvider(DefaultCredentialsProvider.create())
.region(Region.US_WEST_2)
// retries disable here because we can't retry in our situation - the real use case is to stream
// data from a server to a customer. Note that you get the same failures even with retries -- it
// just takes longer
.overrideConfiguration { config ->
config.retryPolicy(RetryPolicy.none()).build()
}
.build()
val daBucket = "com-revl-revlx"
fun doDownload(key: String) {
val outFile = Paths.get("/tmp/experiment/").resolve(key)
outFile.parent.toFile().mkdirs()
if (outFile.toFile().exists()) {
outFile.toFile().delete()
}
log.info("Downloading via toFile: {}", key)
val request = GetObjectRequest.builder()
.bucket(daBucket)
.key(key)
.build()
val future = s3AsyncClient.getObject(request, AsyncResponseTransformer.toFile(outFile))
try {
future.join()
log.info("Download of {} succeeded", key)
} catch (t: Throwable) {
log.error("Download of {} failed with:", key, t)
throw t
}
}
@JvmStatic
fun main(argv: Array<String>) {
ThreadUtils.setLoggingDefaultExceptionHandler()
val keys = (1..240).toList()
var numDone = 1
for (i in keys) {
log.info("Process file number {}", numDone)
++numDone
val key = String.format("path/to/photo_%05d.jpg", i)
doDownload(key)
}
}
}
Context
We provide photos to our customers and are trying to provide a “download all” function. To do this we need to stream the data from S3 through some code that incrementally adds it to a zip file which is provided to the customer. Since there’s so much data we can’t hold it all in RAM so we need to do this async-streaming processing. We have our own AsyncResponseTransformer
which does this but wanted to provide an example with as little of our code as possible. AsyncResponseTransformer.toFile
and our code experience the same error.
Your Environment
- AWS Java SDK version used: 2.5.4
- JDK version used: Kotlin 1.2.71 running on Java 8 JVM
- Operating System and version: have reproduced on Debian Linux Stretch and OSX
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 1
- Comments: 50 (24 by maintainers)
Commits related to this issue
- Merge pull request #1122 from aws/zoewang/addIotWireless Exclude iotwireless — committed to aws/aws-sdk-java-v2 by zoewangg 4 years ago
Hi @jaylynstoesz, I see you’re using SQS so can you please open a new github issue, with a sample code we can use to repro? Setting
connectionMaxIdleTime
to 5 seconds is a suggestion tailored for S3 clients.I’m seeing this issue in the latest version of the SDK. I’m reading from SQS via Alpakka using the async client. I tried setting the
connectionMaxIdleTime
to 5 seconds and I’m still getting the same error. Kotlin: 1.3.41 Netty & SQS Java SDK: 2.9.15Update: unfortunately, it will take longer time than expected to fix this issue.
UPDATE: Please note that using HTTP is not as secure as HTTPS and is vulnerable to security attacks https://en.wikipedia.org/wiki/HTTPS#Difference_from_HTTP. We do not recommend using it.
Hi @pkgonan. For error messages that include
Channel was closed before it could be written to
specifically, it is likely related to a pending fix that is currently being worked on in https://github.com/aws/aws-sdk-java-v2/pull/2883 (and I believe this is unrelated to the issue that this thread was originally created for). Please follow that PR and let us know if it resolves your issue once it is released.We are experiencing a similar issue, but with DynamoDB:
The application is a worker, consuming from kinesis, using KCL, and reading and writing to dynamo.
The stacktrace when using aws-sdk-java-v2
2.5.11
: Note: when we tried the version aws sdk 2.5.25, the exception is not thrown, but the code is stuck forever trying to read/write from dynamo (we call.await()
in aGlobalScope.async {}
, waiting for dynamo complete).Are these issues related? I’m thinking in use this lib: https://github.com/bradynpoulsen/aws-sdk-kotlin-coroutines/, that is a wrapper for the aws-sdk-java v1, but with support to coroutines out of the box. It should be less efficient, but it may work.
I tried to use
http
as the protocol in dynamo endpoint and also setconnectionMaxIdleTime
to 5 seconds, without success. The problem happens after some minutes that the application is running when using with KCL v2, and after a greater period when using KCL v1.I think we’ve experienced this same behavior before, with s3, but we change to use s3 from aws v1, and the issue has gone away. Now, we are experiencing the issue with dynamo.
Environment: