okhttp: java.net.SocketTimeoutException from HTTP/2 connection leaves dead okhttp clients in pool

Tried writing a unit test w/ TestButler on Android w/ no luck, so I’ll write up the steps to reproduce this and include some sample code. This happens if you connect to an HTTP/2 server and your network goes down while the okhttp client is connected to it:

1. create an okhttp client
2. tell it to read from the HTTP/2 server
3. bring the network down
4. tell it to read from the HTTP/2 server (it’ll get a SocketTimeoutException)
5. bring the network back up
6. tell it to read from the HTTP/2 server again (it’ll be stuck w/ SocketTimeoutExceptions)

If you create new http clients at this point, they’ll work, but the dead http client will eventually come back in the pool and fail.

okhttp client should attempt to reopen the HTTP/2 connection instead of being stuck in this state

Code sample for Android (create a trivial view w/ a button and a textview):

public class MainActivity extends AppCompatActivity {
    OkHttpClient okhttpClient = new OkHttpClient();

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        Button loadButton = (Button) findViewById(R.id.loadButton);
        TextView outputView = (TextView) findViewById(R.id.outputView);

        loadButton.setOnClickListener(view -> Observable.fromCallable(() -> {
                    Request request = new Request.Builder()
                            .url("<INSERT URL TO YOUR HTTP/2 SERVER HERE>")
                            .build();

                    Response response = okhttpClient.newCall(request).execute();

                    return response.body().string();
                })
                .subscribeOn(Schedulers.io())
                .observeOn(AndroidSchedulers.mainThread())
                .subscribe(outputView::setText, t -> outputView.setText(t.toString()))
        );
    }
}

About this issue

State: closed
Created 7 years ago
Reactions: 31
Comments: 148 (21 by maintainers)

Commits related to this issue

Configure http/2 ping interval Help reap zombie connections. https://github.com/square/okhttp/issues/3146#issuecomment-418168869 — committed to spotify/styx by deleted user 6 years ago
Added workaround for okhttp issue https://github.com/square/okhttp/issues/3146 — committed to snabble/Android-SDK by ajungg 5 years ago
Force HTTP/1.1 in the player OkHttp has a bug leading to the player hanging when rapidly seeking through the video: https://github.com/square/okhttp/issues/3146. — committed to proxer/ProxerAndroid by rubengees 5 years ago
TaskRunner, an abstraction over ExecutorService I want to tighten up our executors for a few reasons - Fix daemon vs. non-daemon problems - Fix code unloading problems - Be able to wait for async ... — committed to square/okhttp by swankjesse 5 years ago
Degrade connections after a timeout This is based roughly on the 'Degraded Connections' proposal here https://github.com/square/okhttp/issues/3146#issuecomment-471196032 I'm using 1000 ms instead of... — committed to square/okhttp by swankjesse 5 years ago
Degrade connections after a timeout (3.14.x branch) This is a manual cherry-pick of 09da07c2c8981f88346adb818ce42512d9f2f288 See also the degraded connections proposal. https://github.com/square/okh... — committed to square/okhttp by swankjesse 4 years ago
Degrade connections after a timeout (3.12.x branch) This is a cherry-pick of 6a9a64c8f131b33bdd9b7077ce4e2456db0dcd19 See also the degraded connections proposal. https://github.com/square/okhttp/iss... — committed to square/okhttp by swankjesse 4 years ago
Workaround for https://github.com/square/okhttp/issues/3146 — committed to SonarSource/orchestrator by henryju 4 years ago
Bump okhttp3 to 3.14.9. According to https://github.com/square/okhttp/issues/3146#issuecomment-569986444 an issue with stale connections that caused SocketTimeoutException errors was fixed in 3.14.5. — committed to atlassian-labs/atlassian-slack-integration-server by utluiz 3 years ago
Bump okhttp3 to 3.14.9. According to https://github.com/square/okhttp/issues/3146#issuecomment-569986444 an issue with stale connections that caused SocketTimeoutException errors was fixed in 3.14.5. — committed to mgoyal2-atl/atlassian-slack-integration-server by utluiz 3 years ago
Workaround for OkHttp Interrupt issues. Relates to https://github.com/square/okhttp/issues/3146. This was from https://github.com/androidx/media/pull/71. There is a draft PR https://github.com/squar... — committed to androidx/media by yschimke 2 years ago
Workaround for OkHttp Interrupt issues. Relates to https://github.com/square/okhttp/issues/3146. This was from https://github.com/androidx/media/pull/71. There is a draft PR https://github.com/squar... — committed to google/ExoPlayer by yschimke 2 years ago
[#185024471] Initial Subscription inaccuracies still exist - [x] close down all cached connections upon socket time out (see)[https://github.com/square/okhttp/issues/3146] - [x] units and impls — committed to xenonview-com/view-java-sdk by lwoydziak a year ago

Most upvoted comments

I think I’m seeing another manifestation of this on 3.5.0, when the server forcibly closes the connection.

We try to establish both an h2 and an HTTP/1.1 connection. The server responds with 200 to both:

06-26 15:07:55.286 22094 22380 I okhttp3.OkHttpClient: --> GET<url> http/1.1
06-26 15:07:55.524 22094 22380 I okhttp3.OkHttpClient: --> GET<url> h2

06-26 15:07:55.596 22094 22380 I okhttp3.OkHttpClient: <-- 200  <url> (71ms)
06-26 15:07:55.597 22094 22380 I okhttp3.OkHttpClient: <-- 200  <url> (303ms)

Then at some point we try to read from the http2 connection, which fails in checkNotClosed and throws a StreamResetException

06-26 15:06:01.560 22094 22126 I MyProject: Caused by: okhttp3.internal.http2.StreamResetException: stream was reset: PROTOCOL_ERROR
06-26 15:06:01.560 22094 22126 I MyProject: 	at okhttp3.internal.http2.Http2Stream$FramedDataSource.checkNotClosed(Http2Stream.java:428)
06-26 15:06:01.560 22094 22126 I MyProject: 	at okhttp3.internal.http2.Http2Stream$FramedDataSource.read(Http2Stream.java:330)
06-26 15:06:01.560 22094 22126 I MyProject: 	at okio.ForwardingSource.read(ForwardingSource.java:35)
06-26 15:06:01.560 22094 22126 I MyProject: 	at okio.RealBufferedSource$1.read(RealBufferedSource.java:409)
06-26 15:06:01.560 22094 22126 I MyProject: 	at com.google.android.exoplayer.upstream.HttpDataSource.read(HttpDataSourceImpl.java:699)
06-26 15:06:01.560 22094 22126 I MyProject: 	at com.google.android.exoplayer.upstream.HttpDataSource.read(HttpDataSourceImpl.java:424)

Then, since this is media, we do something that causes a seek to 0 in the media, which needs to reopen the request from the beginning. At this point, we see the same exception as is posted above:

06-26 15:08:39.387 22094 22126 I MyProject: Caused by: java.net.SocketTimeoutException: timeout
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http2.Http2Stream$StreamTimeout.newTimeoutException(Http2Stream.java:587)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http2.Http2Stream$StreamTimeout.exitAndThrowIfTimedOut(Http2Stream.java:595)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http2.Http2Stream.getResponseHeaders(Http2Stream.java:140)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http2.Http2Codec.readResponseHeaders(Http2Codec.java:115)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:54)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.logging.HttpLoggingInterceptor.intercept(HttpLoggingInterceptor.java:212)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.logging.HttpLoggingInterceptor.intercept(HttpLoggingInterceptor.java:212)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:179)
06-26 15:08:39.387 22094 22126 I MyProject: 	at okhttp3.RealCall.execute(RealCall.java:63)

This seems very similar to the other cases here, which all appear to be related to a connection that was shut down ungracefully but remains pooled.

I’ve also confirmed that disabling the ConnectionPool “works around” this issue:

OkHttpClient.Builder clientBuilder = new OkHttpClient.Builder()
        .connectTimeout(connectTimeoutMillis, TimeUnit.MILLISECONDS)
        .retryOnConnectionFailure(true)
        .readTimeout(readTimeoutMillis, TimeUnit.MILLISECONDS)
        .connectionPool(new ConnectionPool(0, 1, TimeUnit.NANOSECONDS));

+43

jpearl on Jun 26, 2017

Also still getting this problem on emulator with api 22, and 3.14.4. Also I get a SocketTimeoutException after 2 minutes (what my readTimeout is set to), instead of 10 seconds (what my connectTimeout is set to). The workaround using .connectionPool(new ConnectionPool(0, 1, TimeUnit.NANOSECONDS)) still works. I’d say it’s time to re-open this 😦. Steps to reproduce are same as OP.

I can confirm the issue doesn’t exist when using a real device Note 9, API 29.

+13

CarsonRedeye on Mar 20, 2020

Degraded Connections

Here’s a proposal for a fix.

When the HTTP/2 reader hasn’t received any frames for 500 ms and a stream times out on a read, we degrade the HTTP/2 connection by setting a new degraded field to true. The connection remains degraded until any data is received. The connection pool will not return degraded connections. Instead it will establish new connections.

When a connection becomes degraded we also send a degraded ping and set a new awaitingDegradedPong field to true. We have at most one degraded ping in flight at a time. The motivation of this ping is to trigger a pong to be received.

500 ms?

Thrashing in and out of the degraded state will be bad for performance if a busy connection has a few bad streams. If the connection has received something within 500 ms, it’s likely a bad stream and not a bad connection.

Interaction with Ping Interval?

The pings here are independent of the OkHttpClient’s pingInterval, if one is set.

Drawbacks

The HTTP/2 code is pretty busy already, and this adds more. Keeping a timestamp of the most recent frame could be particularly annoying. We should use nanoTime(), not currentTimeMillis() for this.

This addresses read timeouts only. We can’t ping our way out of write timeouts; the pings will be queued up behind other outbound data! I need to study this further.

+11

swankjesse on Mar 9, 2019
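
The bookkeeping in the degraded-connections proposal above can be sketched roughly as follows. This is only an illustration under stated assumptions, not OkHttp's implementation: the class name, the method names, and the idea of passing timestamps in explicitly are all hypothetical, and only the 500 ms threshold and the degraded / awaitingDegradedPong flags come from the proposal. In real use the timestamps would come from System.nanoTime().

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the proposal's state; every name here is an assumption. */
final class DegradedConnectionTracker {
  /** A frame seen more recently than this suggests a bad stream, not a bad connection. */
  static final long DEGRADED_THRESHOLD_NANOS = TimeUnit.MILLISECONDS.toNanos(500);

  private long lastFrameAtNanos;
  private boolean degraded;
  private boolean awaitingDegradedPong;

  DegradedConnectionTracker(long nowNanos) {
    this.lastFrameAtNanos = nowNanos;
  }

  /** Call whenever the reader receives any frame: receiving data ends the degraded state. */
  synchronized void onFrameReceived(long nowNanos) {
    lastFrameAtNanos = nowNanos;
    degraded = false;
  }

  /** Call when the pong for a degraded ping arrives; a pong is also a received frame. */
  synchronized void onDegradedPongReceived(long nowNanos) {
    awaitingDegradedPong = false;
    onFrameReceived(nowNanos);
  }

  /**
   * Call when a stream read times out. Returns true if the caller should send a
   * degraded ping now; at most one such ping is in flight at a time.
   */
  synchronized boolean onStreamReadTimeout(long nowNanos) {
    if (nowNanos - lastFrameAtNanos < DEGRADED_THRESHOLD_NANOS) {
      return false; // Something arrived recently: likely a bad stream, so do not degrade.
    }
    degraded = true;
    if (awaitingDegradedPong) {
      return false; // A degraded ping is already in flight.
    }
    awaitingDegradedPong = true;
    return true;
  }

  /** A connection pool would skip this connection while this returns false. */
  synchronized boolean isEligibleForReuse() {
    return !degraded;
  }
}
```

Timestamps are compared by subtraction, which is the overflow-safe way to compare nanoTime() values. Like the proposal, this covers read timeouts only.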

FYI, we found a workaround…set the connectionPool in the builder so it uses a new connection pool w/ a size of zero and also turn off HTTP/2 support by setting a new protocolList in the builder with only HTTP/1.1 support.

+10

kenyee on Jan 31, 2017

This is fixed in 4.3. Keeping this open until I backport #5638 to 3.12.x and 3.14.x.

swankjesse on Dec 31, 2019

Guys, be careful with the temporary workaround of disabling the connection pool cache:

new ConnectionPool(0, 1, TimeUnit.NANOSECONDS)

We began to receive a lot of complaints from our users about our app hanging, so we started to explore and profile the app to see what the problem might be. After a lot of searching we found that the app was allocating a lot of objects in a short amount of time. First we saw a lot of these garbage collector logs:

zygote: Background concurrent copying GC freed 112219(7MB) AllocSpace objects, 8(2MB) LOS objects, 59% free, 6MB/15MB, paused 538us total 114.061ms

Then we found this when profiling the app.

Normally you would find primitive types like “int”, “char”, etc. in the first position of a dump.

[Image: okhttp_error]

Every time we make a request, a new connection is put in the ConnectionPool, which triggers a cleanupRunnable, which in turn runs a while(true) loop. Inside this infinite loop a method cleanUp() is called that loops over the connections list using an iterator of an ArrayDeque, which creates a new Deque object every time it is called, thus allocating Deque objects without mercy. Because of the rate of object creation, the GC kicks in many times to try to free up memory, and that had a side effect: it was blocking our app’s background threads, and thus the app flow.

The GC was in concurrent mode, and this mode is not supposed to block app threads, but in reality they were being blocked anyway.

These allocated Deque objects are eventually destroyed by the GC, but the issue here is the rate of object creation, which triggers the GC many times whenever an HTTP request is made.

ruieduardosoares on Jul 18, 2019

Still having problems with 3.12.12 on Samsung Galaxy A7 (2018) SM-A750FN/DS, Android 10 (One UI 2.0).

Unless I set custom parameters as mentioned above:

.connectionPool(ConnectionPool(0, 5, TimeUnit.MINUTES))
.protocols(listOf(Protocol.HTTP_1_1))

janbolat on Jun 13, 2020

okhttp 3.11: the SocketTimeoutException is still not fixed; it still appears.

williams98 on Jul 25, 2018

Thanks for the repro.

yschimke on Oct 1, 2021

You don’t need to disable the connectionPool. Just put the following code inside your BroadcastReceiver, so it runs when the network changes:

BroadcastReceiver networkStateReceiver = new BroadcastReceiver() {
    @Override
    public void onReceive(Context context, Intent intent) {
        final ConnectivityManager connectivityManager =
                (ConnectivityManager) context.getSystemService(Context.CONNECTIVITY_SERVICE);
        final NetworkInfo activeNetInfo = connectivityManager.getActiveNetworkInfo();
        if (activeNetInfo != null) {
            // Back online: clear the connections pooled while offline.
            getInstance().okClient.connectionPool().evictAll();
        }
    }
};
IntentFilter filter = new IntentFilter(ConnectivityManager.CONNECTIVITY_ACTION);
ApplicationLoader.appContext.registerReceiver(networkStateReceiver, filter);

tucomel on Aug 12, 2018

As socket timeout exception is an instance of IO exception, I am not sure if the following approach will work. Can one of you pls get back to me?

I am calling evictAll() in the catch block of IOException.

try {
    response = client.newCall(request).execute();
    statusCode = response.code();
    responseBody = response.body().string();
} catch (IOException ioe) {
    client.connectionPool().evictAll();
} finally {
    if (response != null) {
        response.body().close();
    }
}

Also how do we check if a connection is stale or not?

With Apache HttpClient, there is a way to do it to set a flag for checking stale connections. Wondering how OkHttp3 checks for it internally before it uses the connection.

CloseableHttpClient client = HttpClients.custom().setDefaultRequestConfig(
    RequestConfig.custom().setStaleConnectionCheckEnabled(true).build()
).setConnectionManager(connManager).build();

servlette on Mar 16, 2017

In the last month, since we had this issue crop up, we had 14 occurrences, across 5 OS versions, 6 manufacturers and 12 models.

OS versions:
Android 12 - 5 instances
Android 10 - 4 instances
Android 11 - 2 instances
Android 8.1.0 - 2 instances
Android 9 - 1 instance

Models:
Archos Alba - 2 instances
Samsung Galaxy A52s 5G - 1 instance
Xiaomi 11T Pro - 1 instance
Xiaomi Poco X3 NFC - 1 instance
Google Pixel 4A - 1 instance
Samsung Galaxy A12 - 1 instance
Samsung Galaxy S20 FE - 1 instance
Samsung Galaxy S8 - 1 instance
Samsung Galaxy S9 - 1 instance
Samsung Galaxy S9+ - 1 instance
Sony Xperia 10 III - 1 instance
Motorola E7 Power - 1 instance

I’ve just pushed an update to our users, changing the connection pool and protocols, as per one of the first posts.

.connectionPool(ConnectionPool(0, OKHTTP_CONNECTION_KEEP_ALIVE_DURATION, TimeUnit.MINUTES))
.protocols(listOf(Protocol.HTTP_1_1))

I’m unable to provide any more info for the time being, we’ve mostly run into this issue when using our ForceUpdateInterceptor to, well, force our users to update their application. Here is the code snippet:

class ForceUpdateInterceptor : Interceptor {

    companion object {
        private const val FORCE_UPDATE_HTTP_CODE = 443
    }

    @Throws(IOException::class)
    override fun intercept(chain: Interceptor.Chain): Response = chain.run {
        val request = this.request()
        val response = chain.proceed(request)
        if (response.code == FORCE_UPDATE_HTTP_CODE) RxBus.publish(RxEvent.ForceUpdateEvent())
        response
    }
}

I’ll report back on whether the issue still occurs with the aforementioned suggestion applied.

This was all with OkHttp version 5.0.0-alpha.7 and previous alphas.

rolandsarosy-verycreatives on Jun 1, 2022

I think the correct fix for now is in Media3/ExoPlayer, adding an explicit response.close()

  @Override
  public int read(byte[] buffer, int offset, int length) throws HttpDataSourceException {
    try {
      return readInternal(buffer, offset, length);
    } catch (IOException e) {
      if (e instanceof InterruptedIOException) {
        response.close();
      }

      throw HttpDataSourceException.createForIOException(
          e, castNonNull(dataSpec), HttpDataSourceException.TYPE_READ);
    }
  }

yschimke on Jan 17, 2022

Hi @swankjesse ,

I am able to reproduce this issue on ExoPlayer v2.15.1 (OkHttp v4.9.1).

E/ExoPlayerImplInternal: Playback error
      com.kaltura.android.exoplayer2.ExoPlaybackException: Source error
        at com.kaltura.android.exoplayer2.ExoPlayerImplInternal.handleIoException(ExoPlayerImplInternal.java:624)
        at com.kaltura.android.exoplayer2.ExoPlayerImplInternal.handleMessage(ExoPlayerImplInternal.java:596)
        at android.os.Handler.dispatchMessage(Handler.java:102)
        at android.os.Looper.loop(Looper.java:193)
        at android.os.HandlerThread.run(HandlerThread.java:65)
     Caused by: com.kaltura.android.exoplayer2.upstream.HttpDataSource$HttpDataSourceException: java.net.SocketTimeoutException: timeout
        at com.kaltura.android.exoplayer2.ext.okhttp.OkHttpDataSource.open(OkHttpDataSource.java:291)
        at com.kaltura.android.exoplayer2.upstream.DefaultDataSource.open(DefaultDataSource.java:201)
        at com.kaltura.android.exoplayer2.upstream.StatsDataSource.open(StatsDataSource.java:84)
        at com.kaltura.android.exoplayer2.source.chunk.ContainerMediaChunk.load(ContainerMediaChunk.java:124)
        at com.kaltura.android.exoplayer2.upstream.Loader$LoadTask.run(Loader.java:409)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
        at java.lang.Thread.run(Thread.java:764)
     Caused by: java.net.SocketTimeoutException: timeout
        at okhttp3.internal.http2.Http2Stream$StreamTimeout.newTimeoutException(Http2Stream.kt:677)
        at okhttp3.internal.http2.Http2Stream$StreamTimeout.exitAndThrowIfTimedOut(Http2Stream.kt:686)
        at okhttp3.internal.http2.Http2Stream.takeHeaders(Http2Stream.kt:143)
        at okhttp3.internal.http2.Http2ExchangeCodec.readResponseHeaders(Http2ExchangeCodec.kt:96)
        at okhttp3.internal.connection.Exchange.readResponseHeaders(Exchange.kt:106)
        at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.kt:79)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:34)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:201)
        at okhttp3.internal.connection.RealCall.execute(RealCall.kt:154)
        at com.kaltura.android.exoplayer2.ext.okhttp.OkHttpDataSource.open(OkHttpDataSource.java:286)
        at com.kaltura.android.exoplayer2.upstream.DefaultDataSource.open(DefaultDataSource.java:201) 
        at com.kaltura.android.exoplayer2.upstream.StatsDataSource.open(StatsDataSource.java:84) 
        at com.kaltura.android.exoplayer2.source.chunk.ContainerMediaChunk.load(ContainerMediaChunk.java:124) 
        at com.kaltura.android.exoplayer2.upstream.Loader$LoadTask.run(Loader.java:409) 
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167) 
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641) 
        at java.lang.Thread.run(Thread.java:764)

It is quite easy to reproduce, even on the ExoPlayer demo app. FYI @ojw28

I tried ExoPlayer’s demo content, https://storage.googleapis.com/wvmedia/clear/hevc/tears/tears.mpd, and forced the OkHttp client’s protocols:

 .protocols(Arrays.asList(Protocol.HTTP_1_1, Protocol.HTTP_2))

In Charles I can see that the protocol is HTTP/2.0 and the ALPN is h2.

Steps to reproduce:

1. Play the content.
2. Seek multiple times, and keep seeking.
3. In Charles, you will see that the chunk requests start failing (IO: Stream cancelled by CLIENT).
4. Eventually, ExoPlayer will throw an error with a source error.

GouravSna on Oct 1, 2021

@vellrya if you can reproduce this, we can fix it. As is it’s unclear what the cause is, and even if it’s in OkHttp and not the OS itself.

swankjesse on Mar 28, 2021

Problem still exists in OkHttp 4.2.0, 3.14.3, 3.12.5 - checked on genymotion emulator (turn on and off airplane mode)

kenumir on Sep 12, 2019

@swankjesse I think this (sending pings on a timer) is a bad solution. If an exception occurs, the connection is very probably broken. You say, “There are ways a stream will time out that don’t signal a connectivity problem.” That can happen when the read byteCount is set too large, but this situation is very rare. Code:

public long read(Buffer sink, long byteCount) throws IOException {
  long read = -1;
  ErrorCode errorCode;
  synchronized (Http2Stream.this) {
    waitUntilReadable(); // starts the watchdog timeout
    if (closed) {
      throw new IOException("stream closed");
    }
    errorCode = Http2Stream.this.errorCode;

    if (readBuffer.size() > 0) {
      // Move bytes from the read buffer into the caller's buffer.
      read = readBuffer.read(sink, Math.min(byteCount, readBuffer.size()));
      unacknowledgedBytesRead += read;
    }
  }
  // ...
}

So I maintain that the connection should be released when a TimeoutException occurs. Alternatively, a ping could be sent when the TimeoutException occurs.

Caij on Sep 10, 2018

I can confirm this is still an issue.

c0dehunter on Jul 25, 2018

It would be great to get a fix for this. Any release date?

auror on Feb 14, 2018

Pretty sure this issue is another manifestation of this one:

https://github.com/square/okhttp/issues/3118

laurencedawson on Feb 2, 2017

@robertszuba I’ll take a further look, since I was able to repro with ExoPlayer with the same symptom, I was focusing on that.

It’s likely these are two separate bugs in that case.

Your repro seems quite simple, I’ll try to reproduce with it on the weekend and get back to you.

yschimke on May 5, 2022

I can’t repro with pings on (still on an emulator), so if you are OK with the additional traffic, keeping the radio awake, etc., that is worth trying.

The more I look into doing smart things in Android, the more I suspect that the Android network engineers know what they are doing and the defaults are pretty good.

So far I just suspect a bug in the emulator network emulation.

yschimke on Jun 2, 2019

We definitely get enough events from Android we can choose to listen to, and actively drop the connection/force close the socket. But it’s non-trivial code.

It might be best implemented as a custom Android SocketFactory, that listens for changes to the active network, and ties each socket to the network at creation time (through either default active network, or by looking at the local address).

yschimke on May 31, 2019
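
The custom SocketFactory idea above could start from something like the following. It is a deliberately simplified sketch: instead of tying each socket to the network it was created on, it only remembers every socket it hands out so that they can all be force-closed at once. The class name and the closeAll() and trackedCount() methods are hypothetical, and the Android network-change listener that would call closeAll() is left out.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import javax.net.SocketFactory;

/** Hypothetical sketch: remembers the sockets it creates so they can be force-closed. */
final class TrackingSocketFactory extends SocketFactory {
  private final SocketFactory delegate;
  private final Set<Socket> sockets = ConcurrentHashMap.newKeySet();

  TrackingSocketFactory(SocketFactory delegate) {
    this.delegate = delegate;
  }

  private Socket track(Socket socket) {
    sockets.removeIf(Socket::isClosed); // Forget sockets that were closed normally.
    sockets.add(socket);
    return socket;
  }

  /** Intended to be called from the app's network-change callback. */
  void closeAll() {
    for (Socket socket : sockets) {
      try {
        socket.close();
      } catch (IOException ignored) {
        // Best effort: the socket is being abandoned anyway.
      }
    }
    sockets.clear();
  }

  int trackedCount() {
    return sockets.size();
  }

  @Override public Socket createSocket() throws IOException {
    return track(delegate.createSocket());
  }

  @Override public Socket createSocket(String host, int port) throws IOException {
    return track(delegate.createSocket(host, port));
  }

  @Override public Socket createSocket(String host, int port, InetAddress localHost,
      int localPort) throws IOException {
    return track(delegate.createSocket(host, port, localHost, localPort));
  }

  @Override public Socket createSocket(InetAddress host, int port) throws IOException {
    return track(delegate.createSocket(host, port));
  }

  @Override public Socket createSocket(InetAddress address, int port,
      InetAddress localAddress, int localPort) throws IOException {
    return track(delegate.createSocket(address, port, localAddress, localPort));
  }
}
```

With OkHttp, a factory like this could be passed to OkHttpClient.Builder.socketFactory(...). Closing every socket on any network change is cruder than the per-network bookkeeping described above, so treat it as a starting point only.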

I’ve got a good repro in a React Native test app that shows network state and the connection pool, so I’m going to explore the best options to resolve automatically within OkHttp.

yschimke on May 28, 2019

After some debugging we found the following (not sure if this helps).

From what I understand, this runnable, which keeps running while the connection is healthy, reads from the Http2Connection’s BufferedSource through an Http2Reader. The reader interprets each frame in the buffered data and calls back the handler, which is the runnable itself; the handler then finds the Http2Stream by ID and delegates the frame information to it.

When turning mobile data off and back on in the Android app, this Http2Connection.ReaderRunnable stopped working; I could no longer hit a breakpoint in the runnable.

ruieduardosoares on May 15, 2019

When the phone has been running in the background for a few minutes, the socket is effectively disconnected, but RealConnection.isHealthy() still returns true. At that point every request fails with a TimeoutException, and because the connection stays in the connection pool, subsequent requests fail the same way. The app must be killed and restarted to recover.

Caij on Sep 3, 2018

@swankjesse Thanks for the quick response. We tried setting pingInterval(1, TimeUnit.SECONDS) and it seems to be behaving properly now. I don’t want to say it’s fixed yet as we need to do more testing, but will report back after a few days.

c0dehunter on Jul 26, 2018

@c0dehunter try setting a ping interval on your OkHttpClient?

https://square.github.io/okhttp/3.x/okhttp/okhttp3/OkHttpClient.Builder.html#pingInterval-long-java.util.concurrent.TimeUnit-

swankjesse on Jul 26, 2018
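
For reference, a ping interval is set on the client builder. The fragment below is a minimal configuration sketch that needs the OkHttp library on the classpath; the factory class name is hypothetical and the 1-second interval is only the value tried in this thread, not a recommendation.

```java
import java.util.concurrent.TimeUnit;
import okhttp3.OkHttpClient;

/** Hypothetical helper that builds a client which pings its HTTP/2 connections. */
final class PingingClientFactory {
  private PingingClientFactory() {}

  static OkHttpClient create() {
    return new OkHttpClient.Builder()
        // Send a ping at this interval; a connection whose pong does not arrive is failed.
        .pingInterval(1, TimeUnit.SECONDS)
        .build();
  }
}
```

Pings cost extra traffic and keep the radio awake on mobile devices, so a longer interval may be the better trade-off for many apps.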

it clears out on the second call. It looks like what happens is the pool gets zombie connections. Next time you grab one of the zombies out of the pool, it throws that exception but is removed. The original bug was that the zombies got stuck in the pool. That said, this isn’t great behavior either, so we’ve just left the pool size at zero…

kenyee on Jul 25, 2018
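
If, as described above, a zombie connection is dropped after it fails once, then retrying a timed-out call a single time would be one way to hide that first failure. The helper below is a hypothetical, library-independent sketch of that idea; it is not part of OkHttp and only makes sense for requests that are safe to repeat.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

/** Hypothetical helper: runs an attempt and repeats it once after a socket timeout. */
final class RetryOnceOnTimeout {
  private RetryOnceOnTimeout() {}

  static <T> T call(Callable<T> attempt) throws Exception {
    try {
      return attempt.call();
    } catch (SocketTimeoutException firstFailure) {
      // Assumption: the failed attempt caused the dead connection to leave the pool,
      // so the second attempt is expected to run on a fresh connection.
      return attempt.call();
    }
  }
}
```

A caller would wrap the blocking request, for example RetryOnceOnTimeout.call(() -> client.newCall(request).execute()). The worst case then waits for two full timeouts before failing.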

Any updates for a fix?

muhammadsr on Feb 7, 2018

@alessandrojp do you know if the ExoPlayer team is aware of this issue? We’ve run into it only with exoplayer as well.

natez0r on Sep 19, 2017

So our attempts to write to the socket are failing silently? Might need to steal the automatic pings that we added for web sockets.

swankjesse on Feb 1, 2017