kubernetes-client: Connections are leaked when java.net.ProtocolException occurs while a WebSocket establishing

Hi folks,

I’m from Eclipse Che Team, we use fabric8-kubernetes client to work with Kubernetes/OpenShift clusters.

I discovered an issue with connections leaks when java.net.ProtocolException occurs while a WebSocket establishing. An example of it is trying to execute a command in a non-existing pod.

It can be stably reproduced on Mock KubernetesServer or real K8s/OS cluster. I have prepared a test for that in my fork https://github.com/sleshchenko/kubernetes-client/blob/connectionLeaksTest/kubernetes-tests/src/test/java/io/fabric8/kubernetes/client/mock/ExecTest.java

#testConnectionLeaksAfterExecutingCommandInNonExistingPodOnMockServer for reproducing on mock kubernetes server.

#testConnectionLeaksAfterExecutingCommandInNonExistingPodOnRealServer can be used for reproducing on real Kubernetes/OpenShift server. Note that MASTER_URL and OAUTH_TOKEN should be configured.

Note that System.gc(); just accelerates appearance connection was leaked messages and this bug can be reproduced without invocation of garbage collector.

So, HOW TO REPRODUCE IT:

  1. Clone fork
  2. Run test, for example, #testConnectionLeaksAfterExecutingCommandInNonExistingPodOnMockServer.
  3. Wait ~10 minutes. For me after 5th iteration(after 5 minutes) logs appears.
...
10:57:11.354 [main] INFO  io.fabric8.kubernetes.client.mock.WatchTest - #######################################################
10:57:11.354 [main] INFO  io.fabric8.kubernetes.client.mock.WatchTest - ####                                               ####
10:57:11.354 [main] INFO  io.fabric8.kubernetes.client.mock.WatchTest - ####                   Iteration #  5              ####
10:57:11.354 [main] INFO  io.fabric8.kubernetes.client.mock.WatchTest - ####                                               ####
10:57:11.354 [main] INFO  io.fabric8.kubernetes.client.mock.WatchTest - #######################################################
10:57:11.354 [main] INFO  io.fabric8.kubernetes.client.mock.WatchTest - Try to execute 5 echo command in non-existing pod
Jun 04, 2018 10:57:11 AM okhttp3.mockwebserver.MockWebServer$3 processOneRequest
INFO: MockWebServer[53229] received request: GET /api/v1/namespaces/test/pods/non-existing/exec?command=sh&command=-c&command=echo%20hello&stderr=true HTTP/1.1 and responded: HTTP/1.1 404 Client Error
Jun 04, 2018 10:57:11 AM okhttp3.mockwebserver.MockWebServer$3 processOneRequest
INFO: MockWebServer[53229] received request: GET /api/v1/namespaces/test/pods/non-existing/exec?command=sh&command=-c&command=echo%20hello&stderr=true HTTP/1.1 and responded: HTTP/1.1 404 Client Error
10:57:11.359 [OkHttp http://localhost:53229/...] ERROR io.fabric8.kubernetes.client.dsl.internal.ExecWebSocketListener - Exec Failure: HTTP:404. Message:
java.net.ProtocolException: Expected HTTP 101 response but was '404 Client Error'
	at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:219) [okhttp-3.9.1.jar:?]
	at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:186) [okhttp-3.9.1.jar:?]
	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:153) [okhttp-3.9.1.jar:?]
	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) [okhttp-3.9.1.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_172]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_172]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_172]
10:57:11.360 [main] ERROR io.fabric8.kubernetes.client.mock.WatchTest - ERROR: Exception occurred: 
...
10:57:11.373 [main] INFO  io.fabric8.kubernetes.client.mock.WatchTest - Wait 1 minute
Jun 04, 2018 10:58:10 AM okhttp3.internal.platform.Platform log
WARNING: A connection to http://localhost:53229/ was leaked. Did you forget to close a response body? To see where this was allocated, set the OkHttpClient logger level to FINE: Logger.getLogger(OkHttpClient.class.getName()).setLevel(Level.FINE);
Jun 04, 2018 10:58:10 AM okhttp3.internal.platform.Platform log
WARNING: A connection to http://localhost:53229/ was leaked. Did you forget to close a response body? To see where this was allocated, set the OkHttpClient logger level to FINE: Logger.getLogger(OkHttpClient.class.getName()).setLevel(Level.FINE);
Jun 04, 2018 10:58:10 AM okhttp3.internal.platform.Platform log
...

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Comments: 24 (19 by maintainers)

Commits related to this issue

Most upvoted comments

I have tried using cancel() calls at the right places in ExecWebSocketListener and this did not work. So much for that theory!

See #1198. Again, this is mainly for discussion. If this does no (further) harm, then I’ll put together a PR to do an analogous set of operations in the WatchConnectionManager.

Working on a PR for further discussion.

@hrishin Good news, try-with-resources statement works as it supposed to do. So, obviously closing of stream/connection doesn’t fix an issue.