java-operator-sdk: Operator crashing after some time

Hi, we have an operator written using this SDK, and the operator pod is restarting every few hours with the exception below:

2020-09-16 07:51:39,953 i.f.k.c.d.i.WatchConnectionManager [DEBUG] Current reconnect backoff is 1000 milliseconds (T0)
2020-09-16 07:51:40,953 i.f.k.c.d.i.WatchConnectionManager [DEBUG] Connecting websocket ... io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@71e4b308
2020-09-16 07:51:41,003 i.f.k.c.d.i.WatchConnectionManager [DEBUG] WebSocket successfully opened
2020-09-16 07:51:41,018 c.g.c.o.p.EventScheduler       [ERROR] Error:
io.fabric8.kubernetes.client.KubernetesClientException: too old resource version: 22472056 (22832853)
	at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onMessage(WatchConnectionManager.java:257)
	at okhttp3.internal.ws.RealWebSocket.onReadMessage(RealWebSocket.java:323)
	at okhttp3.internal.ws.WebSocketReader.readMessageFrame(WebSocketReader.java:219)
	at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:105)
	at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:274)
	at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:214)
	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203)
	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)

The code I am using is:

        KubernetesClient client = new DefaultKubernetesClient();
        Operator operator = new Operator(client);
        operator.registerController(new KafkaTopicController(client));

Am I using it wrong?

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 16

Most upvoted comments

@PookiPok @SaikiranDaripelli restarting the controller is basically the workaround (it restarts, but at least the system does not stop working) 😦

We can try to improve on this in the current version, but we are working on a big change right now, and it will be easier to fix there.
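
For context, "too old resource version" is what the API server reports (HTTP 410 Gone) when a watch asks to resume from a resourceVersion it no longer retains, and the usual recovery is to list again and watch from the fresh version instead of letting the process die. A minimal sketch of that loop is below. It deliberately uses no fabric8 or SDK classes: `ResourceSource`, `GoneException` and `ResilientWatcher` are hypothetical names made up for illustration, so treat it as an outline of the idea, not as how the SDK does or will do it.

```java
import java.util.function.Consumer;

/** Hypothetical stand-in for the API server; not a fabric8 or SDK type. */
interface ResourceSource {
    /** Lists the current state and returns the resourceVersion to watch from. */
    long listLatestResourceVersion();

    /** Streams events starting at the given version; throws if that version is too old. */
    void watchFrom(long resourceVersion, Consumer<String> onEvent) throws GoneException;
}

/** Hypothetical exception modelling the HTTP 410 "too old resource version" reply. */
class GoneException extends Exception {
    GoneException(String message) {
        super(message);
    }
}

/** Restarts the watch from a fresh resourceVersion instead of crashing. */
class ResilientWatcher {
    private final ResourceSource source;
    private final int maxRestarts;

    ResilientWatcher(ResourceSource source, int maxRestarts) {
        this.source = source;
        this.maxRestarts = maxRestarts;
    }

    /** Runs the watch until it ends normally and returns how many restarts were needed. */
    int run(long startVersion, Consumer<String> onEvent) {
        long version = startVersion;
        int restarts = 0;
        while (true) {
            try {
                source.watchFrom(version, onEvent);
                return restarts; // the watch ended without an error
            } catch (GoneException e) {
                if (restarts >= maxRestarts) {
                    throw new IllegalStateException("giving up after " + restarts + " restarts", e);
                }
                // The old version is gone: list again and resume from the new one.
                version = source.listLatestResourceVersion();
                restarts++;
            }
        }
    }
}
```

The restart cap is there only so a persistent failure surfaces as an error rather than an endless loop. Events that happened between the stale version and the re-list are not replayed, so a controller using this approach would have to reconcile from the listed state.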

@SaikiranDaripelli this could be done, though it would be nicer if we could do it transparently. In the case you are suggesting, we should probably provide some interface for getting the latest generation from the resource (the name of the field can differ between users). So this is definitely one of the ways to go.
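
To make the interface idea a bit more concrete, here is one possible shape, assuming the goal is to skip events whose generation has already been handled. Everything in it is hypothetical: `GenerationFilter` is an invented name, and the accessor is just a `ToLongFunction` supplied by the user so that the field can be called whatever they like. It is a sketch of the idea only, not an SDK API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

/** Hypothetical helper: remembers the last generation handled for each resource. */
class GenerationFilter<R> {
    private final ToLongFunction<R> generationOf; // user-supplied accessor for the generation field
    private final Map<String, Long> lastSeen = new HashMap<>();

    GenerationFilter(ToLongFunction<R> generationOf) {
        this.generationOf = generationOf;
    }

    /**
     * Returns true when the resource carries a generation newer than the last
     * one recorded for this uid, and records it; returns false otherwise.
     */
    synchronized boolean shouldProcess(String uid, R resource) {
        long generation = generationOf.applyAsLong(resource);
        Long previous = lastSeen.get(uid);
        if (previous != null && previous >= generation) {
            return false; // already handled this generation or a later one
        }
        lastSeen.put(uid, generation);
        return true;
    }
}
```

A controller could then construct it with something like `new GenerationFilter<>(r -> r.getMetadata().getGeneration())` when the resource exposes the standard metadata field, or with any other accessor when the generation lives elsewhere. The in-memory map is lost on restart, so the first event after a restart is always processed.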

We will take a look after the changes we are currently working on.