java-operator-sdk: Operator should trigger the error state of the CR when deserialization fails

Bug Report

When deserialization of the CR fails, the operator should go into an error state (eventually retrying the reconcile loop and possibly updating the status with the error).

What did you do?

An unrecognized field in a CR causes the operator to fail deserialization, but the operator stays in the running state.

What did you expect to see?

The Operator would update the error status of the CR or, at a minimum, crash, since an unhandled exception has been thrown.

What did you see instead? Under which circumstances?

The Operator keeps running. It should at least go into CrashLoopBackOff.

Environment

Kubernetes cluster type:

minikube

$ Mention java-operator-sdk version from pom.xml file

Quarkus SDK 3.0.7

$ java -version

Java 11

Reproduction

kubectl apply -f https://raw.githubusercontent.com/keycloak/keycloak-k8s-resources/18.0.0/kubernetes/keycloaks.k8s.keycloak.org-v1.yml
kubectl apply -f https://raw.githubusercontent.com/keycloak/keycloak-k8s-resources/18.0.0/kubernetes/keycloakrealmimports.k8s.keycloak.org-v1.yml
kubectl apply -f https://raw.githubusercontent.com/keycloak/keycloak-k8s-resources/18.0.0/kubernetes/kubernetes.yml

and then kubectl apply this resource:

apiVersion: k8s.keycloak.org/v2alpha1
kind: Keycloak
metadata:
  name: example-keycloak
spec:
  disableDefaultIngress: false
  hostname: INSECURE-DISABLE
  tlsSecret: INSECURE-DISABLE

Resulting stack trace:

2022-04-21 15:05:31,315 ERROR [io.fab.kub.cli.dsl.int.AbstractWatchManager] (OkHttp https://10.96.0.1/...) Invalid event type: java.lang.IllegalArgumentException: Failed to deserialize WatchEvent
	at io.fabric8.kubernetes.client.dsl.internal.AbstractWatchManager.contextAwareWatchEventDeserializer(AbstractWatchManager.java:253)
	at io.fabric8.kubernetes.client.dsl.internal.AbstractWatchManager.readWatchEvent(AbstractWatchManager.java:259)
	at io.fabric8.kubernetes.client.dsl.internal.AbstractWatchManager.onMessage(AbstractWatchManager.java:284)
	at io.fabric8.kubernetes.client.dsl.internal.WatcherWebSocketListener.onMessage(WatcherWebSocketListener.java:68)
	at io.fabric8.kubernetes.client.okhttp.OkHttpWebSocketImpl$BuilderImpl$1.onMessage(OkHttpWebSocketImpl.java:97)
	at okhttp3.internal.ws.RealWebSocket.onReadMessage(RealWebSocket.java:322)
	at okhttp3.internal.ws.WebSocketReader.readMessageFrame(WebSocketReader.java:219)
	at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:105)
	at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:273)
	at okhttp3.internal.ws.RealWebSocket$1.onResponse(RealWebSocket.java:209)
	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:174)
	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException: Unrecognized field "disableDefaultIngress" (class org.keycloak.operator.v2alpha1.crds.KeycloakSpec), not marked as ignorable (7 known properties: "defaultIngressDisabled", "serverConfiguration", "unsupported", "image", "instances", "hostname", "tlsSecret"])
 at [Source: UNKNOWN; byte offset: #UNKNOWN] (through reference chain: org.keycloak.operator.v2alpha1.crds.Keycloak["spec"]->org.keycloak.operator.v2alpha1.crds.KeycloakSpec["disableDefaultIngress"])
	at com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException.from(UnrecognizedPropertyException.java:61)
	at com.fasterxml.jackson.databind.DeserializationContext.handleUnknownProperty(DeserializationContext.java:1127)
	at com.fasterxml.jackson.databind.deser.std.StdDeserializer.handleUnknownProperty(StdDeserializer.java:1989)
	at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownProperty(BeanDeserializerBase.java:1700)
	at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownVanilla(BeanDeserializerBase.java:1678)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:319)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:176)
	at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129)
	at io.fabric8.kubernetes.client.utils.serialization.SettableBeanPropertyDelegate.deserializeAndSet(SettableBeanPropertyDelegate.java:131)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:313)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:176)
	at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:322)
	at com.fasterxml.jackson.databind.ObjectMapper._readValue(ObjectMapper.java:4650)
	at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2831)
	at com.fasterxml.jackson.databind.ObjectMapper.treeToValue(ObjectMapper.java:3295)
	at io.fabric8.kubernetes.client.dsl.internal.AbstractWatchManager.contextAwareWatchEventDeserializer(AbstractWatchManager.java:248)
	... 14 more

but the operator is still running.

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 25

Most upvoted comments

Sorry for the late reply,

@csviri do you have any link regarding the usage of events solely for “cluster state” events? Super interested in understanding this more!

For this specific case I think that this is a decent UX:

  • a CR is created by the user
  • an unexpected exception is thrown in the operator informer when trying to access the resource
  • an event is emitted on the CR marking it as “failed” (or something along these lines)

In this way, people checking the CR itself will have the information about why the status is not getting updated.
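The flow proposed above could be sketched roughly as follows. This is a library-free illustration, not java-operator-sdk or fabric8 API: `FailureEvent`, `EventSink`, `accessResource` and the reason string `DeserializationFailed` are all hypothetical names, and the sink merely stands in for whatever would actually post an event against the CR.

```java
import java.util.ArrayList;
import java.util.List;

/** Library-free sketch; every name here is hypothetical, not SDK or fabric8 API. */
final class DeserializationFailureEvents {

    /** Minimal stand-in for an event attached to a custom resource. */
    static final class FailureEvent {
        final String resourceName;
        final String type;    // "Warning" is used to flag the CR as failed
        final String reason;  // short machine-readable reason (assumed value)
        final String message; // human-readable detail taken from the root cause

        FailureEvent(String resourceName, String type, String reason, String message) {
            this.resourceName = resourceName;
            this.type = type;
            this.reason = reason;
            this.message = message;
        }
    }

    /** Stand-in for whatever would publish the event to the cluster. */
    interface EventSink {
        void emit(FailureEvent event);
    }

    /** Builds an event from the root cause, which carries the most specific message. */
    static FailureEvent toEvent(String resourceName, Throwable error) {
        Throwable root = error;
        while (root.getCause() != null && root.getCause() != root) {
            root = root.getCause();
        }
        return new FailureEvent(resourceName, "Warning", "DeserializationFailed", root.getMessage());
    }

    /** Runs the typed access and turns an unexpected exception into an event. */
    static void accessResource(String resourceName, Runnable typedAccess, EventSink sink) {
        try {
            typedAccess.run();
        } catch (RuntimeException e) {
            sink.emit(toEvent(resourceName, e));
        }
    }

    public static void main(String[] args) {
        List<FailureEvent> emitted = new ArrayList<>();
        // Simulate the failure from the stack trace: a wrapped "unrecognized field" error.
        accessResource("example-keycloak", () -> {
            throw new IllegalArgumentException("Failed to deserialize WatchEvent",
                new IllegalStateException("Unrecognized field \"disableDefaultIngress\""));
        }, emitted::add);
        for (FailureEvent e : emitted) {
            System.out.println(e.type + " " + e.reason + " on " + e.resourceName + ": " + e.message);
        }
    }
}
```

Using the root cause for the message is a design choice in this sketch: the outer exception only says that the watch event failed to deserialize, while the inner one names the offending field.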

👍 I just tried to disable the stale condition 🙂

I think what @andreaTP means is that, when an error occurs during deserialization of a resource, we could try to deserialize it to GenericKubernetesResource instead, and the error handler could work with that from that point on.

@csviri instantiating an “untyped” (e.g. using GenericKubernetesResource) Informer might be one way of doing it.
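To make the fallback idea concrete, here is a minimal, library-free sketch. It does not use Jackson or fabric8: the hand-written `bindStrict` only imitates strict binding that rejects unknown fields, a plain `Map` stands in for GenericKubernetesResource, and `TypedSpec`, `Result` and `bindWithFallback` are hypothetical names with a deliberately reduced set of known fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Library-free sketch of "typed first, generic on failure"; all names are hypothetical. */
final class GenericFallback {

    /** Reduced typed spec; the real KeycloakSpec has more properties. */
    static final class TypedSpec {
        final String hostname;
        final String tlsSecret;

        TypedSpec(String hostname, String tlsSecret) {
            this.hostname = hostname;
            this.tlsSecret = tlsSecret;
        }
    }

    /** Holds the typed view when binding succeeded, and always the raw view. */
    static final class Result {
        final TypedSpec typed;         // null when strict binding failed
        final Map<String, Object> raw; // untyped view, always available
        final String error;            // null when strict binding succeeded

        Result(TypedSpec typed, Map<String, Object> raw, String error) {
            this.typed = typed;
            this.raw = raw;
            this.error = error;
        }

        boolean isTyped() {
            return typed != null;
        }
    }

    private static final Set<String> KNOWN_FIELDS = Set.of("hostname", "tlsSecret");

    /** Imitates strict binding: any unknown key is rejected with an exception. */
    static TypedSpec bindStrict(Map<String, Object> raw) {
        for (String key : raw.keySet()) {
            if (!KNOWN_FIELDS.contains(key)) {
                throw new IllegalArgumentException("Unrecognized field \"" + key + "\"");
            }
        }
        return new TypedSpec(
            Objects.toString(raw.get("hostname"), null),
            Objects.toString(raw.get("tlsSecret"), null));
    }

    /** Tries the typed binding and keeps the raw map for the error handler if it fails. */
    static Result bindWithFallback(Map<String, Object> raw) {
        try {
            return new Result(bindStrict(raw), raw, null);
        } catch (IllegalArgumentException e) {
            return new Result(null, raw, e.getMessage());
        }
    }

    public static void main(String[] args) {
        Map<String, Object> spec = new LinkedHashMap<>();
        spec.put("disableDefaultIngress", false);
        spec.put("hostname", "INSECURE-DISABLE");
        spec.put("tlsSecret", "INSECURE-DISABLE");

        Result result = bindWithFallback(spec);
        if (result.isTyped()) {
            System.out.println("typed spec for host " + result.typed.hostname);
        } else {
            // The error handler still knows which resource and which fields it is dealing with.
            System.out.println("fallback to raw view: " + result.error + ", fields=" + result.raw.keySet());
        }
    }
}
```

The point of the sketch is that the failure path still yields an object the error handler can inspect, rather than losing the resource entirely when the typed binding throws.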