kubernetes: Watcher error loop "Unable to decode an event from the watch stream" from bad CR manifests

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:

We are developing a controller that uses a cache.SharedIndexInformer to watch events for our custom resource (a rough sketch of the informer setup follows the log excerpt below). However, when kubectl create -f manifest.yaml is run with a bad manifest, specifically one that fails to unmarshal into the CR data structure due to mismatched field data types, the watcher falls into a tight loop and rapidly spews the following errors:

E1229 01:59:53.317248    3559 streamwatcher.go:109] Unable to decode an event from the watch stream: unable to decode watch event: v1alpha1.Foo: Spec: v1alpha1.FooSpec: DeploymentName: ReadString: expects " or n, but found [, error found in #10 byte of ...|entName":["example-f|..., bigger context ...|1e7-b3e3-4e0b1a127ace"},"spec":{"deploymentName":["example-foo"],"replicas":1}}|...
E1229 01:59:53.319798    3559 streamwatcher.go:109] Unable to decode an event from the watch stream: unable to decode watch event: v1alpha1.Foo: Spec: v1alpha1.FooSpec: DeploymentName: ReadString: expects " or n, but found [, error found in #10 byte of ...|entName":["example-f|..., bigger context ...|1e7-b3e3-4e0b1a127ace"},"spec":{"deploymentName":["example-foo"],"replicas":1}}|...

When the watch gets into this state, the controller ceases to function, since its informer no longer delivers any events from the watch. As soon as the malformed CR is deleted, these errors stop and the informer starts delivering events again.
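For context, the informer wiring looks roughly like the sketch below. This is not our exact code: it is a minimal setup against the sample-controller's Foo type using client-go's cache package, and the newFooInformer name is just for illustration.

package controller

import (
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"

	samplev1alpha1 "k8s.io/sample-controller/pkg/apis/samplecontroller/v1alpha1"
)

// newFooInformer builds a SharedIndexInformer that lists and watches Foo
// objects in all namespaces. The watch connection it opens is the one that
// ends up in the decode error loop when a malformed CR exists.
func newFooInformer(restClient rest.Interface) cache.SharedIndexInformer {
	lw := cache.NewListWatchFromClient(restClient, "foos", metav1.NamespaceAll, fields.Everything())

	informer := cache.NewSharedIndexInformer(lw, &samplev1alpha1.Foo{}, 30*time.Second, cache.Indexers{})
	informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    func(obj interface{}) { fmt.Println("add:", obj) },
		UpdateFunc: func(old, new interface{}) { fmt.Println("update:", new) },
		DeleteFunc: func(obj interface{}) { fmt.Println("delete:", obj) },
	})
	return informer
}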

What you expected to happen:

The watcher should not fall into a tight error loop that prevents the watch from functioning. The malformed CR should be ignored.

How to reproduce it (as minimally and precisely as possible): This is reproducible using the sample-controller example (https://github.com/kubernetes/sample-controller):

$ go get k8s.io/sample-controller
$ go run *.go -kubeconfig=$HOME/.kube/config

Then kubectl create -f the following malformed manifest:

apiVersion: samplecontroller.k8s.io/v1alpha1
kind: Foo
metadata:
  name: example-foo
spec:
  # deploymentName is supposed to be a string and not a list
  deploymentName: [example-foo]
  replicas: 1

Anything else we need to know?:
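The decode fails because each watch event's object must unmarshal into the typed Foo struct. In sample-controller, FooSpec declares deploymentName as a plain string (shown roughly below), so the YAML list in the manifest above cannot be decoded into it:

type FooSpec struct {
	DeploymentName string `json:"deploymentName"`
	Replicas       *int32 `json:"replicas"`
}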

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.5", GitCommit:"cce11c6a185279d037023e02ac5249e14daa22bf", GitTreeState:"clean", BuildDate:"2017-12-07T18:09:00Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.0", GitCommit:"0b9efaeb34a2fc51ff8e4d34ad9bc6375459c4a4", GitTreeState:"clean", BuildDate:"2017-11-29T22:43:34Z", GoVersion:"go1.9.1", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: minikube version: v0.24.1 on macOS
  • OS (e.g. from /etc/os-release): macOS
  • Kernel (e.g. uname -a): Darwin Kernel Version 17.2.0: Fri Sep 29 18:27:05 PDT 2017; root:xnu-4570.20.62~3/RELEASE_X86_64
  • Install tools: minikube

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Reactions: 1
  • Comments: 15 (5 by maintainers)

Most upvoted comments

In case anyone else needs the workaround for this, this was the change necessary to protect our controller from bad manifests: https://github.com/argoproj/argo/commit/8d96ea7b1b1ba843eb19a0632bc503d816ab9ef3
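The linked commit is Argo-specific. For the general shape of that kind of protection, here is a sketch written against a more recent client-go; the fooGVR and newTolerantFooInformer names are made up for illustration. The idea is to watch the resource as Unstructured, so the watch stream always decodes, and then convert each object to the typed struct per event, logging and skipping anything that fails conversion instead of breaking the whole watch.

package controller

import (
	"log"
	"time"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/dynamic/dynamicinformer"
	"k8s.io/client-go/tools/cache"

	samplev1alpha1 "k8s.io/sample-controller/pkg/apis/samplecontroller/v1alpha1"
)

var fooGVR = schema.GroupVersionResource{
	Group:    "samplecontroller.k8s.io",
	Version:  "v1alpha1",
	Resource: "foos",
}

// newTolerantFooInformer watches Foos as unstructured objects and converts
// them to the typed Foo struct in the handler, so a single malformed CR
// only produces a log line instead of a broken watch.
func newTolerantFooInformer(client dynamic.Interface) cache.SharedIndexInformer {
	factory := dynamicinformer.NewDynamicSharedInformerFactory(client, 30*time.Second)
	informer := factory.ForResource(fooGVR).Informer()

	informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			u, ok := obj.(*unstructured.Unstructured)
			if !ok {
				return
			}
			foo := &samplev1alpha1.Foo{}
			// Conversion fails for malformed specs; log and skip this object.
			if err := runtime.DefaultUnstructuredConverter.FromUnstructured(u.Object, foo); err != nil {
				log.Printf("ignoring malformed Foo %s/%s: %v", u.GetNamespace(), u.GetName(), err)
				return
			}
			// ... enqueue foo for processing ...
		},
	})
	return informer
}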