pipeline: Pipelines with a finally clause in them make tekton-pipelines-webhook spam conversion errors between v1alpha1 and v1beta1
Expected Behavior
Pipelines with finally: in them should not make the tekton-pipelines-webhook spam conversion errors.
Actual Behavior
When applying any Pipeline that includes finally: in it, the tekton-pipelines-webhook starts spamming these conversion errors:
{
"level": "info",
"logger": "webhook",
"caller": "webhook/conversion.go:42",
"msg": "Webhook ServeHTTP request=&http.Request{Method:\"POST\", URL:(*url.URL)(0xc000714280), Proto:\"HTTP/1.1\", ProtoMajor:1, ProtoMinor:1, Header:http.Header{\"Accept\":[]string{\"application/json, */*\"}, \"Accept-Encoding\":[]string{\"gzip\"}, \"Content-Length\":[]string{\"2019\"}, \"Content-Type\":[]string{\"application/json\"}, \"User-Agent\":[]string{\"kube-apiserver-admission\"}}, Body:(*http.body)(0xc00084d7c0), GetBody:(func() (io.ReadCloser, error))(nil), ContentLength:2019, TransferEncoding:[]string(nil), Close:false, Host:\"tekton-pipelines-webhook.tekton-pipelines.svc:443\", Form:url.Values(nil), PostForm:url.Values(nil), MultipartForm:(*multipart.Form)(nil), Trailer:http.Header(nil), RemoteAddr:\"100.109.0.3:60412\", RequestURI:\"/resource-conversion?timeout=30s\", TLS:(*tls.ConnectionState)(0xc000788370), Cancel:(<-chan struct {})(nil), Response:(*http.Response)(nil), ctx:(*context.cancelCtx)(0xc00084d800)}",
"commit": "a162a1d"
}
{
"level": "info",
"logger": "webhook",
"caller": "conversion/conversion.go:133",
"msg": "Converting [kind=Pipeline group=tekton.dev version=v1beta1] to version tekton.dev/v1alpha1",
"commit": "a162a1d",
"uid": "b2b3d9a8-e22b-4724-8105-61ccabaa9bf5",
"desiredAPIVersion": "tekton.dev/v1alpha1",
"inputType": "[kind=Pipeline group=tekton.dev version=v1beta1]",
"outputType": "[kind=Pipeline group=tekton.dev version=v1alpha1]",
"hubType": "[kind=Pipeline group=tekton.dev version=v1alpha1]",
"knative.dev/key": "tekton-pipelines/clone-cleanup-workspace"
}
{
"level": "error",
"logger": "webhook",
"caller": "conversion/conversion.go:59",
"msg": "Conversion failed: conversion failed to version v1alpha1 for type [kind=Pipeline group=tekton.dev version=v1beta1] - the specified field/section is not available in v1alpha1",
"commit": "a162a1d",
"uid": "b2b3d9a8-e22b-4724-8105-61ccabaa9bf5",
"desiredAPIVersion": "tekton.dev/v1alpha1",
"stacktrace": "github.com/tektoncd/pipeline/vendor/knative.dev/pkg/webhook/resourcesemantics/conversion.(*reconciler).Convert\n\tgithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/webhook/resourcesemantics/conversion/conversion.go:59\ngithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/webhook.conversionHandler.func1\n\tgithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/webhook/conversion.go:61\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2012\nnet/http.(*ServeMux).ServeHTTP\n\tnet/http/server.go:2387\ngithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/webhook.(*Webhook).ServeHTTP\n\tgithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/webhook/webhook.go:259\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2807\nnet/http.(*conn).serve\n\tnet/http/server.go:1895"
}
Steps to Reproduce the Problem
- Create a new minikube cluster
- Apply the latest tekton-pipelines release: 0.16.0
- Apply any Pipeline with a finally: clause in it. I used the example you have available here: tekton-pipelinerun-with-final-task
- Check the logs of the tekton-pipelines-webhook pod and observe the errors above, spamming forever.
- Remove the YAML applied in step 3 and the log spamming stops.
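To confirm the spam starts and stops as described in the steps above, it can help to count the failures instead of eyeballing the log. The following is a minimal sketch, assuming the webhook emits one JSON object per line with the level, msg and desiredAPIVersion fields seen in the excerpt above; the function name count_conversion_failures and the sample lines are made up for illustration.

```python
import json
from collections import Counter


def count_conversion_failures(log_lines):
    """Count 'Conversion failed' error entries, grouped by desired API version.

    Assumes one JSON object per log line (an assumption about the log format).
    """
    failures = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore anything that is not a JSON log entry
        if not isinstance(entry, dict):
            continue
        is_error = entry.get("level") == "error"
        is_conversion = str(entry.get("msg", "")).startswith("Conversion failed")
        if is_error and is_conversion:
            failures[entry.get("desiredAPIVersion", "unknown")] += 1
    return failures


# Hypothetical sample, shortened from the log entries quoted in this issue.
sample = [
    '{"level": "info", "logger": "webhook", "msg": "Converting [kind=Pipeline group=tekton.dev version=v1beta1] to version tekton.dev/v1alpha1"}',
    '{"level": "error", "logger": "webhook", "msg": "Conversion failed: conversion failed to version v1alpha1", "desiredAPIVersion": "tekton.dev/v1alpha1"}',
    "not a json line",
]

print(dict(count_conversion_failures(sample)))
```

The same function can be fed the lines of the webhook pod's log read from standard input; a count that keeps growing while the Pipeline exists and stops after step 5 matches the behavior reported here.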
Additional Info
Have also tested this on our AKS cluster in Azure, same results as in minikube. Versions tested that showed same behavior: 0.15.0, 0.15.1, 0.15.2 and 0.16.0
- Kubernetes version:
  Output of kubectl version:
$ k version
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.2", GitCommit:"59603c6e503c87169aea6106f57b9f242f64df89", GitTreeState:"clean", BuildDate:"2020-01-18T23:30:10Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.0", GitCommit:"e19964183377d0ec2052d1f1fa930c4d7575bd50", GitTreeState:"clean", BuildDate:"2020-08-26T14:23:04Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}
- Tekton Pipeline version:
  Output of tkn version or kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'
$ tkn version
Client version: 0.11.0
Pipeline version: v0.16.0
Triggers version: unknown
About this issue
- State: closed
- Created 4 years ago
- Comments: 72 (48 by maintainers)
The kube api server is repeatedly attempting to ListAndWatch v1alpha1 versions. Here are the logs from my api server in kind:
@vdemeester @pritidesai should we maybe not return a conversion error when moving down to v1alpha1? It doesn’t seem to be what the k8s api server is expecting / gracefully handling?
Edit to add: We could simply discard the finally section when down-converting to v1alpha1?
Edit to add: Here’s the kubernetes version I’m running in my kind cluster:
This happened in a brand new kind cluster on a brand new VM. It definitely looks like either we shouldn’t be returning the error at all or we’re somehow violating expectations that kubernetes has for the way conversions work.
All the time non-stop spam 😄. You will not miss it if you hit this issue.
@pritidesai We are using Tekton on OpenShift as part of the Red Hat OpenShift Pipeline Operator. The latest version of this operator is using Tekton Pipelines v0.16.3. For us it’s not possible to upgrade to the latest available Tekton Pipeline version as we are using the Red Hat implementation of Tekton.
@Yannig this would remove support for v1alpha1 though (as it wouldn’t be served anymore…). I don’t see any reason why it would be magically fixed in 0.19 😓
thanks @Yannig appreciate the workaround, but wouldn’t work for folks using v1alpha1 resources along with v1beta1.
@vdemeester any updates on this one? will this be magically fixed in 0.19? 🤞
Nice find @pritidesai ! I think that warning is enough evidence to me that we should simply be removing Finally from the v1alpha1 resources we convert down to. In a way this does make sense - anyone relying on v1alpha1 should have no expectation that their Finally entries would be available since the feature simply didn’t exist then.
It also sounds like this is only an issue because kubernetes is caching the CRD apiVersions we list in our CRDs. So at some point we may remove v1alpha1 from our CRDs and at that time we should expect that kubernetes will no longer be hitting our conversion endpoints for that version.
Having written all that it seems like https://github.com/tektoncd/pipeline/pull/3757 is the right way forward. I’ll update the tests and get that PR into mergeable shape. Thanks a lot for finding all that info!
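The idea of dropping the finally section on the way down to v1alpha1, rather than returning an error, can be sketched roughly as follows. This is only an illustration of the approach discussed in the thread, not the code from the linked PR (Tekton itself is written in Go); the function name down_convert_pipeline and the task names in the sample are hypothetical.

```python
import copy


def down_convert_pipeline(v1beta1_pipeline):
    """Return a v1alpha1-shaped copy of a Pipeline with spec.finally dropped.

    Illustration only: a real conversion has to handle every field that
    differs between the two versions, not just finally.
    """
    converted = copy.deepcopy(v1beta1_pipeline)  # never mutate the stored object
    converted["apiVersion"] = "tekton.dev/v1alpha1"
    # finally did not exist in v1alpha1, so it is silently discarded
    # instead of failing the conversion.
    converted.get("spec", {}).pop("finally", None)
    return converted


# Hypothetical Pipeline; only the Pipeline name is taken from the log above.
pipeline = {
    "apiVersion": "tekton.dev/v1beta1",
    "kind": "Pipeline",
    "metadata": {"name": "clone-cleanup-workspace"},
    "spec": {
        "tasks": [{"name": "clone", "taskRef": {"name": "example-clone-task"}}],
        "finally": [{"name": "cleanup", "taskRef": {"name": "example-cleanup-task"}}],
    },
}

print(down_convert_pipeline(pipeline)["spec"])
```

The trade-off is the one described above: a client reading the v1alpha1 view loses the finally entries, but the object stored as v1beta1 is untouched and the API server no longer receives a conversion error it retries forever.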
This seems like a really important point. @ljupchokotev , @wouter2397 , @coryrc - are any of you able to reproduce this? If you stop the pipelines controller deployment do you continue to see the error messages?
At this point I’m not quite sure how to debug further if we’re not able to nail down some clearer steps to reproduce. If the above is true re: the deployment then it seems possible that the error messages are actually being generated from something that isn’t related to the pipelines controller? It would be great to get some feedback confirming whether this is the case for other folks.
😭
Ok… it’s not tekton-pipelines-controller. I just stopped the deployment and still see the conversions. In retrospect, I should have checked this earlier 🤦‍♂️
I still don’t know what’s causing this in my case though…
We were facing this with v0.19.0 and I just updated one cluster to v0.20.1 and still see this in the webhook log
@wouter2397 I don’t have any solution yet. I am adding this to the next milestone to make sure we fix this. I will resume troubleshooting next week.
The pipelines v1alpha1 API? Users can create resources with that APIVersion but they don’t have to. As far as I know, there shouldn’t be a direct dependency on the v1alpha1 API (we use a dynamic client).
In order to correct the problem, I made the modification suggested by @freefood89. If it helps anyone, it’s a matter of making the following change in the tekton manifest:
Well, I don’t know if you know it but the issue is still there with 0.18.1. I get the same kind of spam over and over until I remove the finally statement on my pipeline.
@pritidesai because it’s two commands, it makes sense that those “age” values are different. They would also be different if you asked for the same resource version twice 😉
@freefood89 🤗 Yeah I need to dig deeper into this. I have yet to understand why the webhook or controller would go on its own to ask for a v1alpha1 or a conversion… I might be missing something obvious in the code…
Yeah, that’s what should happen: when you list pipelines.v1alpha1.tekton.dev, it will get whatever is stored and convert it “on-the-fly” to a v1alpha1 version if it can.
The “age” difference in the listing feels weird though…
So whenever you get resources from k8s the api is called with the resource:
GET /apis/GROUP/VERSION/namespaces/NAMESPACE/RESOURCETYPE/NAME
I’m guessing that’s where the desiredVersion comes from (???) I was able to replicate it with kubectl get pipelines.v1alpha1.tekton.dev where k8s will call the conversion webhook to convert my v1beta1 pipelines to v1alpha1
(I’m learning this as I go too haha)
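To make the path template above concrete, here is a small sketch that builds such a path and reads the group and version back out of it. The helper names resource_path and requested_version are invented for this example, and whether the API server derives the webhook's desiredAPIVersion exactly this way is the commenter's guess, not something the sketch proves.

```python
def resource_path(group, version, namespace, resource_type, name):
    """Build the namespaced path shape quoted above for a custom resource."""
    return f"/apis/{group}/{version}/namespaces/{namespace}/{resource_type}/{name}"


def requested_version(path):
    """Extract 'GROUP/VERSION' from a path of the form /apis/GROUP/VERSION/..."""
    parts = path.strip("/").split("/")
    if len(parts) < 3 or parts[0] != "apis":
        raise ValueError(f"not a group/version API path: {path!r}")
    return f"{parts[1]}/{parts[2]}"


# Namespace and Pipeline name are taken from the "knative.dev/key" log field.
path = resource_path(
    "tekton.dev", "v1alpha1", "tekton-pipelines", "pipelines", "clone-cleanup-workspace"
)
print(path)
print(requested_version(path))
```

Asking for pipelines.v1alpha1.tekton.dev corresponds to a v1alpha1 segment in that path, which would line up with the tekton.dev/v1alpha1 value shown in the webhook log for an object stored as v1beta1.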
@pritidesai I learned that actually v1beta1 in the CRD is whatever v1alpha1 is with:
as overrides. Apparently that’s what &version with <<: *version does. Didn’t know yaml had these capabilities
resulting in:
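For anyone else new to this YAML feature, a rough sketch of what an anchor plus merge key does: &version names a mapping, and <<: *version copies its keys into another mapping, where explicitly written keys win. The example below emulates that with plain dicts rather than parsing YAML, and the field values are hypothetical, not the actual contents of the Tekton CRD.

```python
def merge_key(base, overrides):
    """Emulate YAML's `<<: *anchor`: copy the anchored keys, explicit keys win.

    Like the YAML merge key, this is a shallow merge.
    """
    merged = dict(base)
    merged.update(overrides)
    return merged


# Equivalent YAML shape (hypothetical values):
#   versions:
#     - &version
#       name: v1alpha1
#       served: true
#       storage: false
#     - <<: *version
#       name: v1beta1
#       storage: true
v1alpha1 = {"name": "v1alpha1", "served": True, "storage": False}
v1beta1 = merge_key(v1alpha1, {"name": "v1beta1", "storage": True})

print(v1beta1)
```

Keys that are not overridden (served in this sketch) are inherited unchanged, which is why the second version entry ends up as the first one plus a handful of overrides.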
Yes, unfortunately I can’t rule that out because I am not sufficiently knowledgeable in this domain.
Sometimes I think back to 5-pages of Java stacktrace fondly…