operator-lifecycle-manager: OLM fails to install packageserver with FailedDiscoveryCheck error on GKE
Bug Report
What did you do?
$ git clone https://github.com/operator-framework/operator-lifecycle-manager.git
$ cd operator-lifecycle-manager
$ git describe
v0.18.2-18-g20ded32d
$ ./scripts/install.sh v0.18.2
What did you expect to see?
The script ./scripts/install.sh v0.18.2 should exit with a success code.
What did you see instead? Under which circumstances?
The script ./scripts/install.sh v0.18.2 fails:
customresourcedefinition.apiextensions.k8s.io/catalogsources.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/clusterserviceversions.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/installplans.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operatorconditions.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operatorgroups.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operators.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/subscriptions.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/catalogsources.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/clusterserviceversions.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/installplans.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/operatorconditions.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/operatorgroups.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/operators.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/subscriptions.operators.coreos.com condition met
namespace/olm created
namespace/operators unchanged
serviceaccount/olm-operator-serviceaccount created
clusterrole.rbac.authorization.k8s.io/system:controller:operator-lifecycle-manager unchanged
clusterrolebinding.rbac.authorization.k8s.io/olm-operator-binding-olm unchanged
deployment.apps/olm-operator created
deployment.apps/catalog-operator created
clusterrole.rbac.authorization.k8s.io/aggregate-olm-edit unchanged
clusterrole.rbac.authorization.k8s.io/aggregate-olm-view unchanged
operatorgroup.operators.coreos.com/global-operators created
operatorgroup.operators.coreos.com/olm-operators created
clusterserviceversion.operators.coreos.com/packageserver created
catalogsource.operators.coreos.com/operatorhubio-catalog created
Waiting for deployment "olm-operator" rollout to finish: 0 of 1 updated replicas are available...
deployment "olm-operator" successfully rolled out
deployment "catalog-operator" successfully rolled out
Package server phase: Installing
CSV "packageserver" failed to reach phase succeeded
Environment
- operator-lifecycle-manager version:
git SHA:
20ded32d2260a8f1eeb594b9ec2147ad0134cfc6
- Kubernetes version information:
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.0", GitCommit:"9e991415386e4cf155a24b1da15becaa390438d8", GitTreeState:"clean", BuildDate:"2020-03-25T14:58:59Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.6-gke.1400", GitCommit:"b7ccf3218bdb80cd2d66ed6186d78f18f1ef03c6", GitTreeState:"clean", BuildDate:"2021-05-05T09:19:45Z", GoVersion:"go1.15.10b5", Compiler:"gc", Platform:"linux/amd64"}
- Kubernetes cluster kind:
GKE cluster.
Possible Solution
Additional context
This script works fine against a kind-based cluster.
Here are some error messages and logs I fetched from the failing cluster:
$ kubectl get apiservices v1.packages.operators.coreos.com -o yaml | tail
versionPriority: 15
status:
conditions:
- lastTransitionTime: "2021-07-05T13:17:43Z"
message: 'failing or missing response from https://10.135.4.23:5443/apis/packages.operators.coreos.com/v1:
Get "https://10.135.4.23:5443/apis/packages.operators.coreos.com/v1": dial tcp
10.135.4.23:5443: i/o timeout'
reason: FailedDiscoveryCheck
status: "False"
type: Available
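A quick way to trace a FailedDiscoveryCheck like this back to its backing workload is to resolve the APIService to its Service and check for ready endpoints; a minimal sketch, assuming the packageserver pods carry the usual app=packageserver label:
# Find the Service backing the aggregated API (expected to be packageserver-service in the olm namespace)
$ kubectl get apiservice v1.packages.operators.coreos.com -o jsonpath='{.spec.service.namespace}/{.spec.service.name}{"\n"}'
# Check that the Service has endpoints and that the packageserver pods are Running
$ kubectl -n olm get endpoints,pods -l app=packageserver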
$ kubectl get csv -n olm packageserver
NAME DISPLAY VERSION REPLACES PHASE
packageserver Package Server 0.18.2 Installing
$ kubectl proxy &
$ curl http://localhost:8001/logs/kube-apiserver.log | grep -i error | tail
I0705 13:28:12.534028 12 httplog.go:89] "HTTP" verb="GET" URI="/apis/packages.operators.coreos.com/v1?timeout=32s" latency="298.11µs" userAgent="kubectl/v1.20.2 (linux/amd64) kubernetes/faecb19" srcIP="[::1]:51746" resp=503 statusStack="\ngoroutine 280651052 [running]:\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog.(*respLogger).recordStatus(0xc0226cabd0, 0x1f7)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog/httplog.go:237 +0xcf\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog.(*respLogger).WriteHeader(0xc0226cabd0, 0x1f7)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog/httplog.go:216 +0x35\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.(*baseTimeoutWriter).WriteHeader(0xc01afbe140, 0x1f7)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/timeout.go:226 +0xb2\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.(*auditResponseWriter).WriteHeader(0xc02142a230, 0x1f7)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters/audit.go:223 +0x65\nnet/http.Error(0x547b680, 0xc024f5a4a8, 0x4c0cb67, 0x13, 0x1f7)\n\t/usr/local/go/src/net/http/server.go:2054 +0x1f6\nk8s.io/kubernetes/vendor/k8s.io/kube-aggregator/pkg/apiserver.proxyError(0x547b680, 0xc024f5a4a8, 0xc011833300, 0x4c0cb67, 0x13, 0x1f7)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/kube-aggregator/pkg/apiserver/handler_proxy.go:97 +0x5d\nk8s.io/kubernetes/vendor/k8s.io/kube-aggregator/pkg/apiserver.(*proxyHandler).ServeHTTP(0xc0063756d0, 0x547b680, 0xc024f5a4a8, 0xc011833300)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/kube-aggregator/pkg/apiserver/handler_proxy.go:126 +0xcb4\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/mux.(*pathHandler).ServeHTTP(0xc013165fc0, 0x547b680, 0xc024f5a4a8, 0xc011833300)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/mux/pathrecorder.go:241 +0x6fa\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/mux.(*PathRecorderMux).ServeHTTP(0xc00841ce00, 0x547b680, 0xc024f5a4a8, 0xc011833300)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/mux/pathrecorder.go:234 +0x8c\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server.director.ServeHTTP(0x4bfcd19, 0xf, 0xc0058f9b00, 0xc00841ce00, 0x547b680, 0xc024f5a4a8, 0xc011833300)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/handler.go:154 +0x87f\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency.trackCompleted.func1(0x547b680, 0xc024f5a4a8, 0xc011833300)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency/filterlatency.go:95 +0x165\nnet/http.HandlerFunc.ServeHTTP(0xc008447d40, 0x547b680, 0xc024f5a4a8, 0xc011833300)\n\t/usr/local/go/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithAuthorization.func1(0x547b680, 0xc024f5a4a8, 0xc011833300)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters/authorization.go:64 
+0x59a\nnet/http.HandlerFunc.ServeHTTP(0xc00843a900, 0x547b680, 0xc024f5a4a8, 0xc011833300)\n\t/usr/local/go/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency.trackStarted.func1(0x547b680, 0xc024f5a4a8, 0xc011833300)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency/filterlatency.go:71 +0x186\nnet/http.HandlerFunc.ServeHTTP(0xc00843a940, 0x547b680, 0xc024f5a4a8, 0xc011833300)\n\t/usr/local/go/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency.trackCompleted.func1(0x547b680, 0xc024f5a4a8, 0xc011833300)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency/filterlatency.go:95 +0x165\nnet/http.HandlerFunc.ServeHTTP(0xc008447d70, 0x547b680, 0xc024f5a4a8, 0xc011833300)\n\t/usr/local/go/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.WithPriorityAndFairness.func1.4()\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/priority-and-fairness.go:127 +0x3c6\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/util/flowcontrol.(*configController).Handle.func2()\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/util/flowcontrol/apf_filter.go:133 +0x1aa\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/util/flowcontrol.immediateRequest.Finish(...)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/util/flowcontrol/apf_controller.go:660\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/util/flowcontrol.(*configController).Handle(0xc000be2d10, 0x5481b40, 0xc02302eb40, 0xc0123ca790, 0x5482800, 0xc010d59200, 0xc006286310, 0xc006286320, 0xc007636ae0)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/util/flowcontrol/apf_filter.go:123 +0x86a\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.WithPriorityAndFairness.func1(0x547b680, 0xc024f5a4a8, 0xc011833200)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/priority-and-fairness.go:130 +0x5c3\nnet/http.HandlerFunc.ServeHTTP(0xc008447da0, 0x547b680, 0xc024f5a4a8, 0xc011833200)\n\t/usr/local/go/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency.trackStarted.func1(0x547b680, 0xc024f5a4a8, 0xc011833200)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency/filterlatency.go:71 +0x186\nnet/http.HandlerFunc.ServeHTTP(0xc00843a980, 0x547b680, 0xc024f5a4a8, 0xc011833200)\n\t/usr/local/go/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency.trackCompleted.func1(0x547b680, 0xc024f5a4a8, 0xc011833200)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency/filterlatency.go:95 +0x165\nnet/http.HandlerFunc.ServeHTTP(0xc008447dd0, 0x547b680, 0xc024f5a4a8, 0xc011833200)\n\t/usr/local/go/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithImpersonation.func1(0x547b680, 0xc024f5a4a8, 
0xc011833200)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters/impersonation.go:50 +0x23dd\nnet/http.HandlerFunc.ServeHTTP(0xc00843a9c0, 0x547b680, 0xc024f5a4a8, 0xc011833200)\n\t/usr/local/go/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency.trackStarted.func1(0x547b680, 0xc024f5a4a8, 0xc011833200)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency/filterlatency.go:71 +0x186\nnet/http.HandlerFunc.ServeHTTP(0xc00843aa00, 0x547b680, 0xc024f5a4a8, 0xc011833200)\n\t/usr/local/go/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency.trackCompleted.func1(0x547b680, 0xc024f5a4a8, 0xc011833200)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency/filterlatency.go:95 +0x165\nnet/http.HandlerFunc.ServeHTTP(0xc008447e00, 0x547b680, 0xc024f5a4a8, 0xc011833200)\n\t/usr/local/go/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithAudit.func1(0x7f2a1803ac98, 0xc024f5a4a0, 0xc011833100)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters/audit.go:110 +0x4bc\nnet/http.HandlerFunc.ServeHTTP(0xc00843aa40, 0x7f2a1803ac98, 0xc024f5a4a0, 0xc011833100)\n\t/usr/local/go/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency.trackStarted.func1(0x7f2a1803ac98, 0xc024f5a4a0, 0xc011833100)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency/filterlatency.go:71 +0x186\nnet/http.HandlerFunc.ServeHTTP(0xc00843aa80, 0x7f2a1803ac98, 0xc024f5a4a0, 0xc011833100)\n\t/usr/local/go/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency.trackCompleted.func1(0x7f2a1803ac98, 0xc024f5a4a0, 0xc011833100)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency/filterlatency.go:95 +0x165\nnet/http.HandlerFunc.ServeHTTP(0xc008447e60, 0x7f2a1803ac98, 0xc024f5a4a0, 0xc011833100)\n\t/usr/local/go/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithAuthentication.func1(0x7f2a1803ac98, 0xc024f5a4a0, 0xc011832f00)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters/authentication.go:70 +0x6d2\nnet/http.HandlerFunc.ServeHTTP(0xc0083db860, 0x7f2a1803ac98, 0xc024f5a4a0, 0xc011832f00)\n\t/usr/local/go/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency.trackStarted.func1(0x7f2a1803ac98, 0xc024f5a4a0, 0xc011832e00)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filterlatency/filterlatency.go:80 +0x38a\nnet/http.HandlerFunc.ServeHTTP(0xc00843ab00, 0x7f2a1803ac98, 0xc024f5a4a0, 0xc011832e00)\n\t/usr/local/go/src/net/http/server.go:2042 +0x44\nk8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.(*timeoutHandler).ServeHTTP.func1(0xc00d3e20c0, 0xc008449700, 0x5482900, 0xc024f5a4a0, 
0xc011832e00)\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/timeout.go:111 +0xb8\ncreated by k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.(*timeoutHandler).ServeHTTP\n\t/workspace/louhi_ws/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/timeout.go:97 +0x1cc\n" addedInfo="\nlogging error output: \"service unavailable\\n\"\n"
E0705 13:28:13.621177 12 controller.go:116] loading OpenAPI spec for "v1.packages.operators.coreos.com" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
E0705 13:28:18.632933 12 controller.go:116] loading OpenAPI spec for "v1.packages.operators.coreos.com" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
I0705 13:28:20.924976 12 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"
E0705 13:28:23.641459 12 controller.go:116] loading OpenAPI spec for "v1.packages.operators.coreos.com" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
E0705 13:28:28.651618 12 controller.go:116] loading OpenAPI spec for "v1.packages.operators.coreos.com" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
I0705 13:28:31.722498 12 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"
E0705 13:28:33.660936 12 controller.go:116] loading OpenAPI spec for "v1.packages.operators.coreos.com" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
$ kubectl api-resources --verbs=list --namespaced -o name | tail
error: unable to retrieve the complete list of server APIs: packages.operators.coreos.com/v1: the server is currently unable to handle the request
clusterserviceversions.operators.coreos.com
installplans.operators.coreos.com
operatorconditions.operators.coreos.com
operatorgroups.operators.coreos.com
subscriptions.operators.coreos.com
poddisruptionbudgets.policy
qservs.qserv.lsst.org
rolebindings.rbac.authorization.k8s.io
roles.rbac.authorization.k8s.io
volumesnapshots.snapshot.storage.k8s.io
About this issue
- State: closed
- Created 3 years ago
- Comments: 21 (2 by maintainers)
For anyone who lands here: I was having the same issue with EKS. Updating the security group to allow port 5443 works.
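For EKS specifically, a hedged sketch of what that security group change could look like with the AWS CLI (both security group IDs below are placeholders; the worker nodes' group must accept TCP 5443 from the control plane's group):
# Allow the EKS control plane security group to reach the worker nodes on TCP 5443
$ aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp --port 5443 \
    --source-group sg-0fedcba9876543210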
Great! I was testing this out this morning: allowing TCP port 5443 from the master to the worker nodes seemed to solve it, if you want more specific traffic restrictions.
Hey @rblaine95, allow traffic from the master towards the worker nodes on port 5443.
If it may help other GKE users, here is the gcloud command which solved the issue:
I am having exactly the same issue with 1.19.9-gke.1900. I installed it manually with crds.yaml and olm.yaml, installed it with install.sh, and tried it with operator-sdk as well. The issue persisted across all of them.
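The gcloud command itself was not captured in this thread; as a hypothetical illustration (cluster name, zone, network, node tag, and master CIDR are all placeholders), a firewall rule opening control-plane-to-node traffic on TCP 5443 for a private GKE cluster might look like:
# Look up the control plane (master) CIDR of the private cluster
$ gcloud container clusters describe my-cluster --zone europe-west1-b \
    --format='value(privateClusterConfig.masterIpv4CidrBlock)'
# Allow that CIDR to reach the worker nodes on TCP 5443
$ gcloud compute firewall-rules create allow-master-to-packageserver \
    --network my-network --direction INGRESS \
    --source-ranges 172.16.0.0/28 \
    --target-tags my-node-tag \
    --allow tcp:5443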
I am seeing this as well:
Seems like the issue might be coming from olm-operator. I keep seeing the following log:
and I get these logs from time to time.
@dspeck1 @fjammes What GKE version did it work with?
Looks like we should add this to our docs. This is needed since the package server pods (packagemanifest is an aggregated API service) need to be reachable from the k8s control plane.
I allowed all ingress from master to worker nodes on the firewall and that solved the issue!
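One way to confirm the control plane can actually reach the aggregated API after a firewall change like this is to query it through the kube-apiserver itself; a minimal check:
# This request is proxied by the kube-apiserver to the packageserver pods;
# a "service unavailable" error here means the control plane still cannot reach them on 5443
$ kubectl get --raw /apis/packages.operators.coreos.com/v1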
@shalin-hudda - 1.20.6-gke.1400 is the version the install worked on. Still puzzled. The only difference I can find so far is that it looks like it works on public nodes (nodes with external IP addresses) and not on private nodes. I have Cloud NAT deployed, so it does not appear to be connectivity related.
This could happen if the packageserver deployment is not fully stood up, i.e. ready and available. @fjammes It would help debug further if the packageserver CSV status were included with the other helpful logs provided here.
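For reference, one way to pull just the packageserver CSV phase and the status conditions being asked for here:
# Current phase of the packageserver CSV
$ kubectl -n olm get csv packageserver -o jsonpath='{.status.phase}{"\n"}'
# Its status conditions, one per line (time, phase, reason, message)
$ kubectl -n olm get csv packageserver \
    -o jsonpath='{range .status.conditions[*]}{.lastTransitionTime}{"  "}{.phase}{"  "}{.reason}{"  "}{.message}{"\n"}{end}'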
Another option is to use the operator-sdk olm install command to see if that may work on your cluster instead.
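A sketch of that flow, assuming the same OLM release as above (0.18.2):
# Remove the partially installed OLM, then let operator-sdk install and verify it
$ operator-sdk olm uninstall
$ operator-sdk olm install --version 0.18.2
$ operator-sdk olm status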