rancher: Cluster monitoring stuck in installing state with templateversion not found error

What kind of request is this (question/bug/enhancement/feature request): bug

Steps to reproduce (as few steps as possible): upgrade a Rancher HA installation from v2.1.7 to v2.2.0

Result: (two screenshots in the original issue, not preserved here)

Other details that may be helpful: instance A

E0327 08:46:54.797228 7 round_trippers.go:291] CancelRequest not implemented by *rest.tokenSourceTransport
E0327 08:46:54.797331 7 streamwatcher.go:109] Unable to decode an event from the watch stream: net/http: request canceled (Client.Timeout exceeded while reading body)
E0327 08:46:54.797361 7 round_trippers.go:291] CancelRequest not implemented by *rest.tokenSourceTransport
E0327 08:46:54.797433 7 streamwatcher.go:109] Unable to decode an event from the watch stream: net/http: request canceled (Client.Timeout exceeded while reading body)
W0327 08:49:45.017305 7 reflector.go:270] github.com/rancher/norman/controller/generic_controller.go:175: watch of *v1.Endpoints ended with: too old resource version: 469462021 (469483799)
W0327 09:01:29.032225 7 reflector.go:270] github.com/rancher/norman/controller/generic_controller.go:175: watch of *v1.Endpoints ended with: too old resource version: 469529654 (469559088)
2019/03/27 09:05:02 [ERROR] Failed to construct object on output: info=InvalidFormat 422: cpu=InvalidFormat 422: count=InvalidFormat 422: strconv.ParseInt: parsing "1k": invalid syntax
E0327 09:07:15.363290 7 round_trippers.go:291] CancelRequest not implemented by *rest.tokenSourceTransport
E0327 09:07:15.363290 7 round_trippers.go:291] CancelRequest not implemented by *rest.tokenSourceTransport
E0327 09:07:15.363290 7 round_trippers.go:291] CancelRequest not implemented by *rest.tokenSourceTransport
E0327 09:07:15.363358 7 round_trippers.go:291] CancelRequest not implemented by *rest.tokenSourceTransport
E0327 09:07:15.363448 7 streamwatcher.go:109] Unable to decode an event from the watch stream: net/http: request canceled (Client.Timeout exceeded while reading body)
E0327 09:07:15.363468 7 streamwatcher.go:109] Unable to decode an event from the watch stream: net/http: request canceled (Client.Timeout exceeded while reading body)
E0327 09:07:15.363452 7 streamwatcher.go:109] Unable to decode an event from the watch stream: net/http: request canceled (Client.Timeout exceeded while reading body)
E0327 09:07:15.363452 7 streamwatcher.go:109] Unable to decode an event from the watch stream: net/http: request canceled (Client.Timeout exceeded while reading body)
E0327 09:07:15.363505 7 round_trippers.go:291] CancelRequest not implemented by *rest.tokenSourceTransport
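
The `strconv.ParseInt: parsing "1k": invalid syntax` message above suggests that a value written with a quantity-style suffix ("1k", which Kubernetes uses as the canonical form of 1000) reached a field that is parsed as a plain integer. The sketch below only illustrates that mismatch. It is written in Python rather than Rancher's Go, `parse_count` and `plain_int_fails` are hypothetical helpers, and only a few decimal suffixes are handled.

```python
# Hypothetical helpers, not Rancher code: a strict integer parse rejects "1k",
# while a suffix-aware parse reads it as 1000.
DECIMAL_SUFFIXES = {"k": 10**3, "M": 10**6, "G": 10**9}


def parse_count(text: str) -> int:
    """Parse an integer that may carry a decimal suffix such as 'k'."""
    text = text.strip()
    if text and text[-1] in DECIMAL_SUFFIXES:
        return int(text[:-1]) * DECIMAL_SUFFIXES[text[-1]]
    return int(text)


def plain_int_fails(text: str) -> bool:
    """Return True when a strict integer parse rejects the text."""
    try:
        int(text)
        return False
    except ValueError:
        return True
```

Whether this error is related to the monitoring problem is not established by the logs; it appears in both instances next to unrelated messages.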

instance B

2019/03/27 09:30:43 [ERROR] AppController p-rj7s9/cluster-monitoring [helm-controller] failed with : catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/03/27 09:30:43 [ERROR] Failed to construct object on output: info=InvalidFormat 422: cpu=InvalidFormat 422: count=InvalidFormat 422: strconv.ParseInt: parsing "1k": invalid syntax
2019/03/27 09:30:43 [ERROR] AppController p-rj7s9/cluster-monitoring [helm-controller] failed with : catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/03/27 09:30:43 [ERROR] AppController p-rj7s9/cluster-monitoring [helm-controller] failed with : catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/03/27 09:30:43 [INFO] Catalog sync done. 40 templates created, 0 templates updated, 0 templates deleted
2019/03/27 09:30:43 [ERROR] AppController p-rj7s9/cluster-monitoring [helm-controller] failed with : catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/03/27 09:30:43 [ERROR] CatalogController aliyun [catalog] failed with : failed to update templates. Multiple error occurred: [failed to create template aliyun-kafka: namespaces "cattle-global-data" not found failed to create template aliyun-ack-knative-serving: namespaces "cattle-global-data" not found failed to create template aliyun-ceph: namespaces "cattle-global-data" not found failed to create template aliyun-ack-knative-build: namespaces "cattle-global-data" not found failed to create template aliyun-ack-springcloud-turbine: namespaces "cattle-global-data" not found failed to create template aliyun-ack-tensorflow-dev: namespaces "cattle-global-data" not found failed to create template aliyun-etcd: namespaces "cattle-global-data" not found failed to create template aliyun-exposecontroller: namespaces "cattle-global-data" not found failed to create template aliyun-ack-arms-pilot: namespaces "cattle-global-data" not found failed to create template aliyun-ack-istio: namespaces "cattle-global-data" not found failed to create template aliyun-nodered: namespaces "cattle-global-data" not found failed to create template aliyun-seldon-core: namespaces "cattle-global-data" not found failed to create template aliyun-jenkins-x-platform: namespaces "cattle-global-data" not found failed to create template aliyun-mysql-broker: namespaces "cattle-global-data" not found failed to create template aliyun-mariadb-broker: namespaces "cattle-global-data" not found failed to create template aliyun-ack-springcloud-configserver: namespaces "cattle-global-data" not found failed to create template aliyun-ack-springcloud-eureka: namespaces "cattle-global-data" not found failed to create template aliyun-ack-knative-sources: namespaces "cattle-global-data" not found failed to create template aliyun-ack-openmpi: namespaces "cattle-global-data" not found failed to create template aliyun-ack-tensorflow-serving: namespaces "cattle-global-data" not found failed to create template aliyun-catalog: namespaces "cattle-global-data" not found failed to create template aliyun-glusterfs: namespaces "cattle-global-data" not found failed to create template aliyun-ack-consul: namespaces "cattle-global-data" not found failed to create template aliyun-ack-hyperledger-fabric: namespaces "cattle-global-data" not found failed to create template aliyun-zookeeper: namespaces "cattle-global-data" not found failed to create template aliyun-mysqlha: namespaces "cattle-global-data" not found failed to create template aliyun-rabbitmq-broker: namespaces "cattle-global-data" not found failed to create template aliyun-ack-tensorflow-training: namespaces "cattle-global-data" not found failed to create template aliyun-elasticsearch: namespaces "cattle-global-data" not found failed to create template aliyun-seldon-core-crd: namespaces "cattle-global-data" not found failed to create template aliyun-spark-broker: namespaces "cattle-global-data" not found failed to create template aliyun-vault: namespaces "cattle-global-data" not found failed to create template aliyun-ack-knative-init: namespaces "cattle-global-data" not found failed to create template aliyun-ack-springcloud-zipkin: namespaces "cattle-global-data" not found failed to create template aliyun-ack-springcloud-hystrix: namespaces "cattle-global-data" not found failed to create template aliyun-ack-springcloud-zuul: namespaces "cattle-global-data" not found failed to create template aliyun-redis-ha: namespaces "cattle-global-data" not found failed to create template aliyun-spark-oss: namespaces "cattle-global-data" not found failed to create template aliyun-ack-istio-remote: namespaces "cattle-global-data" not found failed to create template aliyun-ack-knative-eventing: namespaces "cattle-global-data" not found]
2019/03/27 09:30:43 [ERROR] AppController p-rj7s9/cluster-monitoring [helm-controller] failed with : catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/03/27 09:30:43 [ERROR] AppController p-rj7s9/cluster-monitoring [helm-controller] failed with : catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/03/27 09:30:43 [ERROR] AppController p-rj7s9/cluster-monitoring [helm-controller] failed with : catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/03/27 09:30:43 [ERROR] AppController p-rj7s9/cluster-monitoring [helm-controller] failed with : catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/03/27 09:30:43 [ERROR] AppController p-rj7s9/cluster-monitoring [helm-controller] failed with : catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/03/27 09:30:44 [ERROR] AppController p-rj7s9/cluster-monitoring [helm-controller] failed with : catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/03/27 09:30:44 [ERROR] AppController p-rj7s9/cluster-monitoring [helm-controller] failed with : catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/03/27 09:30:44 [INFO] Updating global catalog aliyun
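
The instance B log repeats the same few failures many times, which makes the distinct problems hard to see. A minimal sketch for collapsing such a log into distinct "not found" objects is shown below. The function name `summarize_not_found` and the regular expression are assumptions made for illustration, and the sample is an abbreviated excerpt of the log above, not the full output.

```python
import re
from collections import Counter

# Matches messages of the form: <resource kind> "<object name>" not found
NOT_FOUND = re.compile(r'(\S+) "([^"]+)" not found')


def summarize_not_found(log_text: str) -> Counter:
    """Count each distinct (kind, name) pair that the log reports as not found."""
    return Counter(NOT_FOUND.findall(log_text))


# Abbreviated excerpt of the instance B log (the catalog error is shortened).
sample = '''2019/03/27 09:30:43 [ERROR] AppController p-rj7s9/cluster-monitoring [helm-controller] failed with : catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/03/27 09:30:43 [ERROR] AppController p-rj7s9/cluster-monitoring [helm-controller] failed with : catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/03/27 09:30:43 [ERROR] CatalogController aliyun [catalog] failed with : failed to update templates. Multiple error occurred: [failed to create template aliyun-kafka: namespaces "cattle-global-data" not found failed to create template aliyun-ceph: namespaces "cattle-global-data" not found]
'''

if __name__ == "__main__":
    for (kind, name), count in summarize_not_found(sample).most_common():
        print(f"{count:3d}  {kind}  {name}")
```

Applied to the full log, this reduces the output to two missing objects: the template version `system-library-rancher-monitoring-0.0.2` and the namespace `cattle-global-data`.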

Environment information

  • Rancher version (rancher/rancher or rancher/server image tag, or shown bottom left in the UI): v2.2.0
  • Installation option (single install/HA): HA

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 32 (15 by maintainers)

Most upvoted comments

@lanmingle Are there any k8s management tools better than Rancher?

I am using Tencent Cloud: Failed to ensure catalog “catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2”: failed to find catalog by ID “catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2”: catalogtemplateversions.management.cattle.io “system-library-rancher-monitoring-0.0.2” not found
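
The error above pairs a `catalog://` ID with an object name that looks like the ID's `catalog`, `template`, and `version` values joined by hyphens. The sketch below reproduces that mapping so the expected object name can be derived from an ID. The naming rule is inferred from this one error message, not from Rancher's source, and `template_version_name` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlsplit


def template_version_name(external_id: str) -> str:
    """Build '<catalog>-<template>-<version>' from a catalog:// ID.

    Assumption: the object name is these three query values joined by '-',
    as the error message in this thread suggests.
    """
    query = parse_qs(urlsplit(external_id).query)
    return "-".join(query[key][0] for key in ("catalog", "template", "version"))


if __name__ == "__main__":
    print(template_version_name(
        "catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2"
    ))
```

The derived name is the one to look for when checking whether the template version exists on the Rancher management cluster.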

Never mind, I am now planning to switch to a different management tool.

I can see all templates, but cluster-monitoring is still stuck in the installing state. (screenshot not preserved)

2019/04/01 01:43:04 [ERROR] AppController p-rj7s9/cluster-monitoring [helm-controller] failed with : catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/04/01 01:43:04 [ERROR] AppController p-rj7s9/cluster-monitoring [helm-controller] failed with : catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found

@mrajashree I have upgraded Rancher to v2.2.1, and it still does not work. The resource type catalogtemplateversions does not exist.

root@Laoshancun:~# kubectl get catalogtemplateversions system-library-rancher-monitoring-0.0.2 -n cattle-global-data
error: the server doesn't have a resource type "catalogtemplateversions"