rancher: 2.2.4 System project prometheus-project-monitoring crash loop

After upgrading Rancher to 2.2.4 stable and enabling project-level monitoring, the pod prometheus-project-monitoring-0 goes into a crash loop with the error logs below. Disabling and re-enabling project-level monitoring does not resolve the issue.

level=warn ts=2019-06-19T00:27:18.932421508Z caller=main.go:295 deprecation_notice="\"storage.tsdb.retention\" flag is deprecated use \"storage.tsdb.retention.time\" instead."
level=info ts=2019-06-19T00:27:18.932529973Z caller=main.go:302 msg="Starting Prometheus" version="(version=2.7.1, branch=HEAD, revision=62e591f928ddf6b3468308b7ac1de1c63aa7fcf3)"
level=info ts=2019-06-19T00:27:18.932559054Z caller=main.go:303 build_context="(go=go1.11.5, user=root@f9f82868fc43, date=20190131-11:16:59)"
level=info ts=2019-06-19T00:27:18.932583424Z caller=main.go:304 host_details="(Linux 3.10.0-957.21.2.el7.x86_64 #1 SMP Wed Jun 5 14:26:44 UTC 2019 x86_64 prometheus-project-monitoring-0 (none))"
level=info ts=2019-06-19T00:27:18.932614859Z caller=main.go:305 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2019-06-19T00:27:18.932642277Z caller=main.go:306 vm_limits="(soft=unlimited, hard=unlimited)"
level=info ts=2019-06-19T00:27:18.933616335Z caller=main.go:620 msg="Starting TSDB ..."
level=info ts=2019-06-19T00:27:18.93371085Z caller=web.go:416 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2019-06-19T00:27:18.934338428Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1559858267204 maxt=1559865600000 ulid=01DCR576WWW95XYB8Z853GMGVZ
level=info ts=2019-06-19T00:27:18.934490313Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1559865600000 maxt=1560060000000 ulid=01DCXQQA361B45GEW414BTKTYH
level=info ts=2019-06-19T00:27:18.934636704Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1560060000000 maxt=1560254400000 ulid=01DD3GZ7N6QCP4WFJZDSB4DSCM
level=info ts=2019-06-19T00:27:18.934726704Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1560276000000 maxt=1560283200000 ulid=01DD45JZ9HJ878W7JP601KM56Y
level=info ts=2019-06-19T00:27:18.934803436Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1560283200000 maxt=1560290400000 ulid=01DD4CF2YB4ZP74MVABGWYFXRN
level=info ts=2019-06-19T00:27:18.935901826Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1560254400000 maxt=1560276000000 ulid=01DD4CF3A7JQFZVGT5MW5KMD9Y
level=info ts=2019-06-19T00:27:18.936060349Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1560290400000 maxt=1560297600000 ulid=01DD4KP55TC24FD527GGR2WPRM
level=warn ts=2019-06-19T00:27:18.938634904Z caller=wal.go:116 component=tsdb msg="last page of the wal is torn, filling it with zeros" segment=/prometheus/wal/00000000
level=info ts=2019-06-19T00:27:19.156931758Z caller=main.go:635 msg="TSDB started"
level=info ts=2019-06-19T00:27:19.157031589Z caller=main.go:695 msg="Loading configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
level=info ts=2019-06-19T00:27:19.200513948Z caller=kubernetes.go:201 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-06-19T00:27:19.201886938Z caller=kubernetes.go:201 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-06-19T00:27:19.202853488Z caller=kubernetes.go:201 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-06-19T00:27:19.203902277Z caller=kubernetes.go:201 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-06-19T00:27:19.212187911Z caller=main.go:722 msg="Completed loading of configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
level=info ts=2019-06-19T00:27:19.212265255Z caller=main.go:589 msg="Server is ready to receive web requests."
level=error ts=2019-06-19T00:27:19.2226977Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:19.222980112Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:19.224614597Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:19.224801021Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:19.224844531Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:19.2248589Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:20.224805245Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:20.225246318Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:20.226318651Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:20.227352837Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:20.228495013Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:20.229894705Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:21.226947201Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:21.227502295Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:21.22820036Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:21.229282217Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:21.230409372Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:21.231470813Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:22.229093021Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:22.229166275Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:22.230345567Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:22.231349462Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:22.232478049Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:22.233623141Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:23.231280471Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:23.231512079Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:23.232527449Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:23.233572405Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:23.234772257Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:23.235696581Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:24.213839929Z caller=scrape.go:147 component="scrape manager" scrape_pool=cattle-prometheus/exporter-kube-etcd-cluster-monitoring/0 msg="Error creating HTTP client" err="unable to use specified client cert (/etc/prometheus/secrets/exporter-etcd-cert/kube-etcd-10-1-101-11.pem) & key (/etc/prometheus/secrets/exporter-etcd-cert/kube-etcd-10-1-101-11-key.pem): open /etc/prometheus/secrets/exporter-etcd-cert/kube-etcd-10-1-101-11.pem: no such file or directory"
level=error ts=2019-06-19T00:27:24.232997861Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:24.233496136Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:24.234589335Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:24.235851974Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:24.236824986Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:24.23799358Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:25.234784474Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:25.235294134Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:25.236510119Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:25.237416525Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"cattle-logging\""
level=error ts=2019-06-19T00:27:25.238616313Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-06-19T00:27:25.239599264Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-knlv8:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"cattle-logging\""
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x669c12]
goroutine 706 [running]:
net/http.(*Client).deadline(0x0, 0xc0007f40c8, 0x40bb8f, 0xc0007ed230)
/usr/local/go/src/net/http/client.go:187 +0x22
net/http.(*Client).do(0x0, 0xc0002ffa00, 0x0, 0x0, 0x0)
/usr/local/go/src/net/http/client.go:527 +0xab
net/http.(*Client).Do(0x0, 0xc0002ffa00, 0x23, 0xc000791570, 0x9)
/usr/local/go/src/net/http/client.go:509 +0x35
github.com/prometheus/prometheus/scrape.(*targetScraper).scrape(0xc001c7b830, 0x1fd4a60, 0xc0007fef00, 0x1fb2760, 0xc000334690, 0x0, 0x0, 0x0, 0x0)
/app/scrape/scrape.go:471 +0x111
github.com/prometheus/prometheus/scrape.(*scrapeLoop).run(0xc001d28d80, 0xdf8475800, 0x2540be400, 0x0)
/app/scrape/scrape.go:813 +0x487
created by github.com/prometheus/prometheus/scrape.(*scrapePool).sync
/app/scrape/scrape.go:336 +0x45d

Environment information

  • Rancher version (rancher/rancher or rancher/server image tag, or shown bottom left in the UI): 2.2.4-stable
  • Installation option (single install/HA): HA

Cluster information

  • Cluster type (Hosted/Infrastructure Provider/Custom/Imported): Custom
  • Machine type (cloud/VM/metal) and specifications (CPU/memory): VM/16c/32GB
  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5", GitCommit:"2166946f41b36dea2c4626f90a77706f426cdea2", GitTreeState:"clean", BuildDate:"2019-03-25T15:26:52Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5", GitCommit:"2166946f41b36dea2c4626f90a77706f426cdea2", GitTreeState:"clean", BuildDate:"2019-03-25T15:19:22Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
  • Docker version (use docker version): 18.09.5

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 5
  • Comments: 15 (7 by maintainers)

Most upvoted comments

Reproduced this in Rancher v2.2.8. Based on my testing, to trigger this behaviour you must enable Cluster Monitoring and then enable Project Monitoring in the System project, after which the prometheus container in the prometheus-project-monitoring-0 Pod keeps failing. Enabling Project Monitoring on the System project alone does not trigger the issue. Below is an example of logs from a crashed prometheus container hitting this issue:

level=warn ts=2019-09-17T10:52:28.175953564Z caller=main.go:295 deprecation_notice="\"storage.tsdb.retention\" flag is deprecated use \"storage.tsdb.retention.time\" instead."
level=info ts=2019-09-17T10:52:28.176061415Z caller=main.go:302 msg="Starting Prometheus" version="(version=2.7.1, branch=HEAD, revision=62e591f928ddf6b3468308b7ac1de1c63aa7fcf3)"
level=info ts=2019-09-17T10:52:28.176094333Z caller=main.go:303 build_context="(go=go1.11.5, user=root@f9f82868fc43, date=20190131-11:16:59)"
level=info ts=2019-09-17T10:52:28.176157729Z caller=main.go:304 host_details="(Linux 4.4.0-161-generic #189-Ubuntu SMP Tue Aug 27 08:10:16 UTC 2019 x86_64 prometheus-project-monitoring-0 (none))"
level=info ts=2019-09-17T10:52:28.176181532Z caller=main.go:305 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2019-09-17T10:52:28.176198472Z caller=main.go:306 vm_limits="(soft=unlimited, hard=unlimited)"
level=info ts=2019-09-17T10:52:28.186267231Z caller=main.go:620 msg="Starting TSDB ..."
level=info ts=2019-09-17T10:52:28.186621432Z caller=web.go:416 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=warn ts=2019-09-17T10:52:28.193124456Z caller=wal.go:116 component=tsdb msg="last page of the wal is torn, filling it with zeros" segment=/prometheus/wal/00000000
level=warn ts=2019-09-17T10:52:28.67254674Z caller=head.go:440 component=tsdb msg="unknown series references" count=117
level=info ts=2019-09-17T10:52:28.706308318Z caller=main.go:635 msg="TSDB started"
level=info ts=2019-09-17T10:52:28.706588081Z caller=main.go:695 msg="Loading configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
level=info ts=2019-09-17T10:52:28.712570621Z caller=kubernetes.go:201 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-09-17T10:52:28.713707636Z caller=kubernetes.go:201 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-09-17T10:52:28.714587436Z caller=kubernetes.go:201 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-09-17T10:52:28.739038582Z caller=kubernetes.go:201 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-09-17T10:52:28.741581359Z caller=main.go:722 msg="Completed loading of configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
level=info ts=2019-09-17T10:52:28.741622601Z caller=main.go:589 msg="Server is ready to receive web requests."
level=error ts=2019-09-17T10:52:28.755902526Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:28.756607328Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:28.786303241Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:29.762473307Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:29.762576561Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:29.792991813Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:30.764243178Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:30.764677802Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:30.840101799Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:31.765806871Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:31.840061812Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:31.841333878Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:32.767477723Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:32.842856227Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:32.842919242Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:33.240062537Z caller=notifier.go:481 component=notifier alertmanager=http://alertmanager-operated.cattle-prometheus:9093/api/v1/alerts count=0 msg="Error sending alert" err="Post http://alertmanager-operated.cattle-prometheus:9093/api/v1/alerts: dial tcp: lookup alertmanager-operated.cattle-prometheus on 10.43.0.10:53: no such host"
level=error ts=2019-09-17T10:52:33.742848324Z caller=scrape.go:147 component="scrape manager" scrape_pool=cattle-prometheus/exporter-kube-etcd-cluster-monitoring/0 msg="Error creating HTTP client" err="unable to use specified client cert (/etc/prometheus/secrets/exporter-etcd-cert/kube-etcd-157-230-100-175.pem) & key (/etc/prometheus/secrets/exporter-etcd-cert/kube-etcd-157-230-100-175-key.pem): open /etc/prometheus/secrets/exporter-etcd-cert/kube-etcd-157-230-100-175.pem: no such file or directory"
level=error ts=2019-09-17T10:52:33.779155358Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:33.845888675Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:33.846230718Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:34.780976165Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:34.847533467Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:34.848535733Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:35.782692275Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:35.849327219Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:35.850155678Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:36.78444725Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:36.851848374Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:36.851954569Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:37.78604279Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:37.853491008Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:37.854497758Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:38.839784233Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:38.855629086Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:38.856710216Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:39.842035279Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:39.857376552Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:39.858519611Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:40.843829884Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:40.862467658Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:40.870141011Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:41.845958524Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:41.863898811Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:41.871584181Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:42.847636173Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:302: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"pods\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:42.869155396Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:300: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"default\""
level=error ts=2019-09-17T10:52:42.938250461Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:301: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:cattle-prometheus-p-2nc9l:project-monitoring\" cannot list resource \"services\" in API group \"\" in the namespace \"default\""
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x669c12]
goroutine 437 [running]:
net/http.(*Client).deadline(0x0, 0xc005381070, 0x40bb8f, 0xc0055e3600)
/usr/local/go/src/net/http/client.go:187 +0x22
net/http.(*Client).do(0x0, 0xc005cdaa00, 0x0, 0x0, 0x0)
/usr/local/go/src/net/http/client.go:527 +0xab
net/http.(*Client).Do(0x0, 0xc005cdaa00, 0x23, 0xc002802230, 0x9)
/usr/local/go/src/net/http/client.go:509 +0x35
github.com/prometheus/prometheus/scrape.(*targetScraper).scrape(0xc0060fa960, 0x1fd4a60, 0xc00010ec60, 0x1fb2760, 0xc0002eb110, 0x0, 0x0, 0x0, 0x0)
/app/scrape/scrape.go:471 +0x111
github.com/prometheus/prometheus/scrape.(*scrapeLoop).run(0xc00616a100, 0xdf8475800, 0x2540be400, 0x0)
/app/scrape/scrape.go:813 +0x487
created by github.com/prometheus/prometheus/scrape.(*scrapePool).sync
/app/scrape/scrape.go:336 +0x45d

Only the prometheus container in the pod is crash looping.
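
A note on reading the trace above: the first argument to `net/http.(*Client).Do` is `0x0`, meaning the call was made on a nil `*http.Client`. The earlier "Error creating HTTP client" message for the etcd scrape pool may be related, but that link is an assumption and has not been confirmed against the Prometheus source. The sketch below is not Prometheus code. It only reproduces the same kind of panic in isolation, using a hypothetical helper `scrapeWithClient` and a placeholder URL, and recovers from it so the program can exit cleanly.

```go
package main

import (
	"fmt"
	"net/http"
)

// scrapeWithClient is a hypothetical stand-in for a scrape call.
// It issues a GET with the given client and converts any panic into an error.
func scrapeWithClient(c *http.Client) (err error) {
	defer func() {
		// A nil *http.Client panics inside Do; recover so we can report it.
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()

	// Placeholder target; the panic happens before any network I/O.
	req, err := http.NewRequest("GET", "http://example.invalid/metrics", nil)
	if err != nil {
		return err
	}

	resp, err := c.Do(req) // panics when c is nil
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

func main() {
	var client *http.Client // nil, as if client construction had failed
	fmt.Println(scrapeWithClient(client))
}
```

If this reading is right, a nil check on the client before scraping (or skipping targets whose client could not be built) would avoid the crash, though the actual fix belongs in Prometheus itself.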
