prometheus-operator: targets endpoint is not responding after deleting a ServiceMonitor
What did you do?
Steps to reproduce the issue:
- start prometheus
- delete a ServiceMonitor
What did you expect to see?
targets endpoint to work properly
What did you see instead? Under which circumstances?
Prometheus stays up, but it stops scraping properly and the targets endpoint does not respond.
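A quick way to confirm the symptom is to query the targets API with a hard timeout, so that a hung endpoint shows up as a failure instead of a request that never returns. The following is a minimal sketch, not part of the original report: the function name `check_targets` is hypothetical, and it assumes Prometheus is reachable on `http://localhost:9090` (the default port), for example through a port-forward to the affected pod.

```shell
# Probe the Prometheus targets API with a hard timeout.
# check_targets is a hypothetical helper name.
# Arguments: base URL (assumed default below) and timeout in seconds.
check_targets() {
  base_url="${1:-http://localhost:9090}"
  timeout_s="${2:-5}"
  # --fail turns HTTP errors into a non-zero exit code;
  # --max-time aborts the request if the endpoint hangs.
  if curl --silent --fail --max-time "$timeout_s" --output /dev/null \
      "$base_url/api/v1/targets"; then
    echo "targets endpoint responded"
  else
    echo "targets endpoint failed or timed out after ${timeout_s}s" >&2
    return 1
  fi
}
```

With `kubectl port-forward prometheus-prometheus-0 9090` running in another terminal, calling `check_targets` before and after deleting the ServiceMonitor should show whether the endpoint stops answering.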
Environment
- Prometheus Operator version:
  Affected version 0.27.0, we migrated from version 0.23.2 which was working properly
- Kubernetes version information:
  kubectl version
  Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.11", GitCommit:"637c7e288581ee40ab4ca210618a89a555b6e7e9", GitTreeState:"clean", BuildDate:"2018-11-26T14:38:32Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"darwin/amd64"}
  Server Version: version.Info{Major:"1", Minor:"10+", GitVersion:"v1.10.11-eks", GitCommit:"6bf27214b7e3e1e47dce27dcbd73ee1b27adadd0", GitTreeState:"clean", BuildDate:"2018-12-04T13:33:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
- Kubernetes cluster kind:
  eksctl
- Prometheus Operator Logs:
  kubectl logs prometheus-prometheus-0 -c prometheus-config-reloader -f
  # how we identified the issue
  level=debug ts=2019-01-22T14:02:50.24178271Z caller=reloader.go:97 msg="received watch event" op=REMOVE name=/etc/prometheus/config/prometheus.yaml
About this issue
- State: closed
- Created 5 years ago
- Comments: 22 (10 by maintainers)
Sorry, it has already been upgraded in production. We’re on v2.7.1 and it seems to be working properly in that version.
I can just add that killing the problematic prometheus pod takes the full terminationGracePeriodSeconds and still it does not shut down properly, which also demonstrates that prometheus is stuck.