prometheus-operator: targets endpoint is not responding after deleting a ServiceMonitor

What did you do?

Steps to reproduce the issue:

  • start prometheus
  • delete a ServiceMonitor

What did you expect to see?

The targets endpoint to keep working properly after the ServiceMonitor is deleted.

What did you see instead? Under which circumstances?

Prometheus stays up, but it is not scraping properly and the targets endpoint is not responding.
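
One way to tell a hung targets endpoint from a slow one is to probe it with a hard timeout. The following is a minimal sketch in Python, not something from the original report. It assumes Prometheus is reachable at a base URL such as http://localhost:9090 (for example through a port-forward), and the helper names targets_url and targets_endpoint_responds are hypothetical. The /api/v1/targets path is the targets endpoint of the Prometheus HTTP API.

```python
import http.client
import json
import urllib.request


def targets_url(base_url: str) -> str:
    """Build the URL of the Prometheus targets API from a base URL."""
    return base_url.rstrip("/") + "/api/v1/targets"


def targets_endpoint_responds(base_url: str, timeout: float = 5.0) -> bool:
    """Return True only if the targets API answers within `timeout` seconds.

    Connection errors, timeouts, HTTP errors and non-JSON bodies all count
    as "not responding".
    """
    try:
        with urllib.request.urlopen(targets_url(base_url), timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        # URLError and socket timeouts are OSError subclasses;
        # a body that is not valid JSON raises a ValueError subclass.
        return False
    return isinstance(payload, dict) and payload.get("status") == "success"


# Example (assumed address, adjust to your setup):
#   targets_endpoint_responds("http://localhost:9090", timeout=5.0)
```

Running this before and after deleting the ServiceMonitor should show whether the endpoint stops answering; a False result after the timeout would match the behaviour described above.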

Environment

  • Prometheus Operator version:

    Affected version: 0.27.0. We migrated from version 0.23.2, which was working properly.

  • Kubernetes version information:

      kubectl version
      Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.11", GitCommit:"637c7e288581ee40ab4ca210618a89a555b6e7e9", GitTreeState:"clean", BuildDate:"2018-11-26T14:38:32Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"darwin/amd64"}
      Server Version: version.Info{Major:"1", Minor:"10+", GitVersion:"v1.10.11-eks", GitCommit:"6bf27214b7e3e1e47dce27dcbd73ee1b27adadd0", GitTreeState:"clean", BuildDate:"2018-12-04T13:33:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
    
  • Kubernetes cluster kind:

     eksctl
    
  • Prometheus Operator Logs:

      kubectl logs prometheus-prometheus-0 -c prometheus-config-reloader -f

      # how we identified the issue
      level=debug ts=2019-01-22T14:02:50.24178271Z caller=reloader.go:97 msg="received watch event" op=REMOVE name=/etc/prometheus/config/prometheus.yaml
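
To spot this event in a longer log without reading it by eye, the line can be parsed as key=value pairs. The following is a rough sketch in Python under a few assumptions: the reloader log lines follow the key=value layout shown above, quoted values contain no escaped quotes, and the helper names parse_logfmt and is_config_remove_event are hypothetical.

```python
import shlex


def parse_logfmt(line: str) -> dict:
    """Split a key=value log line into a dict of strings.

    shlex handles quoted values such as msg="received watch event".
    Lines with unbalanced quotes yield an empty dict.
    """
    try:
        tokens = shlex.split(line)
    except ValueError:
        return {}
    fields = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that are not key=value pairs
            fields[key] = value
    return fields


def is_config_remove_event(line: str, config_name: str = "prometheus.yaml") -> bool:
    """Return True for a watch event that reports removal of the config file."""
    fields = parse_logfmt(line)
    return (
        fields.get("msg") == "received watch event"
        and fields.get("op") == "REMOVE"
        and fields.get("name", "").endswith(config_name)
    )
```

Feeding the output of the kubectl logs command above through is_config_remove_event, one line at a time, would flag each moment the reloader saw the config file removed.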

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 22 (10 by maintainers)

Most upvoted comments

Sorry, it has already been upgraded in production. We’re on v2.7.1 and it seems to be working properly in that version.

I can just add that killing the problematic Prometheus pod takes the full terminationGracePeriodSeconds, and even then it does not shut down properly, which also suggests that Prometheus is stuck.