alertmanager: OpsGenie: Empty description causes status code 400 and thus blocks notification even for later alert with non-empty description

Whenever I restart alertmanager, my log is flooded with the following messages:

time="2016-03-09T01:14:14+01:00" level=warning msg="Notify attempt 11 failed: unexpected status code 400" source="notify.go:193"

Alertmanager quickly backs off, but it keeps spamming the log with these warnings even though there are no active alerts. I think they are leftovers from past OpsGenie alerts.

Proposed fix: Alertmanager should cancel notifications when (or shortly after) the alert that triggered them is closed.

About this issue

  • State: closed
  • Created 8 years ago
  • Reactions: 1
  • Comments: 27 (15 by maintainers)

Most upvoted comments

@thenayr - You can iterate over the individual alert instances using Go templating - here’s an example that creates …rather verbose output:

text: >-
  {{ range .Alerts }}
      Summary: {{ .Annotations.summary }}
      Description: {{ .Annotations.description }}
      Playbook: {{ .Annotations.playbook }}
      Graph: {{ .GeneratorURL }}
      Details:
      {{ range .Labels.SortedPairs }} - {{ .Name }} = {{ .Value }}
      {{ end }}
  {{ end }}

I’ve left this response here since this issue seems to attract questions about this topic, but further discussion should probably happen on https://groups.google.com/forum/#!forum/prometheus-developers or perhaps on the IRC channel; see https://prometheus.io/community/. =)