alertmanager: repeat_interval doesn't work

Below is the Alertmanager config file:

global:
  resolve_timeout: 5m
route:
  group_by: ['alertname','cluster','host_area']
  group_wait: 5s
  group_interval: 5m
  repeat_interval: 3h
  receiver: 'sreserver'
  routes:
  - receiver: 'k8s'
    group_by: ['alertname','cluster']
    matchers:
      - cluster =~ ".*-k8s"

  - receiver: 'dbserver'
    group_by: ['alertname','host_area']
    matchers:
      - host_role =~ "dbserver|DBserver"

  - receiver: 'BIserver'
    group_by: ['alertname','host_area']
    matchers:
      - host_role = "BIserver"

  - receiver: 'sreserver'
    group_by: ['alertname','host_area']
    matchers:
      - host_role != "BIserver"
      - host_role !~ "dbserver|DBserver"

receivers:
  - name: 'sreserver'
    webhook_configs:
      - url: "xxxxxxxxxx"

inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'instance']

We use vmalert to evaluate alert rules and send alerts to Alertmanager. For example, below is a rule for memory usage:

- alert: 内存使用率  # "Memory usage"
  expr: instance:node_memory_utilisation:ratio * 100 > 95
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: "Memory usage is too high!"
    description: "Memory usage is above 95% (current usage: {{$value}}%)"

But the question is: why do I receive alerts with this alertname every 5 minutes? Shouldn't it be every 3 hours?

  • Alertmanager version:
alertmanager-0.24.0
  • Prometheus version:
prometheus-2.37.6

About this issue

  • State: closed
  • Created a year ago
  • Comments: 15 (4 by maintainers)

Most upvoted comments

Hi! 👋 You are correct that your repeat_interval is 3h, but it only applies when the alerts in the group haven't changed since the last notification. If the group has changed since then (because new alerts fired or existing ones resolved), a notification is sent at the next group_interval flush, which is every 5m in your config, as you have observed.

If you are confident that the alerts in the group aren't changing, then you might want to run Alertmanager with --log.level=debug just to make sure that you don't have a flapping alert.
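
To make the interaction between group_interval and repeat_interval concrete, here is a minimal Python sketch of the decision described above. It is a simplified model, not Alertmanager's actual code: the names `LastNotification` and `needs_notification` are hypothetical, and the assumption is that the check runs once per group_interval flush for each alert group.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class LastNotification:
    """What was sent for a group the last time a notification went out."""
    sent_at: datetime
    firing: frozenset
    resolved: frozenset


def needs_notification(last: Optional[LastNotification], firing: set, resolved: set,
                       now: datetime, repeat_interval: timedelta,
                       send_resolved: bool = True) -> bool:
    """Return True if a group flush (assumed every group_interval) should notify."""
    if last is None:
        # First notification for this group: send only if something is firing.
        return bool(firing)
    if not firing <= last.firing:
        # An alert is firing that was not in the last notification.
        return True
    if not firing:
        # Everything resolved: notify once if the last notification had firing alerts.
        return bool(last.firing)
    if send_resolved and not resolved <= last.resolved:
        # Some alert resolved since the last notification.
        return True
    # Group unchanged: repeat only after repeat_interval has elapsed.
    return now - last.sent_at > repeat_interval


if __name__ == "__main__":
    start = datetime(2023, 1, 1, 12, 0)
    repeat = timedelta(hours=3)
    flush = start + timedelta(minutes=5)
    last = LastNotification(start, frozenset({"host-a"}), frozenset())
    # Unchanged group at the next flush: held back until repeat_interval.
    print(needs_notification(last, {"host-a"}, set(), flush, repeat))
    # A second host starts firing in the same group: notified at this flush.
    print(needs_notification(last, {"host-a", "host-b"}, set(), flush, repeat))
```

Note that with group_by: ['alertname','host_area'], every host in the same host_area shares one group for this alert, so a single host crossing or dropping below the 95% threshold counts as a change to the whole group.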