prometheus-msteams: level=error msg="Failed to parse json with key 'sections': Key path not found"

Hey,

Having issues with payloads from Prometheus being sent to MS Teams. We’ve recently updated to the latest Prometheus/Alertmanager and version 1.1.4 of prometheus-msteams.

Test alerts work fine, so it’s an error in the formatting coming out of the default prometheus-operator Helm chart install. I haven’t looked much further yet, but I can see a few people have had templating errors. Given 1.1.4 has just been released, I thought I’d raise an issue as well.

The template is the default one, unedited, and this is the error being received in the logs:

time="2019-08-05T07:42:15Z" level=debug msg="Prometheus Alert: {\"receiver\":\"high_priority_receiver\",\"status\":\"firing\",\"alerts\":[{\"status\":\"firing\",\"labels\":{\"alertname\":\"Watchdog\",\"prometheus\":\"monitoring/prometheus-operator-prometheus\",\"severity\":\"none\"},\"annotations\":{\"message\":\"This is an alert meant to ensure that the entire alerting pipeline is functional.\\nThis alert is always firing, therefore it should always be firing in Alertmanager\\nand always fire against a receiver. There are integrations with various notification\\nmechanisms that send a notification when this alert is not firing. For example the\\n\\\"DeadMansSnitch\\\" integration in PagerDuty.\\n\"},\"startsAt\":\"2019-08-05T07:36:45.470372903Z\",\"endsAt\":\"0001-01-01T00:00:00Z\",\"generatorURL\":\"http://prometheus-dashboard.testing/graph?g0.expr=vector%281%29\\u0026g0.tab=1\"}],\"groupLabels\":{},\"commonLabels\":{\"alertname\":\"Watchdog\",\"prometheus\":\"monitoring/prometheus-operator-prometheus\",\"severity\":\"none\"},\"commonAnnotations\":{\"message\":\"This is an alert meant to ensure that the entire alerting pipeline is functional.\\nThis alert is always firing, therefore it should always be firing in Alertmanager\\nand always fire against a receiver. There are integrations with various notification\\nmechanisms that send a notification when this alert is not firing. For example the\\n\\\"DeadMansSnitch\\\" integration in PagerDuty.\\n\"},\"externalURL\":\"http://alertmanager.testing\",\"version\":\"4\",\"groupKey\":\"{}:{}\"}"
time="2019-08-05T07:42:15Z" level=debug msg="Alert rendered in template file: \n{\n  \"@type\": \"MessageCard\",\n  \"@context\": \"http://schema.org/extensions\",\n  \"themeColor\": \"808080\",\n  \"summary\": \"This is an alert meant to ensure that the entire alerting pipeline is functional.\nThis alert is always firing, therefore it should always be firing in Alertmanager\nand always fire against a receiver. There are integrations with various notification\nmechanisms that send a notification when this alert is not firing. For example the\n\"DeadMansSnitch\" integration in PagerDuty.\n\",\n  \"title\": \"Prometheus Alert (firing)\",\n  \"sections\": [ \n    {\n      \"activityTitle\": \"[ aaaa](http://alertmanager.testing)\",\n      \"facts\": [\n        {\n          \"name\": \"message\",\n          \"value\": \"This is an alert meant to ensure that the entire alerting pipeline is functional.\nThis alert is always firing, therefore it should always be firing in Alertmanager\nand always fire against a receiver. There are integrations with various notification\nmechanisms that send a notification when this alert is not firing. For example the\n\"DeadMansSnitch\" integration in PagerDuty.\n\"\n        },\n        {\n          \"name\": \"alertname\",\n          \"value\": \"Watchdog\"\n        },\n        {\n          \"name\": \"prometheus\",\n          \"value\": \"monitoring/prometheus-operator-prometheus\"\n        },\n        {\n          \"name\": \"severity\",\n          \"value\": \"none\"\n        }\n      ],\n      \"markdown\": true\n    }\n  ]\n}\n"
time="2019-08-05T07:42:15Z" level=debug msg="Size of message is 0 Bytes (~0 KB)"
time="2019-08-05T07:42:15Z" level=error msg="Failed to parse json with key 'sections': Key path not found"

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 41 (6 by maintainers)

Most upvoted comments

Sorry for the delay. I needed to check with my workplace before posting. Here’s the revised template we are using. Perhaps it can be added as an alternative to the default template.

Known gotchas:

  • GeneratorURL is not useful
  • assumes that “Annotations” and “Labels” exist (it will fail to render if they don’t)
  • still not enough validation
{{ define "teams.card" }}
{
"@type": "MessageCard",
"@context": "http://schema.org/extensions",
"themeColor": "{{- if eq .Status "resolved" -}}2DC72D
{{- else if eq .Status "firing" -}}
    {{- if eq .CommonLabels.severity "critical" -}}8C1A1A
    {{- else if eq .CommonLabels.severity "warning" -}}FFA500
    {{- else -}}808080{{- end -}}
{{- else -}}808080{{- end -}}",
"summary": "Prometheus Alert ({{ .Status }})",
"title": "Prometheus Alert ({{ .Status }})",
"sections": [ {{$externalUrl := .ExternalURL}}
    {{- range $index, $alert := .Alerts }}{{- if $index }},{{- end }}
    {
    "activityTitle":
    {{- if ne $alert.Annotations.description "" -}}
        "[{{ js $alert.Annotations.description }}]({{ $externalUrl }})",
    {{- else -}}
        "[{{ js $alert.Annotations.message }}]({{ $externalUrl }})",
    {{- end -}}
    "facts": [
        { "name": "Status", "value": "{{ .Status }}" },
        { "name": "StartsAt", "value": "{{ js .StartsAt }}" },
        {{- if and .EndsAt  ( not .EndsAt.IsZero ) }}
        { "name": "EndsAt", "value": "{{ js .EndsAt }}" },
        {{- end}}
        { "name": "ExternalURL", "value": "{{ js $externalUrl }}" },
        { "name": "GeneratorURL", "value": "{{ js .GeneratorURL }}" },

        {{- range $key, $value := $alert.Annotations }}
            {
            "name": "{{ reReplaceAll "_" "\\\\_" $key }}",
            "value": "{{ reReplaceAll "_" "\\\\_" $value | js }}"
            },
        {{- end -}}
        {{$c := counter}}{{ range $key, $value := $alert.Labels }}{{if call $c}},{{ end }}
            {
            "name": "{{ reReplaceAll "_" "\\\\_" $key }}",
            "value": "{{ reReplaceAll "_" "\\\\_" $value | js }}"
            }
        {{- end }}
    ],
    "markdown": true
    }
    {{- end }}
]
}
{{ end }}
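
A note for anyone reading the template above: counter and reReplaceAll are template helpers rather than standard Go text/template functions (reReplaceAll is the Alertmanager-style regex replace). Here is a rough sketch of the counter idea, an assumption about its shape rather than code taken from prometheus-msteams, showing how it keeps the comma off the first Labels entry:

package main

import (
	"os"
	"text/template"
)

func main() {
	funcs := template.FuncMap{
		// Each call to "counter" hands back a fresh closure; the closure's
		// first call returns 0 (falsy in templates), later calls 1, 2, ...
		"counter": func() func() int {
			i := -1
			return func() int { i++; return i }
		},
	}

	const src = `[{{ $c := counter }}{{ range . }}{{ if call $c }},{{ end }}"{{ . }}"{{ end }}]`
	t := template.Must(template.New("commas").Funcs(funcs).Parse(src))
	_ = t.Execute(os.Stdout, []string{"a", "b", "c"}) // prints ["a","b","c"]
}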

If time permits I will prepare proper PRs for this, and possibly for inclusion of the Sprig template function library. Helm uses Sprig to make templating more concise and harder to “break”. I’m afraid it may take a while due to personal time constraints.
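
To give a flavour of what Sprig would add, its default function could cover the “assumes Annotations exist” gotcha listed above. A hypothetical sketch, assuming Sprig’s TxtFuncMap were wired into the card template’s FuncMap (which prometheus-msteams 1.1.4 does not do out of the box):

package main

import (
	"os"
	"text/template"

	"github.com/Masterminds/sprig"
)

func main() {
	// "default" falls back to its first argument when the value is empty or
	// missing, so a card still renders even if an alert has no summary annotation.
	const src = `"summary": "{{ default "Prometheus Alert" .Summary | js }}"`
	t := template.Must(template.New("card").Funcs(sprig.TxtFuncMap()).Parse(src))
	_ = t.Execute(os.Stdout, map[string]string{}) // prints: "summary": "Prometheus Alert"
}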

Many thanks for the prometheus-msteams project!

I’ve opened a PR to modify the default template to handle newlines and single quotes. Please review: https://github.com/bzon/prometheus-msteams/pull/77

@nickadams675 The “greater than” sign in your alert description is triggering this: "description\":\"CPU load is > 80%\\n

I will look for a solution, but until we have one, as a workaround you may want to use "CPU load greater than 80%" for your description.

The js function in the template is replacing > with \\x3E, causing the issue.
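
For context (my own check, not from the project): \xNN escapes are legal in JavaScript strings but not in JSON, so a js-escaped ">" that comes out as \x3E makes the whole card unparsable even though the rest of the escaping is fine:

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// \uXXXX is the only numeric escape JSON allows; \xNN is JavaScript-only.
	fmt.Println(json.Valid([]byte(`{"description": "CPU load is \u003E 80%"}`))) // true
	fmt.Println(json.Valid([]byte(`{"description": "CPU load is \x3E 80%"}`)))   // false
}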

The Go template ‘js’ function seems to produce quoting which works correctly with MS Teams:

Example:

"[{{ js $alert.Annotations.message }}]({{ $externalUrl }})",

I spent a bit more time looking at this today, and it seems it doesn’t like the line breaks.

I modified the message here in the template to be a single line and removed the " as well to make sure it was OK… and it works now, so it looks like it’s not handling multi-line alerts/messages?

I used this website to help out https://jsonformatter.curiousconcept.com/

helm/prometheus-operator/templates/prometheus/rules/general.rules.yaml
@@ -29,17 +29,7 @@ spec:
         severity: warning
     - alert: Watchdog
       annotations:
-        message: 'This is an alert meant to ensure that the entire alerting pipeline is functional.
-
-          This alert is always firing, therefore it should always be firing in Alertmanager
-
-          and always fire against a receiver. There are integrations with various notification
-
-          mechanisms that send a notification when this alert is not firing. For example the
-
-          DeadMansSwitch integration in PagerDuty.
-
-          '
+        message: 'This is an alert meant to ensure that the entire alerting pipeline is functional. This alert is always firing, therefore it should always be firing in Alertmanager and
       expr: vector(1)