prometheus: ErrInvalidSample is returned on duplicate labels even when the value is also duplicated

The duplicated labels are clearly a bug in DC/OS / Mesos, but the stricter validation is a regression for us: when a duplicated label also carries an identical value, ignoring the duplicate is the correct thing to do.

What did you do? Upgraded from Prometheus v2.14.0 to v2.16.0.

What did you expect to see? Successful parsing of DC/OS metrics on <IP>:61091/metrics.

What did you see instead? Under which circumstances? Prometheus now returns ErrInvalidSample ‘label name “container_id” is not unique’, marks the target as down and fails to record metrics.

Environment: DC/OS OSS 1.11.6

  • System information: Linux 4.19.95-coreos x86_64

  • Prometheus version:

Before upgrade (working):
prometheus, version 2.14.0 (branch: HEAD, revision: edeb7a44cbf745f1d8be4ea6f215e79e651bfe19)
  build user: root@df2327081015
  build date: 20191111-14:27:12
  go version: go1.13.4

After upgrade (duplicate label errors):
prometheus, version 2.16.0 (branch: HEAD, revision: b90be6f32a33c03163d700e1452b54454ddce0ec)
  build user: root@7ea0ae865f12
  build date: 20200213-23:50:02
  go version: go1.13.8

  • Alertmanager version: N/A

  • Prometheus configuration file:

...
- job_name: master-metrics
  honor_timestamps: true
  scrape_interval: 1m
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  dns_sd_configs:
  - names:
    - master.mesos
    refresh_interval: 30s
    type: A
    port: 61091
...
  • Alertmanager configuration file: N/A

  • Logs:

level=warn ts=2020-02-20T22:52:46.911Z caller=scrape.go:972 component="scrape manager" scrape_pool=agent-metrics target=http://10.30.32.205:61091/metrics msg="append failed" err="label name \"container_id\" is not unique: invalid sample"
  • Metric: the container_id label is duplicated (as are executor_id and framework_id), but each duplicate carries the same value (violent agreement).
# HELP net_rx_packets DC/OS Metrics Datapoint
# TYPE net_rx_packets gauge
net_rx_packets{cluster_id="33bced63-a344-4664-acac-c8f043f91da6",container_id="256aaa06-38d2-4164-b7ec-8619b628731f",container_id="256aaa06-38d2-4164-b7ec-8619b628731f",dcos_package_is_framework="false",dcos_package_name="cadvisor",dcos_package_version="0.3.0-0.27.2",dcos_service_name="cadvisor",executor_id="cadvisor.1f72bcf3-21e2-11ea-b36c-82d64e6d50ed",executor_id="cadvisor.1f72bcf3-21e2-11ea-b36c-82d64e6d50ed",executor_name="Command Executor (Task: cadvisor.1f72bcf3-21e2-11ea-b36c-82d64e6d50ed) (Command: sh -c '/usr/bin/cad...')",framework_id="0a9b8664-f7f6-41a4-93b1-01493ff62a49-0000",framework_id="0a9b8664-f7f6-41a4-93b1-01493ff62a49-0000",framework_name="marathon",framework_principal="dcos_marathon",framework_role="slave_public",hostname="10.30.35.218",mesos_id="1d24d8d1-52b7-4694-812c-d20a953180f9-S17",source="cadvisor.1f72bcf3-21e2-11ea-b36c-82d64e6d50ed",task_id="cadvisor.1f72bcf3-21e2-11ea-b36c-82d64e6d50ed",task_name="cadvisor"} 0
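
To make the requested behaviour concrete, here is a minimal sketch in Go of the lenient handling described above: exact duplicates (same name, same value) are collapsed, while the same name with two different values is still rejected. This is not Prometheus's actual code; the `Label` type and the `dedupeLabels` function are hypothetical names invented for this illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Label is a hypothetical name/value pair used only for this sketch;
// it is not the labels.Label type from the Prometheus code base.
type Label struct {
	Name, Value string
}

// dedupeLabels (hypothetical helper) drops labels that repeat with an
// identical value and returns an error when one name appears with two
// different values. The result is sorted by label name.
func dedupeLabels(in []Label) ([]Label, error) {
	seen := make(map[string]string, len(in))
	out := make([]Label, 0, len(in))
	for _, l := range in {
		if prev, ok := seen[l.Name]; ok {
			if prev != l.Value {
				// Genuine conflict: there is no safe way to pick a value.
				return nil, fmt.Errorf("label name %q is not unique: values %q and %q differ", l.Name, prev, l.Value)
			}
			continue // exact duplicate ("violent agreement"): keep the first copy only
		}
		seen[l.Name] = l.Value
		out = append(out, l)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out, nil
}

func main() {
	// A shortened version of the label set from the sample above.
	sample := []Label{
		{"container_id", "256aaa06-38d2-4164-b7ec-8619b628731f"},
		{"container_id", "256aaa06-38d2-4164-b7ec-8619b628731f"},
		{"task_name", "cadvisor"},
	}
	labels, err := dedupeLabels(sample)
	fmt.Println(labels, err)
}
```

With this approach the sample above would be accepted with a single container_id label, whereas a scrape exposing two different container_id values would still fail.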

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 16 (12 by maintainers)

Most upvoted comments

You are right.

A better workaround:

    metric_relabel_configs:
      - action: labelmap

NOTE: This is a workaround for John only. I do not recommend it to others who hit the ‘label name "" is not unique: invalid sample’ error. Instead, you should fix the problematic exporter.