opentelemetry-collector-contrib: [exporter/prometheus] Expired metrics are not deleted
Component(s)
exporter/prometheus
What happened?
Description
Hi, I am using the spanmetrics processor and the prometheus exporter to transform spans into metrics, but I found that some expired metrics seem to reappear whenever new metrics are received, and the memory usage keeps rising. Is this a bug in the prometheus exporter?
Steps to Reproduce
Example: when I posted a span to the collector, the prometheus exporter exported a metric like this:
calls_total{db_instance="N/A",db_name="name-KSDORKdOKV",db_sql_table="table-IWstkE",db_system="redis",operation="get",service_name="go-project.examples",span_kind="SPAN_KIND_CLIENT",status_code="STATUS_CODE_UNSET"} 1
After 5 seconds, the metric disappeared. Then I posted another span to the collector, and the prometheus exporter exported two metrics, including the expired one:
calls_total{db_instance="N/A",db_name="name-KSDORKdOKV",db_sql_table="table-IWstkE",db_system="redis",operation="get",service_name="go-project.examples",span_kind="SPAN_KIND_CLIENT",status_code="STATUS_CODE_UNSET"} 1
calls_total{db_instance="N/A",db_name="name-QSvMJKDYso",db_sql_table="table-ZHdGvF",db_system="redis",operation="set",service_name="go-project.examples",span_kind="SPAN_KIND_CLIENT",status_code="STATUS_CODE_UNSET"} 1
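(The 5-second expiry mentioned above corresponds to the metric_expiration setting of the prometheus exporter; the relevant excerpt of the full configuration further below is:)

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
    metric_expiration: 5s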
Expected Result
Expired metrics are deleted and not exported again.
Actual Result
Expired metrics seem to still be stored in the in-memory cache.
Collector version
2ada50fd4a
Environment information
Environment
OS: macOS 13.0.1
Compiler (if manually compiled): go 1.19.3
OpenTelemetry Collector configuration
receivers:
  # Dummy receiver that's never used, because a pipeline is required to have one.
  otlp/spanmetrics:
    protocols:
      grpc:
        endpoint: "localhost:12345"
  otlp:
    protocols:
      grpc:
        endpoint: "localhost:55677"
processors:
  batch:
  spanmetrics:
    metrics_exporter: otlp/spanmetrics
    latency_histogram_buckets: [10ms, 100ms]
    dimensions:
      - name: db.system
        default: N/A
      - name: db.name
        default: N/A
      - name: db.sql.table
        default: N/A
      - name: db.instance
        default: N/A
    dimensions_cache_size: 1000
    aggregation_temporality: "AGGREGATION_TEMPORALITY_CUMULATIVE"
exporters:
  logging:
    verbosity: basic
  otlp/spanmetrics:
    endpoint: "localhost:55677"
    tls:
      insecure: true
  prometheus:
    endpoint: "0.0.0.0:8889"
    metric_expiration: 5s
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [spanmetrics, batch]
      exporters: [logging]
    # The exporter name must match the metrics_exporter name.
    # The receiver is just a dummy and never used; it is only added to pass validation, which requires at least one receiver in a pipeline.
    metrics/spanmetrics:
      receivers: [otlp/spanmetrics]
      exporters: [otlp/spanmetrics]
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
Log output
No response
Additional context
No response
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Reactions: 1
- Comments: 16 (14 by maintainers)
Hi @mfilipe, the spanmetrics connector will continue to emit metrics at a configurable time interval (default every 15s) even when no new spans are sent to it, so I believe this is what's preventing prometheus.metric_expiration from kicking in. There are some knobs available to influence the number of metrics the spanmetrics connector emits, namely the dimensions_cache_size and buckets config parameters. The former ensures no more than this many metrics will be stored in memory and emitted downstream. The latter controls the resolution of histograms, so a smaller number of buckets should reduce the number of metrics exported.
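For illustration, a rough sketch of where those two knobs sit in a connector-style configuration (the values are examples only, and the histogram/explicit nesting assumes the spanmetrics connector's schema; check the connector README for the exact field names in your collector version):

connectors:
  spanmetrics:
    # Cap on the number of unique metric series kept in memory and emitted downstream;
    # a smaller cache should also limit the memory growth described above.
    dimensions_cache_size: 500
    histogram:
      explicit:
        # Fewer buckets means lower histogram resolution and fewer exported series.
        buckets: [10ms, 100ms, 1s]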