opentelemetry-collector-contrib: Memory leak problem with OpenTelemetry Collector
Describe the bug Memory usage of the OpenTelemetry Collector grows continuously over time and is never reclaimed.
Steps to reproduce I wasn’t able to reproduce this locally, but I suspect it was triggered when the collector received a single huge trace with around 20,000 spans over OTLP.
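For anyone trying to reproduce, one way to approximate that load is the telemetrygen tool from this repo, asking for one trace with 20,000 child spans. This is only a sketch; the endpoint below is a hypothetical in-cluster service name, not from my deployment:

# Hypothetical collector endpoint; point it at your OTLP gRPC service.
telemetrygen traces --otlp-insecure \
  --otlp-endpoint otel-collector.opentelemetry:4317 \
  --traces 1 --child-spans 20000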
What did you expect to see? I expected memory usage to rise and fall with load. Instead, memory usage climbs continuously and never comes back down.
What version did you use? opentelemetry-operator:0.37.1 tempo-distributed:1.5.4
What config did you use?
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel
  namespace: opentelemetry
spec:
  config: |
    connectors:
      spanmetrics:
        namespace: span.metrics
    receivers:
      # Data sources: traces, metrics, logs
      otlp:
        protocols:
          grpc:
          http:
    processors:
      memory_limiter:
        check_interval: 1s
        limit_percentage: 75
        spike_limit_percentage: 15
      batch:
        send_batch_size: 10000
        timeout: 10s
      tail_sampling:
        policies:
          - name: drop_noisy_traces_url
            type: string_attribute
            string_attribute:
              key: http.target
              values:
                - \/health
              enabled_regex_matching: true
              invert_match: true
    exporters:
      otlp:
        endpoint: http://tempo-distributor:4317/
        tls:
          insecure: true
      logging:
        loglevel: debug
      prometheus:
        enable_open_metrics: true
        endpoint: 0.0.0.0:8889
        resource_to_telemetry_conversion:
          enabled: true
      loki:
        endpoint: http://loki-gateway.loki/loki/api/v1/push
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch, tail_sampling]
          exporters: [otlp, spanmetrics]
        metrics:
          receivers: [otlp, spanmetrics]
          processors: [memory_limiter, batch]
          exporters: [prometheus]
        logs:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [loki]
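A side note on the memory_limiter settings above: with limit_percentage and spike_limit_percentage, the limit is computed from the total memory available to the process, which inside a container comes from the cgroup memory limit, so the collector container needs a memory limit set for the percentages to mean anything. A minimal sketch of setting one through the operator CR (the 2Gi value is illustrative, not from my actual deployment):

spec:
  resources:
    limits:
      memory: 2Gi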
Environment
OS: AKS (Ubuntu Linux)
Runtime: .NET 6.0 with dotnet auto-instrumentation
@albertteoh yes, it was related to exemplars. The collector pod has not crashed due to memory for 17 hours now. Thanks!
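For anyone landing here with the same symptom: exemplars on the spanmetrics connector are controlled by its exemplars.enabled setting, so disabling them explicitly is one possible stopgap until you can upgrade. A minimal sketch against the config above:

connectors:
  spanmetrics:
    namespace: span.metrics
    exemplars:
      enabled: false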
I am transferring this to contrib, since the current theory is that this is related to the spanmetrics connector.
I noticed you’re using the spanmetrics connector; there was a recent merge of a memory leak fix: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/28847
It was just released today: https://github.com/open-telemetry/opentelemetry-collector-contrib/releases/tag/v0.91.0
It might be worth upgrading the opentelemetry-operator once a release that bundles collector v0.91.0 is available.
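Until then, the collector image can also be overridden directly on the CR. A sketch, assuming the contrib distribution (which the spanmetrics connector and loki exporter require):

spec:
  image: otel/opentelemetry-collector-contrib:0.91.0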