opentelemetry-collector: Collector constantly crashing

Describe the bug

After a while, the pod on which the collector runs crashes with this error:

panic: runtime error: slice bounds out of range [-2:]

goroutine 137 [running]:
go.opentelemetry.io/collector/pdata/internal/data/protogen/common/v1.(*AnyValue).MarshalToSizedBuffer(0xc002dfd810, {0xc001e84000, 0x22, 0x30c16})
	go.opentelemetry.io/collector/pdata@v0.62.1/internal/data/protogen/common/v1/common.pb.go:483 +0xd0
go.opentelemetry.io/collector/pdata/internal/data/protogen/common/v1.(*KeyValue).MarshalToSizedBuffer(0xc002dfd800, {0xc001e84000, 0x22, 0x30c16})
	go.opentelemetry.io/collector/pdata@v0.62.1/internal/data/protogen/common/v1/common.pb.go:700 +0x3a
go.opentelemetry.io/collector/pdata/internal/data/protogen/resource/v1.(*Resource).MarshalToSizedBuffer(0xc000753360, {0xc001e84000, 0x30984?, 0x30c16})
	go.opentelemetry.io/collector/pdata@v0.62.1/internal/data/protogen/resource/v1/resource.pb.go:146 +0xf0
go.opentelemetry.io/collector/pdata/internal/data/protogen/trace/v1.(*ResourceSpans).MarshalToSizedBuffer(0xc000753360, {0xc001e84000, 0x30984, 0x30c16})
	go.opentelemetry.io/collector/pdata@v0.62.1/internal/data/protogen/trace/v1/trace.pb.go:890 +0x105
go.opentelemetry.io/collector/pdata/internal/data/protogen/collector/trace/v1.(*ExportTraceServiceRequest).MarshalToSizedBuffer(0xc003408198, {0xc001e84000, 0x30c16, 0x30c16})
	go.opentelemetry.io/collector/pdata@v0.62.1/internal/data/protogen/collector/trace/v1/trace_service.pb.go:351 +0xac
go.opentelemetry.io/collector/pdata/internal/data/protogen/collector/trace/v1.(*ExportTraceServiceRequest).Marshal(0xc0cebc8861798197?)
	go.opentelemetry.io/collector/pdata@v0.62.1/internal/data/protogen/collector/trace/v1/trace_service.pb.go:331 +0x56
go.opentelemetry.io/collector/pdata/ptrace/ptraceotlp.Request.MarshalProto(...)
	go.opentelemetry.io/collector/pdata@v0.62.1/ptrace/ptraceotlp/traces.go:88
go.opentelemetry.io/collector/exporter/otlphttpexporter.(*exporter).pushTraces(0xc0001255f0, {0x7388850, 0xc003291470}, {0xc002756e80?})
	go.opentelemetry.io/collector@v0.62.1/exporter/otlphttpexporter/otlp.go:99 +0x32
go.opentelemetry.io/collector/exporter/exporterhelper.(*tracesRequest).Export(0x279293e?, {0x7388850?, 0xc003291470?})
	go.opentelemetry.io/collector@v0.62.1/exporter/exporterhelper/traces.go:70 +0x34
go.opentelemetry.io/collector/exporter/exporterhelper.(*timeoutSender).send(0xc000d34750, {0x73a7158, 0xc00340ac30})
	go.opentelemetry.io/collector@v0.62.1/exporter/exporterhelper/common.go:203 +0x96
go.opentelemetry.io/collector/exporter/exporterhelper.(*retrySender).send(0xc000125680, {0x73a7158, 0xc00340ac30})
	go.opentelemetry.io/collector@v0.62.1/exporter/exporterhelper/queued_retry.go:388 +0x58d
go.opentelemetry.io/collector/exporter/exporterhelper.(*tracesExporterWithObservability).send(0xc000de0e88, {0x73a7158, 0xc00340ac30})
	go.opentelemetry.io/collector@v0.62.1/exporter/exporterhelper/traces.go:134 +0x88
go.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).start.func1({0x73a7158, 0xc00340ac30})
	go.opentelemetry.io/collector@v0.62.1/exporter/exporterhelper/queued_retry.go:206 +0x39
go.opentelemetry.io/collector/exporter/exporterhelper/internal.(*boundedMemoryQueue).StartConsumers.func1()
	go.opentelemetry.io/collector@v0.62.1/exporter/exporterhelper/internal/bounded_memory_queue.go:61 +0xb6
created by go.opentelemetry.io/collector/exporter/exporterhelper/internal.(*boundedMemoryQueue).StartConsumers
	go.opentelemetry.io/collector@v0.62.1/exporter/exporterhelper/internal/bounded_memory_queue.go:56 +0x45

Steps to reproduce

Use the otel/opentelemetry-collector-contrib:0.62.1 Docker image with the following config:

apiVersion: v1
kind: ConfigMap
metadata:
  name: ecom-opentelemetry-collector
  labels:
    helm.sh/chart: opentelemetry-collector-0.30.0
    app.kubernetes.io/name: ecom-opentelemetry-collector
    app.kubernetes.io/instance: ecom-dev
    app.kubernetes.io/version: "0.59.0"
    app.kubernetes.io/managed-by: Helm
data:
  relay: |
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
      otlp/spanmetrics:
        protocols:
          grpc:
            endpoint: localhost:12345
    processors:
      batch: {}
      spanmetrics:
        metrics_exporter: otlp/spanmetrics
        dimensions_cache_size: 5000
        latency_histogram_buckets:
          - 10ms
          - 100ms
          - 1s
          - 2s
          - 4s
          - 8s
          - 16s
          - 32s
        aggregation_temporality: AGGREGATION_TEMPORALITY_CUMULATIVE
        dimensions:
          - name: http.status_code
          - name: target_xpath
          - name: some_more_stuff
    exporters:
      logging:
        loglevel: debug
      otlphttp:
        endpoint: http://jaeger-collector.jaeger.svc:4318
        tls:
          insecure: true
        sending_queue:
          num_consumers: 4
          queue_size: 100
        retry_on_failure:
          enabled: true
      zipkin:
        endpoint: http://jaeger-collector.jaeger.svc:9411/api/v2/spans
        tls:
          insecure: true
        sending_queue:
          num_consumers: 4
          queue_size: 100
        retry_on_failure:
          enabled: true
      otlp/spanmetrics:
        endpoint: 127.0.0.1:4317
        tls:
          insecure: true
      prometheus:
        endpoint: 0.0.0.0:8889
        namespace: default
    service:
      extensions:
      - health_check
      telemetry:
        logs:
          level: debug
        metrics:
          level: detailed
          address: 0.0.0.0:8888
      pipelines:
        logs:
          receivers:
          - otlp
          processors:
          - batch
          exporters:
          - logging
        traces:
          receivers:
          - otlp
          processors:
          - spanmetrics
          - batch
          exporters:
          - otlphttp
          - logging
        metrics:
          receivers:
          - otlp
          processors:
          - batch
          exporters:
          - logging
          - prometheus
        metrics/spanmetrics:
          receivers:
            - otlp/spanmetrics
          exporters:
            - otlp/spanmetrics
    extensions:
      health_check: {}

Environment

OS: Linux (Docker on Kubernetes)

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 24 (11 by maintainers)

Most upvoted comments

@ambition-consulting @andretong @Edition-X Found the bug; will submit a fix soon. In the meantime, if you want a quick fix, remove the logging exporter from the pipelines or do not configure loglevel: debug.
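
For illustration, here is roughly what that workaround looks like applied to the traces pipeline from the config posted above (a sketch of the suggested mitigation, not the eventual fix):

      pipelines:
        traces:
          receivers:
          - otlp
          processors:
          - spanmetrics
          - batch
          exporters:
          - otlphttp   # 'logging' removed as a temporary workaround (alternatively, keep it but drop 'loglevel: debug')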

Hello everyone! I'm having the same issue with both versions, v0.63.0 and v0.62.1.

In addition, I'm using the attributes span processor.
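
For context, the attributes processor acts on span attributes via a list of actions; a minimal illustrative configuration (the key and value below are placeholders, not taken from my actual setup) looks roughly like this:

    processors:
      attributes:
        actions:
          - key: example.attribute    # placeholder attribute key
            value: example-value      # placeholder value
            action: insert            # add the attribute only if it is not already present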

The only thing I could notice prior to the error is that the collector is trying to process a trace with at least 26 spans when it crashes. I attach the stack trace of the error below:

docker-compose-otel-collector-1  | 	{"kind": "exporter", "data_type": "traces", "name": "logging"}
docker-compose-otel-collector-1  | panic: runtime error: slice bounds out of range [-3:]
docker-compose-otel-collector-1  | 
docker-compose-otel-collector-1  | goroutine 99 [running]:
docker-compose-otel-collector-1  | go.opentelemetry.io/collector/pdata/internal/data/protogen/common/v1.(*KeyValue).MarshalToSizedBuffer(0xc0014e3c00, {0xc001700000, 0x9, 0x2675})
docker-compose-otel-collector-1  | 	go.opentelemetry.io/collector/pdata@v0.62.1/internal/data/protogen/common/v1/common.pb.go:711 +0x21b
docker-compose-otel-collector-1  | go.opentelemetry.io/collector/pdata/internal/data/protogen/resource/v1.(*Resource).MarshalToSizedBuffer(0xc0011dda90, {0xc001700000, 0x264d?, 0x2675})
docker-compose-otel-collector-1  | 	go.opentelemetry.io/collector/pdata@v0.62.1/internal/data/protogen/resource/v1/resource.pb.go:146 +0xf0
docker-compose-otel-collector-1  | go.opentelemetry.io/collector/pdata/internal/data/protogen/trace/v1.(*ResourceSpans).MarshalToSizedBuffer(0xc0011dda90, {0xc001700000, 0x2675, 0x2675})
docker-compose-otel-collector-1  | 	go.opentelemetry.io/collector/pdata@v0.62.1/internal/data/protogen/trace/v1/trace.pb.go:890 +0x105
docker-compose-otel-collector-1  | go.opentelemetry.io/collector/pdata/internal/data/protogen/collector/trace/v1.(*ExportTraceServiceRequest).MarshalToSizedBuffer(0xc00151ec90, {0xc001700000, 0x2675, 0x2675})
docker-compose-otel-collector-1  | 	go.opentelemetry.io/collector/pdata@v0.62.1/internal/data/protogen/collector/trace/v1/trace_service.pb.go:351 +0xac
docker-compose-otel-collector-1  | go.opentelemetry.io/collector/pdata/internal/data/protogen/collector/trace/v1.(*ExportTraceServiceRequest).Marshal(0xc00151ec90?)
docker-compose-otel-collector-1  | 	go.opentelemetry.io/collector/pdata@v0.62.1/internal/data/protogen/collector/trace/v1/trace_service.pb.go:331 +0x56
docker-compose-otel-collector-1  | google.golang.org/protobuf/internal/impl.legacyMarshal({{}, {0x73ba118, 0xc00166e7f0}, {0x0, 0x0, 0x0}, 0x0})
docker-compose-otel-collector-1  | 	google.golang.org/protobuf@v1.28.1/internal/impl/legacy_message.go:402 +0xa2
docker-compose-otel-collector-1  | google.golang.org/protobuf/proto.MarshalOptions.size({{}, 0x90?, 0xec?, 0x51?}, {0x73ba118, 0xc00166e7f0})
docker-compose-otel-collector-1  | 	google.golang.org/protobuf@v1.28.1/proto/size.go:43 +0xa6
docker-compose-otel-collector-1  | google.golang.org/protobuf/proto.MarshalOptions.Size({{}, 0xc0?, 0x5a?, 0x40?}, {0x733b780?, 0xc00166e7f0?})
docker-compose-otel-collector-1  | 	google.golang.org/protobuf@v1.28.1/proto/size.go:26 +0x54
docker-compose-otel-collector-1  | google.golang.org/protobuf/proto.Size(...)
docker-compose-otel-collector-1  | 	google.golang.org/protobuf@v1.28.1/proto/size.go:16
docker-compose-otel-collector-1  | github.com/golang/protobuf/proto.Size({0x7f387be10158?, 0xc00151ec90?})
docker-compose-otel-collector-1  | 	github.com/golang/protobuf@v1.5.2/proto/wire.go:18 +0x45
docker-compose-otel-collector-1  | go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc.messageType.Event({{0x6716adf, 0xc}, {0x4, 0x0, {0x66f7565, 0x4}, {0x0, 0x0}}}, {0x7388850, 0xc0020cf6b0}, ...)
docker-compose-otel-collector-1  | 	go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc@v0.36.1/interceptor.go:50 +0x165
docker-compose-otel-collector-1  | go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc.UnaryClientInterceptor.func1({0x7388850, 0xc0020cf5c0}, {0x686ead8, 0x3b}, {0x6405ac0, 0xc00151ec90}, {0x6405c00, 0xc00012fe68}, 0xc0008fe000, 0x69a8358, ...)
docker-compose-otel-collector-1  | 	go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc@v0.36.1/interceptor.go:106 +0x6aa
docker-compose-otel-collector-1  | google.golang.org/grpc.(*ClientConn).Invoke(0xc0008fe000?, {0x7388850?, 0xc0020cf5c0?}, {0x686ead8?, 0x3b?}, {0x6405ac0?, 0xc00151ec90?}, {0x6405c00?, 0xc00012fe68?}, {0xc0011e7150, ...})
docker-compose-otel-collector-1  | 	google.golang.org/grpc@v1.50.0/call.go:35 +0x223
docker-compose-otel-collector-1  | go.opentelemetry.io/collector/pdata/internal/data/protogen/collector/trace/v1.(*traceServiceClient).Export(0xc0013d0248, {0x7388850, 0xc0020cf5c0}, 0xc0011e5a70?, {0xc0011e7150, 0x1, 0x1})
docker-compose-otel-collector-1  | 	go.opentelemetry.io/collector/pdata@v0.62.1/internal/data/protogen/collector/trace/v1/trace_service.pb.go:271 +0xc9
docker-compose-otel-collector-1  | go.opentelemetry.io/collector/pdata/ptrace/ptraceotlp.(*tracesClient).Export(0x49cc20?, {0x7388850?, 0xc0020cf5c0?}, {0xc0020cf590?}, {0xc0011e7150?, 0x0?, 0x0?})
docker-compose-otel-collector-1  | 	go.opentelemetry.io/collector/pdata@v0.62.1/ptrace/ptraceotlp/traces.go:140 +0x30
docker-compose-otel-collector-1  | go.opentelemetry.io/collector/exporter/otlpexporter.(*exporter).pushTraces(0xc00116f860, {0x7388818?, 0xc0011ce9c0?}, {0x7388850?})
docker-compose-otel-collector-1  | 	go.opentelemetry.io/collector@v0.62.1/exporter/otlpexporter/otlp.go:105 +0x69
docker-compose-otel-collector-1  | go.opentelemetry.io/collector/exporter/exporterhelper.(*tracesRequest).Export(0x7388850?, {0x7388818?, 0xc0011ce9c0?})
docker-compose-otel-collector-1  | 	go.opentelemetry.io/collector@v0.62.1/exporter/exporterhelper/traces.go:70 +0x34
docker-compose-otel-collector-1  | go.opentelemetry.io/collector/exporter/exporterhelper.(*timeoutSender).send(0xc00119d4a0, {0x73a7158, 0xc0020cf560})
docker-compose-otel-collector-1  | 	go.opentelemetry.io/collector@v0.62.1/exporter/exporterhelper/common.go:203 +0x96
docker-compose-otel-collector-1  | go.opentelemetry.io/collector/exporter/exporterhelper.(*retrySender).send(0xc000bf27e0, {0x73a7158, 0xc0020cf560})
docker-compose-otel-collector-1  | 	go.opentelemetry.io/collector@v0.62.1/exporter/exporterhelper/queued_retry.go:388 +0x58d
docker-compose-otel-collector-1  | go.opentelemetry.io/collector/exporter/exporterhelper.(*tracesExporterWithObservability).send(0xc0011e2420, {0x73a7158, 0xc0020cf560})
docker-compose-otel-collector-1  | 	go.opentelemetry.io/collector@v0.62.1/exporter/exporterhelper/traces.go:134 +0x88
docker-compose-otel-collector-1  | go.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).start.func1({0x73a7158, 0xc0020cf560})
docker-compose-otel-collector-1  | 	go.opentelemetry.io/collector@v0.62.1/exporter/exporterhelper/queued_retry.go:206 +0x39
docker-compose-otel-collector-1  | go.opentelemetry.io/collector/exporter/exporterhelper/internal.(*boundedMemoryQueue).StartConsumers.func1()
docker-compose-otel-collector-1  | 	go.opentelemetry.io/collector@v0.62.1/exporter/exporterhelper/internal/bounded_memory_queue.go:61 +0xb6
docker-compose-otel-collector-1  | created by go.opentelemetry.io/collector/exporter/exporterhelper/internal.(*boundedMemoryQueue).StartConsumers
docker-compose-otel-collector-1  | 	go.opentelemetry.io/collector@v0.62.1/exporter/exporterhelper/internal/bounded_memory_queue.go:56 +0x45
docker-compose-otel-collector-1 exited with code 2

@ambition-consulting still investigating. Is this the first version in which you saw this? Have you run v0.61 successfully without any errors?

Updated: Do you see this with v0.63.0 as well, or just with v0.62.1?

Yes, with both - those are also the only versions I have tested.

Also, can you run the collector with the following pipeline configuration to isolate the problem:

      pipelines:
        logs:
          receivers:
          - otlp
          processors:
          - batch
          exporters:
          - logging
        traces:
          receivers:
          - otlp
          processors:
          - batch
          exporters:
          - otlphttp
        traces/spanmetrics:
          receivers:
          - otlp
          processors:
          - spanmetrics
          exporters:
          - logging
        metrics:
          receivers:
          - otlp
          processors:
          - batch
          exporters:
          - logging
          - prometheus
        metrics/spanmetrics:
          receivers:
            - otlp/spanmetrics
          exporters:
            - otlp/spanmetrics