fluent-bit: Fluent-Bit Crashes when using opentelemetry logs output (version 2.0.6)

Bug Report

Fluent bit crashes when using the opentelemetry output plugin, but only with logs. If the dummy input is omitted, it will not crash

fluent-bit-fluent-bit-1     | Fluent Bit v2.0.6
fluent-bit-fluent-bit-1     | * Copyright (C) 2015-2022 The Fluent Bit Authors
fluent-bit-fluent-bit-1     | * Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
fluent-bit-fluent-bit-1     | * https://fluentbit.io
fluent-bit-fluent-bit-1     | 
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] Configuration:
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info]  flush time     | 1.000000 seconds
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info]  grace          | 5 seconds
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info]  daemon         | 0
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] ___________
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info]  inputs:
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info]      node_exporter_metrics
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info]      dummy
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] ___________
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info]  filters:
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] ___________
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info]  outputs:
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info]      opentelemetry.0
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] ___________
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info]  collectors:
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] [fluent bit] version=2.0.6, commit=211f841fe2, pid=1
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] [storage] created root path /fluent-bit/run/buffer
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] [storage] ver=1.3.0, type=memory+filesystem, sync=normal, checksum=off, max_chunks_up=128
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] [storage] backlog input plugin: storage_backlog.2
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] [cmetrics] version=0.5.7
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] [ctraces ] version=0.2.5
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] initializing
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] storage_strategy='memory' (memory only)
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [debug] [node_exporter_metrics:node_exporter_metrics.0] created event channels: read=21 write=22
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] path.procfs = /proc
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] path.sysfs  = /sys
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] [input:dummy:dummy.1] initializing
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] [input:dummy:dummy.1] storage_strategy='memory' (memory only)
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [debug] [dummy:dummy.1] created event channels: read=23 write=24
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ warn] [input:dummy:dummy.1] data is incomplete. Use default string.
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] [input:storage_backlog:storage_backlog.2] initializing
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] [input:storage_backlog:storage_backlog.2] storage_strategy='memory' (memory only)
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [debug] [storage_backlog:storage_backlog.2] created event channels: read=25 write=26
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [ info] [input:storage_backlog:storage_backlog.2] queue memory limit: 95.4M
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [debug] [opentelemetry:opentelemetry.0] created event channels: read=27 write=28
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [debug] [router] match rule node_exporter_metrics.0:opentelemetry.0
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [debug] [router] match rule dummy.1:opentelemetry.0
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:58] [debug] [router] match rule storage_backlog.2:opentelemetry.0
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:59] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:59] [ info] [sp] stream processor started
fluent-bit-fluent-bit-1     | [2022/12/06 22:05:59] [debug] [input chunk] update output instances with new chunk size diff=26
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [task] created task=0x7fad3003e420 id=0 OK
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: ram0
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: ram1
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: ram2
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: ram3
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: ram4
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: ram5
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: ram6
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: ram7
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: ram8
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: ram9
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: ram10
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: ram11
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: ram12
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: ram13
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: ram14
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: ram15
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: loop0
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: loop1
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: loop2
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: loop3
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: loop4
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: loop5
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: loop6
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: loop7
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] skip device: vda1
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input chunk] update output instances with new chunk size diff=48502
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [debug] [input chunk] update output instances with new chunk size diff=26
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [engine] caught signal (SIGSEGV)
fluent-bit-fluent-bit-1     | [2022/12/06 22:06:00] [engine] caught signal (SIGSEGV)
fluent-bit-fluent-bit-1     | #0  0x7fad3710c5bd      in  ???() at ???:0
fluent-bit-fluent-bit-1     | #1  0x7fad3710eed0      in  ???() at ???:0
fluent-bit-fluent-bit-1     | #2  0x561d83f05f7b      in  flb_calloc() at include/fluent-bit/flb_mem.h:88
fluent-bit-fluent-bit-1     | #3  0x561d83f0ab9e      in  flush_to_otel() at plugins/out_opentelemetry/opentelemetry.c:668
fluent-bit-fluent-bit-1     | #4  0x561d83f0aed9      in  process_logs() at plugins/out_opentelemetry/opentelemetry.c:762
fluent-bit-fluent-bit-1     | #5  0x561d83f0bb4c      in  cb_opentelemetry_flush() at plugins/out_opentelemetry/opentelemetry.c:992
fluent-bit-fluent-bit-1     | #6  0x561d83b2b4ca      in  output_pre_cb_flush() at include/fluent-bit/flb_output.h:524
fluent-bit-fluent-bit-1     | #7  0x561d84564b06      in  co_init() at lib/monkey/deps/flb_libco/amd64.c:117
fluent-bit-fluent-bit-1     | #8  0xffffffffffffffff  in  ???() at ???:0
fluent-bit-fluent-bit-1 exited with code 139

Steps to reproduce

run docker-compose up with the following files

docker-compose.yml

version: "3.7"

services:
  fluent-bit:
    image: fluent/fluent-bit:2.0.6
    volumes:
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
    depends_on:
      - opentelemetry

  opentelemetry:
    image: otel/opentelemetry-collector-contrib
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "1889:1888"   # pprof extension
      - "8893:8888"   # Prometheus metrics exposed by the collector
      - "8894:8889"   # Prometheus exporter metrics
      - "13134:13133" # health_check extension
      - "4322:4317"   # OTLP gRPC receiver
      - "4323:4318"   # OTLP http receiver
      - "55680:55679" # zpages extension

fluent-bit.conf

[SERVICE]
    Parsers_File    /fluent-bit/etc/parsers.conf
    HTTP_Server     On
    HTTP_Listen     0.0.0.0
    HTTP_Port       2020
    Health_Check    On
    storage.path    /fluent-bit/run/buffer
    storage.metrics on
    Log_Level       debug


[INPUT]
    Name                 node_exporter_metrics
    Tag                  node_metrics
    Scrape_interval      2


[INPUT]
    Name dummy
    Tag dummy.log
    Rate 1
    Dummy {"key": ["val1", 1, 2.3], "key2": "val2"}


[OUTPUT]
    Name                 opentelemetry
    Match                *
    Host                 opentelemetry
    Port                 4318
    Metrics_uri          /v1/metrics
    Logs_uri             /v1/logs
    Traces_uri           /v1/traces
    Log_response_payload True
    Tls                  Off
    Tls.verify           Off

otel-collector-config.yaml

receivers:
  otlp:
    protocols:
      grpc:
        include_metadata: true
      http:
        include_metadata: true

processors:
  batch:
  resource:
    attributes:
      - key: host.name
        value: "fluent bit"
        action: upsert

exporters:
  logging:
    verbosity: detailed
    sampling_initial: 5
    sampling_thereafter: 200

service:
  telemetry:
    logs:
      level: "debug"
      encoding: json
    metrics:
      level: detailed

  pipelines:
    metrics/multi:
      receivers: [otlp]
      processors: [resource, batch]
      exporters: [logging]
    logs:
      receivers: [otlp]
      processors: [resource, batch]
      exporters: [logging]

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 15 (7 by maintainers)

Most upvoted comments

Taking a look

Using the helm chart @Syn3rman with 2.0.8 and the following config

  inputs: |
    [INPUT]
        Name tail
        Path /var/log/containers/*.log
        multiline.parser docker, cri
        Tag kube.*
        Mem_Buf_Limit 5MB
        Skip_Long_Lines On

    [INPUT]
        Name systemd
        Tag host.*
        Systemd_Filter _SYSTEMD_UNIT=kubelet.service
        Read_From_Tail On

  ## https://docs.fluentbit.io/manual/pipeline/filters
  filters: |
    [FILTER]
        Name kubernetes
        Match kube.*
        Merge_Log On
        Keep_Log Off
        K8S-Logging.Parser On
        K8S-Logging.Exclude On

  ## https://docs.fluentbit.io/manual/pipeline/outputs
  outputs: |
    [OUTPUT]
        Name opentelemetry
        Match *
        Host pmm-opentelemetry-fluent-collector-gateway
        Port 4318
        Logs_uri /v1/logs
        Log_response_payload True
        Tls Off
        Tls.verify Off
        batch_size 64

this causes crashes

lski-experiments_prometheus-adapter-ae1d64826f919bedbcbd8309e1e1a21c767ddd5e55559a2a3b4f78aa165f1e54.log
[2022/12/27 15:48:18] [ info] [input:tail:tail.0] inotify_fs_add(): inode=77783751 watch_fd=38 name=/var/log/containers/c-canary-os-78-server-0_ns-canary-os-78_c-canary-os-78-server-1d7f16e01400e600bee00c1df25a632c84f5b44f9cc8d5259421ad386e29aa89.log
[2022/12/27 15:48:18] [ info] [input:tail:tail.0] inotify_fs_add(): inode=201373101 watch_fd=39 name=/var/log/containers/c-salmon-sq-90-server-0_ns-salmon-sq-90_c-salmon-sq-90-server-09da943606a9168ce3faf78de03abcdb128b714e68f278112ec2e21ab80bd5ab.log
[2022/12/27 15:48:18] [ info] [input:tail:tail.0] inotify_fs_add(): inode=166880940 watch_fd=40 name=/var/log/containers/falcosidekick-ui-84f99ccfbc-6wp8c_falco_falcosidekick-ui-552e6fb1b050a39dacce80b6e16bb206b785e99758b5a23d86de5f0268b8784f.log
[2022/12/27 15:48:18] [ info] [input:tail:tail.0] inotify_fs_add(): inode=127783699 watch_fd=41 name=/var/log/containers/c-ivory-es-35-server-0_ns-ivory-es-35_c-ivory-es-35-server-60bbc57b187b29b7fad824c4cd3c1a5169913a455e303e3c3c2827fe0a2ce50c.log
[2022/12/27 15:48:18] [ info] [input:tail:tail.0] inotify_fs_add(): inode=70322923 watch_fd=42 name=/var/log/containers/falco-9q8nj_falco_falco-6f19e9dd900ccff3b942558ed9be537ab0cf67af07234de488aed0181d40bfab.log
[2022/12/27 15:48:18] [ info] [input:tail:tail.0] inotify_fs_add(): inode=83920747 watch_fd=43 name=/var/log/containers/falcosidekick-59c48d4dd6-cgtv2_falco_falcosidekick-76e4435098eb210eadd1a039386873d772822397372f665dd0f7a57d4bf307b7.log
[2022/12/27 15:48:18] [ info] [input:tail:tail.0] inotify_fs_add(): inode=61875940 watch_fd=44 name=/var/log/containers/coredns-57f98d8bd7-27wfm_kube-system_coredns-0e57b3cf6b6ce993a6dc684845094b1ac15d66f871eccd52f6694fcc3512f343.log
[2022/12/27 15:48:18] [ info] [input:tail:tail.0] inotify_fs_add(): inode=200933123 watch_fd=45 name=/var/log/containers/fluent-bit-rgrqm_open-telemetry_fluent-bit-6f11f3cd32c6233c76ccac8d28c8783fa5d1e8231cdd22d3621b56b913ac1af2.log
[2022/12/27 15:48:18] [engine] caught signal (SIGSEGV)
#0  0x562f598f3c94      in  protobuf_c_message_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:713
#1  0x562f598f36b0      in  required_field_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:455
#2  0x562f598f3951      in  unlabeled_field_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:593
#3  0x562f598f3dc6      in  protobuf_c_message_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:739
#4  0x562f598f3bfd      in  repeated_field_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:675
#5  0x562f598f3de6      in  protobuf_c_message_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:744
#6  0x562f598f36b0      in  required_field_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:455
#7  0x562f598f376f      in  oneof_field_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:492
#8  0x562f598f3d77      in  protobuf_c_message_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:727
#9  0x562f598f36b0      in  required_field_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:455
#10 0x562f598f3951      in  unlabeled_field_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:593
#11 0x562f598f3dc6      in  protobuf_c_message_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:739
#12 0x562f598f3bfd      in  repeated_field_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:675
#13 0x562f598f3de6      in  protobuf_c_message_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:744
#14 0x562f598f36b0      in  required_field_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:455
#15 0x562f598f376f      in  oneof_field_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:492
#16 0x562f598f3d77      in  protobuf_c_message_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:727
#17 0x562f598f36b0      in  required_field_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:455
#18 0x562f598f3951      in  unlabeled_field_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:593
#19 0x562f598f3dc6      in  protobuf_c_message_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:739
#20 0x562f598f3bfd      in  repeated_field_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:675
#21 0x562f598f3de6      in  protobuf_c_message_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:744
#22 0x562f598f3bfd      in  repeated_field_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:675
#23 0x562f598f3de6      in  protobuf_c_message_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:744
#24 0x562f598f3bfd      in  repeated_field_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:675
#25 0x562f598f3de6      in  protobuf_c_message_get_packed_size() at lib/fluent-otel-proto/proto_c/protobuf-c/protobuf-c.c:744
#26 0x562f598fb9b6      in  opentelemetry__proto__collector__logs__v1__export_logs_service_request__get_packed_size() at lib/fluent-otel-proto/proto_c/opentelemetry/proto/collector/logs/v1/logs_service.pb-c.c:20
#27 0x562f59737c88      in  flush_to_otel() at plugins/out_opentelemetry/opentelemetry.c:671
#28 0x562f597380b9      in  process_logs() at plugins/out_opentelemetry/opentelemetry.c:794
#29 0x562f59738df1      in  cb_opentelemetry_flush() at plugins/out_opentelemetry/opentelemetry.c:1029
#30 0x562f5934f4a0      in  output_pre_cb_flush() at include/fluent-bit/flb_output.h:527
#31 0x562f59d92266      in  co_init() at lib/monkey/deps/flb_libco/amd64.c:117
#32 0xffffffffffffffff  in  ???() at ???:0

It is hard coded but I will open a PR to move it to config param