fluent-bit: Fluent Bit OpenTelemetry output plugin doesn't format fields properly

Issue: When using the OpenTelemetry output plugin to send logs to an OpenTelemetry endpoint, the output JSON payload fields are not formatted according to the OpenTelemetry specification. As a result, the OpenTelemetry endpoint is unable to process the request from Fluent Bit correctly.

Fluent Bit logs showing records being sent to the OpenTelemetry endpoint:

[36] ebiz: [[1704434606.150000000, {}], {"message"=>"This is a test log message for abc application", "loglevel"=>"INFO", "service"=>"helloworld", "clientIp"=>"111.222.888", "timestamp"=>"2024-01-08 18:37:08.150", "testtag"=>"fluentbit", "trace_id"=>"7ada6c95a1bd243fa9013cab515173a9", "span_id"=>"9c1544cc4f7ff369"}]
[2024/01/08 18:37:10] [debug] [upstream] proxy returned 200
[2024/01/08 18:37:10] [debug] [http_client] flb_http_client_proxy_connect connection #32 connected to myproxy.com:8080.
[2024/01/08 18:37:10] [debug] [upstream] proxy returned 200
[2024/01/08 18:37:10] [debug] [http_client] flb_http_client_proxy_connect connection #31 connected to myproxy.com:8080.
[2024/01/08 18:37:10] [debug] [upstream] KA connection #32 to myproxy.com:8080 is connected
[2024/01/08 18:37:10] [debug] [http_client] not using http_proxy for header
[2024/01/08 18:37:10] [debug] [upstream] KA connection #31 to myproxy.com:8080 is connected
[2024/01/08 18:37:10] [debug] [http_client] not using http_proxy for header
[2024/01/08 18:37:10] [ info] [output:opentelemetry:opentelemetry.1] ingest.privateotel.com:443, HTTP status=200

My fluentbit configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentbit
  namespace: otel
data:
  custom_parsers.conf: |
        
    [MULTILINE_PARSER]
        Name          multiline-rules
        Type          regex
        Flush_timeout 2000
        rule      "start_state"   "/(\d{4}-\d{1,2}-\d{1,2})(.*)/"  "cont"
        rule      "cont"          "/^\D(.*)/"                     "cont"

    [PARSER]
        Name named-captures
        Format regex
        Regex /(?<timestamp>[^ ]* .*):(?<loglevel>DEBUG|ERROR|INFO)([\s\s]*)-\|(?<id>[\w\-]*)\|(?<clientIp>[0-9\.]*)\|(?<trace_id>[0-9A-Za-z]*)\|(?<span_id>[0-9A-Za-z]*)\|(?<message>.*)/m
        Time_key timestamp
        Time_Format %Y-%m-%d %H:%M:%S.%L
        Time_Offset -0600
        Time_Keep On

  fluent-bit.conf: |
    [SERVICE]
        Daemon Off
        Flush 1
        Log_Level debug
        Parsers_File /fluent-bit/etc/parsers.conf
        Parsers_File /fluent-bit/etc/custom_parsers.conf
        HTTP_Server On
        HTTP_Listen 0.0.0.0
        HTTP_Port 2020
        Health_Check On

    [INPUT]
        Name tail
        Log_Level error
        multiline.parser multiline-rules
        Path /app/logs/*.log
        Tag logs

    [FILTER]
        Name             parser
        Match            *
        key_name         log
        parser           named-captures

    [FILTER]
        Name             modify
        Match            *
        Add service ${SERVICE_NAME}
        
    [FILTER]
        Name             modify
        Match            *
        Add testtag fluentbit

    [OUTPUT]
        Name stdout
        Log_Level trace
        Match *

    [OUTPUT]
        Name opentelemetry
        Match *
        Log_Level trace
        Host ingest.privateotel.com
        Port 443
        Header token ***************
        Log_response_payload True
        Tls                  On
        Tls.verify           Off
        add_label            app local

My opentelemetry endpoint receives the request formatted as such:


 {
    "body": {
      "clientIp": "111.222.888",
      "loglevel": "INFO",
      "message": "This is a test log message for abc application",
      "service": "helloworld",
      "span_id": "9c1544cc4f7ff369",
      "testtag": "fluentbit",
      "timestamp": "2024-01-08 18:37:08.150",
      "trace_id": "7ada6c95a1bd243fa9013cab515173a9"
    },
    "instrumentation.name": "",
    "instrumentation.version": "",
    "observed_timestamp": 0,
    "severity_text": "",
    "span_id": "",
    "trace_id": ""
  }

As you can see above, every key/value pair gets nested under body. The body should simply contain the log message, and all the other fields I choose to send should be promoted to their own fields outside of body. The proper format would look something like what I have below.

Expected behavior:

 {
    "body": {
      "message": "This is a test log message for abc application"
    },
    "clientIp": "111.222.888",
    "loglevel": "INFO",
    "service": "helloworld",
    "instrumentation.name": "",
    "instrumentation.version": "",
    "observed_timestamp": 0,
    "testtag": "fluentbit",
    "severity_text": "",
    "span_id": "9c1544cc4f7ff369",
    "timestamp": "2024-01-08 18:37:08.150",
    "trace_id": "7ada6c95a1bd243fa9013cab515173a9"
  }
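The expected restructuring can be sketched in a few lines of Python (illustrative only, not the plugin's actual code, and the field names simply mirror the example above): keep only the log message under body and promote every other parsed field to the top level of the record.

```python
# Sketch: keep only the designated body key under "body" and promote
# every other parsed field to the top level of the record.

def expected_shape(event, body_key="message"):
    record = {"body": {body_key: event[body_key]}}
    for key, value in event.items():
        if key != body_key:
            record[key] = value  # e.g. trace_id, span_id, clientIp
    return record

event = {
    "message": "This is a test log message for abc application",
    "clientIp": "111.222.888",
    "span_id": "9c1544cc4f7ff369",
    "trace_id": "7ada6c95a1bd243fa9013cab515173a9",
}
print(expected_shape(event))
```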

To Reproduce: Use a configuration similar to mine. Generate a log line, extract some fields from it (including trace_id, span_id, timestamp, body, etc.), then use the opentelemetry output and point it at an OpenTelemetry endpoint.

Your Environment: I am running in a Kubernetes cluster using a DaemonSet and the ConfigMap above. The version of Fluent Bit I am using is 2.1.10; the image is fluent/fluent-bit:2.1.10-debug.

Additional context: OpenTelemetry is unable to correctly process fields such as trace_id and span_id since they are not in the proper schema. For more information on the OpenTelemetry logging data model and schema, please see below:

Otel logs data model docs

Otel logs data model github

Additional resource

Please fix the issue or if there is something I am doing incorrectly in my fluentbit config then please advise. Thank you.

About this issue

  • Original URL
  • State: closed
  • Created 6 months ago
  • Reactions: 4
  • Comments: 37 (34 by maintainers)

Most upvoted comments

@edsiper @sudomateo

so my understanding is similar to @sudomateo's. There are predefined fields (such as span_id, trace_id, SeverityText, etc.) that should be at the SAME level as body (not within body). Body should just be the body of the log record, i.e. the message. Body is of type "any", so it can be a simple string or a map, but in either case it should just contain the log message.

Attributes is a map for adding "custom" fields that are not defined in the data model. Resource is also a map and would contain info about the source of the log; however, the Fluent Bit OTel output plugin does not yet seem to support adding a "resource" field.

The best document I've found on the log data model is here: logs-data-model. All frontend applications I have sent logs to (using Fluent Bit) expect the logs to be structured as defined on that page. The example @sudomateo gave above would be a valid example. The only thing I would add to his example is that the body can be either of the two forms below; both are technically valid according to OTel.

  "Body": "I like donuts"

  "Body": {
    "message": "I like donuts"
  }

I would also like to note that the Fluent Bit OTel output plugin seems to support the ability to add attributes; however, whenever I tried to define some, they did not get added or sent. For example, if I added the following configuration:

[OUTPUT]
    Name opentelemetry
    Match *
    Host ingest.privateotel.com
    Port 443
    add_label            status_code 500

Then I would expect my payload to contain an attributes section containing "status_code": "500"; however, this did not happen. But maybe the add_label field is not intended for this? If not, then I'm not sure what add_label is for, and I would suggest adding the ability to add attributes.
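For illustration, and assuming add_label is meant to map onto attributes (which the docs don't make clear), the payload fragment I expected to see after that configuration would be something like:

```json
"attributes": {
  "status_code": "500"
}
```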

Thank you for opening this issue! I spent the last hour or so using different filter plugins (e.g., modify, record_modifier) to no avail. I then stumbled upon this issue and was glad to know it’s not just me experiencing this. Subscribing for updates!

@kevarr Thank you. Yes, I experimented with the approach you described; however, I want to streamline and simplify the process by using only Fluent Bit (without a collector) to export in OTel format. It seems a lot of people are interested in this. It's unfortunate that the Fluent Bit opentelemetry output does not work properly and support all of the OTel fields, and it's very difficult to get the owners/contributors of Fluent Bit to make changes and respond.

I have a follow up ticket here for the things that are still missing https://github.com/fluent/fluent-bit/issues/8552

A little late to the party here, but wanted to say very nice work on this. I've been using the Forward output with OTel's fluent forward receiver, which places all of the log records under the "resources" key. This is, AFAICT, an undocumented feature of the receiver, and it also requires complex restructuring pipelines to get all of the attributes/resources in the right places.

To build off of this capability, it would be awesome to have support for automatically parsing records supplied by the Kubernetes filter into resource attributes that match those provided by OpenTelemetry's k8sattributes processor.
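For reference, the workaround described here pairs Fluent Bit's forward output with the Collector's fluentforward receiver. A minimal sketch, with the port, hostname, and pipeline wiring purely illustrative:

```
# Fluent Bit side: use the forward output instead of opentelemetry.
[OUTPUT]
    Name   forward
    Match  *
    Host   otel-collector
    Port   8006

# Collector side (config.yaml):
receivers:
  fluentforward:
    endpoint: 0.0.0.0:8006
exporters:
  debug: {}
service:
  pipelines:
    logs:
      receivers: [fluentforward]
      exporters: [debug]
```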

I can see a case for both sides here. If the event only contains one key, then it’s easy to want to assume that key must contain the event body. However, how would such an event be distinguished from a structured event of only one key? This is where I start to lean towards some new configuration option (i.e., otel_body_key) that, when set, tells Fluent Bit to use that key for the body and use all other keys as attributes. Otherwise, when unset, tells Fluent Bit to send the entire event inside the body as a structured event.
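Under that proposal, a configuration might look like the following. Note that otel_body_key is the option name suggested above, not an option that exists today:

```
[OUTPUT]
    Name opentelemetry
    Match *
    Host ingest.privateotel.com
    Port 443
    # Hypothetical: use this key for the OTel body,
    # send all other keys as attributes.
    otel_body_key message
```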

@sudomateo yes, except if the key is something like span_id or trace_id, etc. Those should NOT go in attributes; they should be their own elements (at the same level as body and attributes).
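The mapping the thread is converging on can be sketched as follows (illustrative only, not the plugin's actual code, and the set of top-level field names is an assumption drawn from the data model discussion above): the designated body key becomes the body, OTel-defined fields stay at the top level, and everything else goes under attributes.

```python
# Fields the OTel log data model defines at the top level of a record
# (an illustrative subset, not an exhaustive list).
TOP_LEVEL = {"trace_id", "span_id", "severity_text", "severity_number", "timestamp"}

def to_otel_record(event, body_key="message"):
    record = {"body": event.get(body_key, ""), "attributes": {}}
    for key, value in event.items():
        if key == body_key:
            continue
        if key in TOP_LEVEL:
            record[key] = value                 # e.g. trace_id, span_id
        else:
            record["attributes"][key] = value   # custom fields
    return record

event = {
    "message": "This is a test log message for abc application",
    "loglevel": "INFO",
    "trace_id": "7ada6c95a1bd243fa9013cab515173a9",
    "span_id": "9c1544cc4f7ff369",
}
print(to_otel_record(event))
```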

@sudomateo thanks for your feedback. I came up with something similar; I am working on a POC of this logic and will keep you posted.

Isn't it expected that the content be inside body or under attributes? The example shows the extra keys at the level of body.

Thank you for looking into this @edsiper! According to the example log records in the Log Data Model, the log message itself should be under Body, and the remaining fields, with the exception of trace context fields and severity fields, should be under Attributes.

The example given in the linked page is:

{
  "Timestamp": 1586960586000, // JSON needs to make a decision about
                              // how to represent nanoseconds.
  "Attributes": {
    "http.status_code": 500,
    "http.url": "http://example.com",
    "my.custom.application.tag": "hello",
  },
  "Resource": {
    "service.name": "donut_shop",
    "service.version": "semver:2.0.0",
    "k8s.pod.uid": "1138528c-c36e-11e9-a1a7-42010a800198",
  },
  "TraceId": "f4dbb3edd765f620", // this is a byte sequence
                                 // (hex-encoded in JSON)
  "SpanId": "43222c2d51a7abe3",
  "SeverityText": "INFO",
  "SeverityNumber": 9,
  "Body": "20200415T072306-0700 INFO I like donuts"
}

we are taking a look at this, thanks for raising the issue

We were experiencing the same behavior forwarding logs and metrics from Fluent Bit to VictoriaMetrics and VictoriaLogs, and VM/VL would not accept the data and discarded it.