fluent-bit: Fluentbit OpenTelemetry Output Pipeline/Plugin doesn't format fields properly
Issue: When using the OpenTelemetry output plugin to send logs to an OpenTelemetry endpoint, the fields in the output JSON payload are not formatted according to the OpenTelemetry specification. As a result, the endpoint is unable to process the requests from Fluent Bit correctly.
Fluent Bit logs showing records being sent to the OpenTelemetry endpoint:
[36] ebiz: [[1704434606.150000000, {}], {"message"=>"This is a test log message for abc application", "loglevel"=>"INFO", "service"=>"helloworld", "clientIp"=>"111.222.888", "timestamp"=>"2024-01-08 18:37:08.150", "testtag"=>"fluentbit", "trace_id"=>"7ada6c95a1bd243fa9013cab515173a9", "span_id"=>"9c1544cc4f7ff369"}]
[2024/01/08 18:37:10] [debug] [upstream] proxy returned 200
[2024/01/08 18:37:10] [debug] [http_client] flb_http_client_proxy_connect connection #32 connected to myproxy.com:8080.
[2024/01/08 18:37:10] [debug] [upstream] proxy returned 200
[2024/01/08 18:37:10] [debug] [http_client] flb_http_client_proxy_connect connection #31 connected to myproxy.com:8080.
[2024/01/08 18:37:10] [debug] [upstream] KA connection #32 to myproxy.com:8080 is connected
[2024/01/08 18:37:10] [debug] [http_client] not using http_proxy for header
[2024/01/08 18:37:10] [debug] [upstream] KA connection #31 to myproxy.com:8080 is connected
[2024/01/08 18:37:10] [debug] [http_client] not using http_proxy for header
[2024/01/08 18:37:10] [ info] [output:opentelemetry:opentelemetry.1] ingest.privateotel.com:443, HTTP status=200
My Fluent Bit configuration:
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentbit
  namespace: otel
data:
  custom_parsers.conf: |
    [MULTILINE_PARSER]
        Name multiline-rules
        Type regex
        Flush_timeout 2000
        rule "start_state" "/(\d{4}-\d{1,2}-\d{1,2})(.*)/" "cont"
        rule "cont" "/^\D(.*)/" "cont"
    [PARSER]
        Name named-captures
        Format regex
        Regex /(?<timestamp>[^ ]* .*):(?<loglevel>DEBUG|ERROR|INFO)([\s\s]*)-\|(?<id>[\w\-]*)\|(?<clientIp>[0-9\.]*)\|(?<trace_id>[0-9A-Za-z]*)\|(?<span_id>[0-9A-Za-z]*)\|(?<message>.*)/m
        Time_key timestamp
        Time_Format %Y-%m-%d %H:%M:%S.%L
        Time_Offset -0600
        Time_Keep On
  fluent-bit.conf: |
    [SERVICE]
        Daemon Off
        Flush 1
        Log_Level debug
        Parsers_File /fluent-bit/etc/parsers.conf
        Parsers_File /fluent-bit/etc/custom_parsers.conf
        HTTP_Server On
        HTTP_Listen 0.0.0.0
        HTTP_Port 2020
        Health_Check On
    [INPUT]
        Name tail
        Log_Level error
        multiline.parser multiline-rules
        Path /app/logs/*.log
        Tag logs
    [FILTER]
        Name parser
        Match *
        key_name log
        parser named-captures
    [FILTER]
        Name modify
        Match *
        Add service ${SERVICE_NAME}
    [FILTER]
        Name modify
        Match *
        Add testtag fluentbit
    [OUTPUT]
        Name stdout
        Log_Level trace
        Match *
    [OUTPUT]
        Name opentelemetry
        Match *
        Log_Level trace
        Host ingest.privateotel.com
        Port 443
        Header token ***************
        Log_response_payload True
        Tls On
        Tls.verify Off
        add_label app local
My OpenTelemetry endpoint receives the request formatted as follows:
{
  "body": {
    "clientIp": "111.222.888",
    "loglevel": "INFO",
    "message": "This is a test log message for abc application",
    "service": "helloworld",
    "span_id": "9c1544cc4f7ff369",
    "testtag": "fluentbit",
    "timestamp": "2024-01-08 18:37:08.150",
    "trace_id": "7ada6c95a1bd243fa9013cab515173a9"
  },
  "instrumentation.name": "",
  "instrumentation.version": "",
  "observed_timestamp": 0,
  "severity_text": "",
  "span_id": "",
  "trace_id": ""
}
As you can see above, every key/value pair gets nested under the body. The body should contain only the log message, and all the other fields I choose to send should be nested under "fields". The proper format would look something like what I have below.
Expected behavior:
{
  "body": {
    "message": "This is a test log message for abc application"
  },
  "clientIp": "111.222.888",
  "loglevel": "INFO",
  "service": "helloworld",
  "instrumentation.name": "",
  "instrumentation.version": "",
  "observed_timestamp": 0,
  "testtag": "fluentbit",
  "severity_text": "",
  "span_id": "9c1544cc4f7ff369",
  "timestamp": "2024-01-08 18:37:08.150",
  "trace_id": "7ada6c95a1bd243fa9013cab515173a9"
}
To Reproduce: Use a configuration similar to mine. Generate a log line and extract some fields from it, including the trace ID, span ID, timestamp, body, etc. Then use the opentelemetry output and point it at an OpenTelemetry endpoint.
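For anyone reproducing this, the extraction step can be checked outside Fluent Bit first. The following is a minimal Python sketch, not part of the original report: it ports the named-captures regex from custom_parsers.conf to Python syntax and runs it against a made-up log line. The layout of the sample line and the id value abc-1 are assumptions inferred from the regex, since the original log file is not shown in this issue.

```python
import re

# Python port of the `named-captures` regex from custom_parsers.conf.
# `(?<name>...)` groups become `(?P<name>...)` in Python. The trailing /m
# flag is left out because the sample below is a single line.
NAMED_CAPTURES = re.compile(
    r"(?P<timestamp>[^ ]* .*):(?P<loglevel>DEBUG|ERROR|INFO)([\s\s]*)-\|"
    r"(?P<id>[\w\-]*)\|(?P<clientIp>[0-9\.]*)\|"
    r"(?P<trace_id>[0-9A-Za-z]*)\|(?P<span_id>[0-9A-Za-z]*)\|(?P<message>.*)"
)

# Hypothetical log line: the field order follows the regex, the values are
# taken from the stdout record in this issue, and the id is invented.
SAMPLE_LINE = (
    "2024-01-08 18:37:08.150:INFO -|abc-1|111.222.888|"
    "7ada6c95a1bd243fa9013cab515173a9|9c1544cc4f7ff369|"
    "This is a test log message for abc application"
)


def extract_fields(line):
    """Return the named captures as a dict, or None if the line does not match."""
    match = NAMED_CAPTURES.match(line)
    return match.groupdict() if match else None


fields = extract_fields(SAMPLE_LINE)

if __name__ == "__main__":
    print(fields)
```

If the fields come out as expected here, the parser filter is unlikely to be the cause, which points back at how the output plugin lays out the record.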
Your Environment: I am running in a Kubernetes cluster using a DaemonSet and the ConfigMap above. The version of Fluent Bit I am using is 2.1.10. The image is fluent/fluent-bit:2.1.10-debug.
Additional context: OpenTelemetry is unable to correctly process fields such as the trace ID and span ID because they are not in the proper schema. For more information on the OpenTelemetry logging data model and schema, please see below:
Please fix the issue, or if I am doing something incorrectly in my Fluent Bit config, please advise. Thank you.
About this issue
- Original URL
- State: closed
- Created 6 months ago
- Reactions: 4
- Comments: 37 (34 by maintainers)
Commits related to this issue
- out_opentelemetry: enhancements for log body and attributes handling (fix #8359) The following patch fix and enhance the OpenTelemetry output connector when handling log records. In Fluent Bit world... — committed to fluent/fluent-bit by edsiper 5 months ago
- out_opentelemetry: enhancements for log body and attributes handling (fix #8359) The following patch fix and enhance the OpenTelemetry output connector when handling log records. In Fluent Bit world... — committed to jenningsloy318/fluent-bit by edsiper 5 months ago
@edsiper @sudomateo
So my understanding is similar to @sudomateo's. There are predefined fields (such as span_id, trace_id, SeverityText, etc.) that should be at the SAME level as body (not within body). Body should just be the body of the log record, aka the message. Body is of type "any", so it can be a simple string or a map, but in either case it should contain only the log message.
Attributes is a map for adding "custom" fields that are not defined in the data model. Resource is also a map and would contain info about the source of the log; however, the Fluent Bit OTel output plugin does not yet seem to support adding a "resource" field.
The best document I've found on the log data model is logs-data-model. All frontend applications I have sent logs to (using Fluent Bit) expect the logs to be structured as defined on that page. The example @sudomateo gave above would be a valid example. The only thing I would add to his example is that the body can be one of the 2 ways below; both are technically valid according to OTel.
I would also like to note that the Fluent Bit OTel output plugin seems to support adding attributes; however, whenever I tried to define some, they were neither added nor sent. For example, if I added the following configuration:
Then I would expect my "payload" to contain an attributes section containing "status_code": "500"; however, this did not happen. But maybe the add_label field is not intended for this? If not, then I'm not sure what add_label is for. Also, if not, then I would suggest adding the ability to add attributes.
Thank you for opening this issue! I spent the last hour or so using different filter plugins (e.g., modify, record_modifier) to no avail. I then stumbled upon this issue and was glad to know it's not just me experiencing this. Subscribing for updates!
@kevarr Thank you. Yes, I experimented with the approach you described; however, I want to streamline and simplify the process by being able to use Fluent Bit alone (without a collector) to export in OTel format. It seems a lot of people are interested in this. It's unfortunate that the Fluent Bit OTel output does not work properly or support all of the OTel fields, and it's very difficult to get the owners/contributors of Fluent Bit to respond and make changes.
I have a follow up ticket here for the things that are still missing https://github.com/fluent/fluent-bit/issues/8552
A little late to the party here, but wanted to say very nice work on this. I’ve been using the Fluent output and using Otel’s fluent forward receiver which places all of the log records under the “resources” key. This is AFAICT an undocumented feature of the receiver and it also requires complex restructuring pipelines to get all of the attributes/resources in the right places.
To build off of this capability, it would be awesome to have support for automatically parsing records supplied by the Kubernetes Filter into resource attributes that match those provided by OpenTelemetry's k8sattributes processor.
@sudomateo yes, except if the key is something like span-id, trace-id, etc. Those should NOT go in attributes; they should be their own elements (at the same level as body and attributes).
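To make the layout being discussed concrete, here is a small Python sketch (an illustration only, not how the Fluent Bit plugin is implemented). It takes the flat record from this issue, keeps message as the body, lifts trace_id, span_id, and the log level to the top level, and puts everything else under attributes. The function name to_otel_shape, the TOP_LEVEL_KEYS mapping, and the snake_case output keys are assumptions made for this example.

```python
# Flat keys that become top-level fields of the log record rather than
# attributes. The left side uses the key names from the record in this
# issue; the right side is an assumed snake_case rendering of the data
# model's TraceId, SpanId, and SeverityText fields.
TOP_LEVEL_KEYS = {
    "trace_id": "trace_id",
    "span_id": "span_id",
    "loglevel": "severity_text",
}


def to_otel_shape(record, body_key="message"):
    """Split a flat record into body, top-level fields, and attributes."""
    shaped = {"body": record.get(body_key, ""), "attributes": {}}
    for key, value in record.items():
        if key == body_key:
            continue  # already used as the body
        if key in TOP_LEVEL_KEYS:
            shaped[TOP_LEVEL_KEYS[key]] = value
        else:
            # Everything not defined by the data model is a custom attribute.
            # The string timestamp stays here too: converting it to the data
            # model's numeric Timestamp is out of scope for this sketch.
            shaped["attributes"][key] = value
    return shaped


# The record printed by the stdout output earlier in this issue.
record = {
    "message": "This is a test log message for abc application",
    "loglevel": "INFO",
    "service": "helloworld",
    "clientIp": "111.222.888",
    "timestamp": "2024-01-08 18:37:08.150",
    "testtag": "fluentbit",
    "trace_id": "7ada6c95a1bd243fa9013cab515173a9",
    "span_id": "9c1544cc4f7ff369",
}
shaped = to_otel_shape(record)

if __name__ == "__main__":
    import json
    print(json.dumps(shaped, indent=2))
```

The mapping is a plain dict so that more predefined fields can be lifted out of attributes by adding one entry, without touching the function.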
@sudomateo thanks for your feedback. I came up with something similar. I am working on a POC of this logic and will keep you posted.
Thank you for looking into this @edsiper! According to the example log records in the Log Data Model, the log message itself should be under body and the attributes, with the exception of trace context fields and severity fields, should be under attributes. The example given in the linked page is:
we are taking a look at this, thanks for raising the issue
We were experiencing the same behavior when forwarding logs and metrics from Fluent Bit to VictoriaMetrics and VictoriaLogs: VM/VL would not accept the data and discarded it!