fluent-bit: Kubernetes metadata no longer present in logs when upgrading beyond 1.5.0.

Bug Report

I’m not absolutely positive this is a bug; it might very well be a configuration issue on our side. Any help in troubleshooting would be immensely helpful.

Describe the issue

We’re using fluentbit as part of aws-for-fluentbit in an EKS cluster on AWS. Upgrading from aws-for-fluentbit 2.3.1 (fluentbit 1.4.2) to 2.6.1 (fluentbit 1.5.2) leaves us without Kubernetes metadata in our logs in CloudWatch and Elasticsearch.

To Reproduce

We’re using the following configuration:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: kube-system
  labels:
    app.kubernetes.io/name: fluentbit
data:
  fluent-bit.conf: |
    [SERVICE]
        Parsers_File  /fluent-bit/etc/parsers.conf
    
    [INPUT]
        Name          tail
        Path          /var/log/containers/*.log
        Parser        crio
        Tag           kube.*
        Mem_Buf_Limit 5MB
        Skip_Long_Lines   On

    [FILTER]
        Name   kubernetes
        Match  kube.*

    [FILTER]
        Name nest
        Match *
        Operation lift
        Nested_under kubernetes
        Prefix_with kubernetes_

    [FILTER]
        Name record_modifier
        Match *
        Remove_key kubernetes_container_name
        Remove_key kubernetes_docker_id

    [OUTPUT]
        Name cloudwatch
        Match *
        region eu-west-1
        log_group_name /applications/eks-fluentbit/{{ eks.name }}
        log_stream_prefix fluentbit-
        auto_create_group false
  parsers.conf: |
    [PARSER]
        # http://rubular.com/r/izM6olvshn
        Name crio
        Format Regex
        Regex /^(?<time>.+)\b\ (?<stream>stdout|stderr)\ (?<partialflag>F|P)\b\ (?<log>.*)$/
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On
...
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentbit
  namespace: kube-system
  labels:
    app.kubernetes.io/name: fluentbit
spec:
  selector:
    matchLabels:
      name: fluentbit
  template:
    metadata:
      labels:
        name: fluentbit
    spec:
      serviceAccountName: fluentbit-serviceaccount
      containers:
        - name: aws-for-fluent-bit
          image: amazon/aws-for-fluent-bit:VERSION
          env:
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: aws-cloudwatch-credentials
                  key: accesskeyid
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: aws-cloudwatch-credentials
                  key: secretaccesskey
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: fluent-bit-config
              mountPath: /fluent-bit/etc/
            - name: mnt
              mountPath: /mnt
              readOnly: true
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 50m
              memory: 100Mi
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: fluent-bit-config
          configMap:
            name: fluent-bit-config
        - name: mnt
          hostPath:
            path: /mnt

This is a configuration we’ve been using for some time; unfortunately, I’m not the one who created it.

When we use 2.3.1 (fluentbit 1.4.2) as VERSION, we get log messages like this:

{
    "kubernetes_container_name": "kube-proxy",
    "kubernetes_docker_id": "70d7b94397f621c557f42239319bb4e8e7bea4a651be2eadb9b115edd15612a6",
    "kubernetes_namespace_name": "kube-system",
    "kubernetes_pod_name": "kube-proxy-qm5vx",
    "log": "{\"log\":\"I0903 08:12:19.006415       1 proxier.go:793] Not using `--random-fully` in the MASQUERADE rule for iptables because the local version of iptables does not support it\\n\",\"stream\":\"stderr\",\"time\":\"2020-09-03T08:12:19.00654917Z\"}"
}

After changing VERSION to 2.6.1 (fluentbit 1.5.2), we start seeing log messages like this:

{"log":"{\"log\":\"I0903 07:59:18.193128       1 proxier.go:793] Not using `--random-fully` in the MASQUERADE rule for iptables because the local version of iptables does not support it\\n\",\"stream\":\"stderr\",\"time\":\"2020-09-03T07:59:18.193358872Z\"}"}

Note the missing kubernetes_* fields. Reverting to the old version brings the kubernetes_* fields back.

I’ve tried updating the Kubernetes filter according to the installation manual here:

    [FILTER]
        Name             kubernetes
        Match            kube.*
        Kube_URL         https://kubernetes.default.svc:443
        Kube_CA_File     /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File  /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix  kube.var.log.containers.
        Merge_Log        On
        Merge_Log_Key    log_processed

This yields the following log output:

{"log":"{\"log\":\"I0903 10:13:52.654444       1 main.go:72] Initializing vpc-admission-webhook version 0.2.3.\\n\",\"stream\":\"stderr\",\"time\":\"2020-09-03T10:13:52.655445952Z\"}","log_processed":{"log":"I0903 10:13:52.654444       1 main.go:72] Initializing vpc-admission-webhook version 0.2.3.","stream":"stderr","time":"2020-09-03T10:13:52.655445952Z"}}

Your Environment

  • Environment name and version (e.g. Kubernetes? What version?): AWS EKS 1.16
  • Server type and version: AWS EC2 Nodes
  • Operating System and version: AMI based on “EKS Kubernetes Worker AMI with AmazonLinux2 image, (k8s: 1.16.13, docker:19.03.6ce-4.amzn2)”
  • Filters and plugins: aws-for-fluentbit

Additional context

I’m looking for guidance on troubleshooting this issue and, ideally, on getting our output as close as possible to the original log output:

{
    "kubernetes_container_name": "kube-proxy",
    "kubernetes_docker_id": "70d7b94397f621c557f42239319bb4e8e7bea4a651be2eadb9b115edd15612a6",
    "kubernetes_namespace_name": "kube-system",
    "kubernetes_pod_name": "kube-proxy-qm5vx",
    "log": "{\"log\":\"I0903 08:12:19.006415       1 proxier.go:793] Not using `--random-fully` in the MASQUERADE rule for iptables because the local version of iptables does not support it\\n\",\"stream\":\"stderr\",\"time\":\"2020-09-03T08:12:19.00654917Z\"}"
}

I’m not very familiar with fluentbit, so any guidance would be greatly appreciated!
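
In the meantime, the one troubleshooting step I do know of is raising Fluent Bit’s own log level, so that the kubernetes filter’s lookups against the API server (and any errors) show up in the pod logs. A minimal sketch that only touches the SERVICE section, with everything else staying as above:

    [SERVICE]
        Parsers_File  /fluent-bit/etc/parsers.conf
        # Debug logging is much more verbose and should surface any errors
        # the kubernetes filter hits while querying the API server
        Log_Level     debug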

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 2
  • Comments: 20 (6 by maintainers)

Most upvoted comments

@jonathanjuursema we also had an issue with missing k8s metadata some time ago, and the solution was setting the filter’s Buffer_Size option to a much higher value (1MB). It’s possible your k8s resources are large and don’t fit in the default 32KB.
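
For reference, a sketch of what that suggestion looks like in the filter section (Buffer_Size 1MB as suggested; my understanding is that 0 disables the limit entirely, but verify against the docs for your Fluent Bit version):

    [FILTER]
        Name         kubernetes
        Match        kube.*
        # Buffer used when reading pod metadata from the API server; the default
        # is 32KB, which large pod specs (many labels/annotations) can exceed
        Buffer_Size  1MB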