fluent-bit: Fluent-bit input plugin tail doesn't process all logs: scan_blog add(): dismissed:

Bug Report

Describe the bug Fluent Bit is not processing all logs located in /var/log/containers/.

To Reproduce The following messages are displayed:

[2021/10/01 14:40:05] [debug] [input:tail:tail.0] scanning path /var/log/containers/*.log
[2021/10/01 14:40:05] [debug] [input:tail:tail.0] scan_blog add(): dismissed: /var/log/containers/activator-85cd6f6f9-nrncf_knative-serving_activator-3b631f27f6667599ae940f94afe6a65a4d1d488e7979fced513fa910082a5ae1.log, inode 404768
[2021/10/01 14:40:05] [debug] [input:tail:tail.0] scan_blog add(): dismissed: /var/log/containers/activator-85cd6f6f9-nrncf_knative-serving_activator-ca32320178170fe4198ce1b0bd57d8ea031c7c886a7b0e3d66bb1b78b67613b8.log, inode 921337
[2021/10/01 14:40:05] [debug] [input:tail:tail.0] scan_blog add(): dismissed: /var/log/containers/antrea-agent-gql5r_kube-system_antrea-agent-63659cdc8e5ddba3eaf729be280661b45fd198e6d2c7195965be85cdca81f41a.log, inode 536837
[2021/10/01 14:40:05] [debug] [input:tail:tail.0] scan_blog add(): dismissed: /var/log/containers/antrea-agent-gql5r_kube-system_antrea-agent-8726abf73577f597e15716176cfcdce442b159d00ec12f59e439719d824a9585.log, inode 1190181
[2021/10/01 14:40:05] [debug] [input:tail:tail.0] scan_blog add(): dismissed: /var/log/containers/antrea-agent-gql5r_kube-system_antrea-ovs-08045b767f2f8ee421b3b4d8d5b646b49b4e12199ae957cad178dd3d11670ec6.log, inode 663855
  • Steps to reproduce the problem: Configuration details:

ServiceAccount:

rules:
- apiGroups:
  - ""
  resources:
  - namespaces
  - pods
  verbs:
  - get
  - list
  - watch

ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: vmware-system
  labels:
    k8s-app: fluent-bit
data:
  filter-kubernetes.conf: |
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix     kube.var.log.containers.
        Merge_Log           On
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On
        K8S-Logging.Exclude Off
    [FILTER]
        Name                  modify
        Match                 kube.*
        Copy                  kubernetes k8s

    [FILTER]
        Name                  nest
        Match                 kube.*
        Operation             lift
        Nested_Under          kubernetes
  filter-record.conf: |
    [FILTER]
        Name                record_modifier
        Match               *
        Record tkg_cluster veba-demo.jarvis.tanzu
        Record tkg_instance veba-demo.jarvis.tanzu
    [FILTER]
        Name                  nest
        Match                 kube.*
        Operation             nest
        Wildcard              tkg_instance*
        Nest_Under            tkg

    [FILTER]
        Name                  nest
        Match                 kube_systemd.*
        Operation             nest
        Wildcard              SYSTEMD*
        Nest_Under            systemd
  fluent-bit.conf: |
    [SERVICE]
        Flush         1
        Log_Level     debug
        Daemon        off
        Parsers_File  parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020

    @INCLUDE input-kubernetes.conf
    @INCLUDE input-systemd.conf
    @INCLUDE input-kube-apiserver.conf
    @INCLUDE input-auditd.conf
    @INCLUDE filter-kubernetes.conf
    @INCLUDE filter-record.conf
    @INCLUDE output-syslog.conf
  input-auditd.conf: |
    [INPUT]
        Name              tail
        Tag               audit.*
        Path              /var/log/audit/audit.log
        Parser            logfmt
        DB                /var/log/flb_system_audit.db
        Mem_Buf_Limit     50MB
        Refresh_Interval  10
        Skip_Long_Lines   On
  input-kube-apiserver.conf: |
    [INPUT]
        Name              tail
        Tag               apiserver_audit.*
        Path              /var/log/kubernetes/audit.log
        Parser            json
        DB                /var/log/flb_kube_audit.db
        Mem_Buf_Limit     50MB
        Refresh_Interval  10
        Skip_Long_Lines   On
  input-kubernetes.conf: |
    [INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/containers/*.log
        Parser            docker
        DB                /var/log/flb_kube.db
        Mem_Buf_Limit     50MB
        Skip_Long_Lines   On
        Refresh_Interval  10
  input-systemd.conf: |
    [INPUT]
        Name                systemd
        Tag                 kube_systemd.*
        Path                /var/log/journal
        DB                  /var/log/flb_kube_systemd.db
        Systemd_Filter      _SYSTEMD_UNIT=kubelet.service
        Systemd_Filter      _SYSTEMD_UNIT=containerd.service
        Read_From_Tail      On
        Strip_Underscores   On
  output-syslog.conf: |
    [OUTPUT]
        Name   syslog
        Match  kube.*
        Host   10.197.79.57
        Port   514
        Mode   tcp
        Syslog_Format        rfc5424
        Syslog_Hostname_key  tkg_cluster
        Syslog_Appname_key   pod_name
        Syslog_Procid_key    container_name
        Syslog_Message_key   message
        Syslog_SD_key        k8s
        Syslog_SD_key        labels
        Syslog_SD_key        annotations
        Syslog_SD_key        tkg

    [OUTPUT]
        Name   syslog
        Match  kube_systemd.*
        Host   10.197.79.57
        Port   514
        Mode   tcp
        Syslog_Format        rfc5424
        Syslog_Hostname_key  tkg_cluster
        Syslog_Appname_key   tkg_instance
        Syslog_Message_key   MESSAGE
        Syslog_SD_key        systemd
  parsers.conf: |
    [PARSER]
        Name   apache
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   apache2
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   apache_error
        Format regex
        Regex  ^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])?( \[client (?<client>[^\]]*)\])? (?<message>.*)$

    [PARSER]
        Name   nginx
        Format regex
        Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   json
        Format json
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name        docker
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On

    [PARSER]
        Name        docker-daemon
        Format      regex
        Regex       time="(?<time>[^ ]*)" level=(?<level>[^ ]*) msg="(?<msg>[^ ].*)"
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On

    [PARSER]
        # http://rubular.com/r/tjUt3Awgg4
        Name cri
        Format regex
        Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L%z

    [PARSER]
        Name        logfmt
        Format      logfmt

    [PARSER]
        Name        syslog-rfc5424
        Format      regex
        Regex       ^\<(?<pri>[0-9]{1,5})\>1 (?<time>[^ ]+) (?<host>[^ ]+) (?<ident>[^ ]+) (?<pid>[-0-9]+) (?<msgid>[^ ]+) (?<extradata>(\[(.*)\]|-)) (?<message>.+)$
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On

    [PARSER]
        Name        syslog-rfc3164-local
        Format      regex
        Regex       ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
        Time_Key    time
        Time_Format %b %d %H:%M:%S
        Time_Keep   On

    [PARSER]
        Name        syslog-rfc3164
        Format      regex
        Regex       /^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/
        Time_Key    time
        Time_Format %b %d %H:%M:%S
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On

    [PARSER]
        Name    kube-custom
        Format  regex
        Regex   (?<tag>[^.]+)?\.?(?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$

DaemonSet:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: vmware-system
  labels:
    k8s-app: fluent-bit
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: fluent-bit
  template:
    metadata:
      labels:
        k8s-app: fluent-bit
    spec:
      containers:
      - image: projects.registry.vmware.com/tkg/fluent-bit:v1.6.9_vmware.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /
            port: 2020
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: fluent-bit
        ports:
        - containerPort: 2020
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /api/v1/metrics/prometheus
            port: 2020
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 80m
            memory: 200Mi
          requests:
            cpu: 50m
            memory: 100Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/log
          name: var-log
        - mountPath: /var/log/pods
          name: var-log-pods
        - mountPath: /var/log/containers
          name: var-log-containers
        - mountPath: /var/lib/docker/containers
          name: var-lib-docker-containers
          readOnly: true
        - mountPath: /fluent-bit/etc/
          name: fluent-bit-config
        - mountPath: /run/log
          name: systemd-log
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: fluent-bit
      serviceAccountName: fluent-bit
      terminationGracePeriodSeconds: 10
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      - effect: NoExecute
        operator: Exists
      - effect: NoSchedule
        operator: Exists
      volumes:
      - hostPath:
          path: /var/log
          type: ""
        name: var-log
      - hostPath:
          path: /var/log/pods
          type: ""
        name: var-log-pods
      - hostPath:
          path: /var/log/containers
          type: ""
        name: var-log-containers
      - hostPath:
          path: /var/lib/docker/containers
          type: ""
        name: var-lib-docker-containers
      - hostPath:
          path: /run/log
          type: ""
        name: systemd-log
      - configMap:
          defaultMode: 420
          name: fluent-bit-config
        name: fluent-bit-config
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate

Expected behavior All logs in /var/log/containers/ should be processed.

Your Environment

Additional context Running tail -f manually from within the system on a specific pod log, which is writing to stdout, works.

{"log":"10/03/2021 14:47:13 - Handler Processing Completed ...\n","stream":"stdout","time":"2021-10-03T14:47:13.829672574Z"}
{"log":"\n","stream":"stdout","time":"2021-10-03T14:47:13.829772103Z"}

Examples of logs that aren’t processed:

root@veba-kn [ /var/log/containers ]# ls -rtl
total 376
lrwxrwxrwx 1 root root 100 Sep 13 21:31 antrea-agent-gql5r_kube-system_antrea-agent-8726abf73577f597e15716176cfcdce442b159d00ec12f59e439719d824a9585.log -> /var/log/pods/kube-system_antrea-agent-gql5r_31aa406a-286c-495b-9dcf-e4036c2a4756/antrea-agent/3.log
lrwxrwxrwx 1 root root  98 Sep 13 21:31 antrea-agent-gql5r_kube-system_antrea-ovs-3f300f1d7b28c069df1f34cf37ff89be95d69fc3dc4ea0f269b5bd07ce5d56c1.log -> /var/log/pods/kube-system_antrea-agent-gql5r_31aa406a-286c-495b-9dcf-e4036c2a4756/antrea-ovs/3.log
lrwxrwxrwx 1 root root 102 Sep 13 21:31 envoy-89vct_contour-external_shutdown-manager-c8ed97927c25d465f31cce5ab8bd91d02742504f8cf73ad53e493738d0a17f74.log -> /var/log/pods/contour-external_envoy-89vct_1c947a55-2b86-48bd-b442-c6c51ec2dd3a/shutdown-manager/3.log
lrwxrwxrwx 1 root root  91 Sep 13 21:31 envoy-89vct_contour-external_envoy-0ea7a33d12105058f74eae9653dd0266ac99ef2ba7f6cb3a3b04a8ec3bc02525.log -> /var/log/pods/contour-external_envoy-89vct_1c947a55-2b86-48bd-b442-c6c51ec2dd3a/envoy/3.log
lrwxrwxrwx 1 root root 104 Sep 13 21:31 contour-5869594b-7jm89_contour-external_contour-803e6591f657fae9539b64ae4f86fa44cce99b409c5f92979c6045cf4b98b52c.log -> /var/log/pods/contour-external_contour-5869594b-7jm89_cc6cf243-7d3f-4839-91e8-741ab87f6488/contour/3.log
lrwxrwxrwx 1 root root 106 Sep 13 21:31 contour-5d47766fd8-n24mz_contour-internal_contour-ae34a8ae0b8398da294c5061ec5c0ef1e9be8cb2979f07077e5e9df12f2bab67.log -> /var/log/pods/contour-internal_contour-5d47766fd8-n24mz_a87131ad-d73a-4371-a47b-dcc410f3b6e4/contour/3.log
lrwxrwxrwx 1 root root 100 Sep 13 21:31 coredns-74ff55c5b-mjdlr_kube-system_coredns-60bd5f49def85a0ddc929e2c2da5c793a3c6de00cd6a81bdcfdb21f3d4f45129.log -> /var/log/pods/kube-system_coredns-74ff55c5b-mjdlr_7ef260c1-308e-4162-8a84-231d560f8023/coredns/3.log

I’ve also tried running the DaemonSet with a privileged security context:

securityContext:
  privileged: true

Similar issues I found, which don’t provide a solution for this problem: #3857 #4014

Your help would be much appreciated. Thanks

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 12
  • Comments: 93 (16 by maintainers)

Most upvoted comments

Not stale, the issue is still here and is still the reason most of us do not use fluent bit anymore to this day

I have also lost respect for this product because of this bug.

I’m using version 1.9.0 and I’m getting dismissed logs. This doesn’t really seem fixed.

I will try to do so, but tbh I dropped Fluent Bit from my stack last July, given that no one was helping on the matter.

Most versions of Fluent-bit are affected by this bug. I have tested versions 1.2.0, 1.3.0, 1.4.0, 1.5.0, 1.6.0, 1.7.0, 1.8.0, 1.9.0, 2.0.6, 2.0.8, and 2.0.10, and they all exhibited the same issue. The only versions that worked for me were 1.2.x, and 1.3.x. @edsiper @patrick-stephens Can you please review the code for the tail plugin in version 1.3.7? You might find the bug.

Hello all,

I am facing the same issue here using v2.0.8,

Running fluent-bit on k3s on Debian 11. All files are on a native SATA SSD with an EXT4 filesystem.

I was able to trace the root cause to this readlink call: https://github.com/fluent/fluent-bit/blob/v2.0.8/plugins/in_tail/tail_file.c#L1543

I wrote a small C program to demonstrate the behaviour, as follows:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <linux/limits.h>
#include <errno.h>

int main() {
  char * path = "/var/log/containers/traefik-5758f4779c-g57zc_traefik_traefik-ad1fd81239e26f0cf1713977f526bb8b94be95062830d361ef0ff09ef9c22db3.log";
  FILE * f = fopen(path, "r");
  if (!f) {
    perror("fopen");
    return 1;
  }
  int fd = fileno(f);
  printf("file descriptor is %d\n", fd);
  char tmp[128];
  char * buf = malloc(PATH_MAX);
  if (!buf) {
    printf("can't malloc\n");
    fclose(f);
    return 1;
  }
  // build the /proc/<pid>/fd/<fd> path for the open file
  int ret = snprintf(tmp, sizeof(tmp) - 1, "/proc/%i/fd/%i", getpid(), fd);
  printf("ret=%d, tmp=%s\n", ret, tmp);

  // fluent bit's method of using /proc will fail
  int s = readlink(tmp, buf, PATH_MAX);
  printf("s=%d, err=%d\n", s, errno);

  // using the real path works; errno is not reset on success, so the
  // value printed here may be left over from the failed call above
  s = readlink(path, buf, PATH_MAX);
  printf("s=%d, err=%d\n", s, errno);

  free(buf);
  fclose(f);
  return 0;
}

Here’s the result:

root@fluent-bit-4gjpq:/# readlink /var/log/containers/traefik-5758f4779c-g57zc_traefik_traefik-ad1fd81239e26f0cf1713977f526bb8b94be95062830d361ef0ff09ef9c22db3.log
/var/log/pods/traefik_traefik-5758f4779c-g57zc_dd7776dd-b8a0-4acf-a130-ec10427ab19a/traefik/0.log
root@fluent-bit-4gjpq:/# ls -l /var/log/containers/traefik-5758f4779c-g57zc_traefik_traefik-ad1fd81239e26f0cf1713977f526bb8b94be95062830d361ef0ff09ef9c22db3.log
lrwxrwxrwx 1 root root 97 Jan 31 17:36 /var/log/containers/traefik-5758f4779c-g57zc_traefik_traefik-ad1fd81239e26f0cf1713977f526bb8b94be95062830d361ef0ff09ef9c22db3.log -> /var/log/pods/traefik_traefik-5758f4779c-g57zc_dd7776dd-b8a0-4acf-a130-ec10427ab19a/traefik/0.log
root@fluent-bit-4gjpq:/# gcc -o test test.c && ./test 
file descriptor is 3
ret=13, tmp=/proc/91/fd/3
s=-1, err=2
s=97, err=2
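The second call in the output above suggests that the path can be resolved without going through /proc. A minimal sketch of that idea follows, assuming a hypothetical helper named resolve_log_path (it is not part of Fluent Bit) built on the POSIX realpath(3) call. One caveat: realpath resolves the name at call time, whereas the /proc/<pid>/fd/<fd> lookup reports the file that is actually open, so the two are not equivalent if the file is renamed or rotated in between.

```c
#define _XOPEN_SOURCE 700
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not Fluent Bit code): resolve `path`, following
 * symlinks, with realpath(3) instead of reading /proc/<pid>/fd/<fd>.
 * Writes the canonical path into `out` (capacity `out_len` bytes).
 * Returns 0 on success, -1 if the path cannot be resolved or does not fit.
 */
int resolve_log_path(const char *path, char *out, size_t out_len)
{
    char resolved[PATH_MAX];

    /* realpath() returns NULL and sets errno on failure */
    if (realpath(path, resolved) == NULL) {
        return -1;
    }

    /* refuse to truncate: the caller's buffer must hold the full path */
    if (strlen(resolved) >= out_len) {
        return -1;
    }

    strcpy(out, resolved);
    return 0;
}
```

Called with one of the /var/log/containers/*.log symlinks from this thread, the helper would be expected to fill `out` with the matching /var/log/pods/... path, as long as the symlink target is visible inside the container.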

Investigating further, I found that removing the /proc mount allows the readlink function to work properly. The problem is that some other functions need /proc to be mounted, e.g. https://github.com/fluent/fluent-bit/blob/v2.0.8/plugins/in_cpu/cpu.c#L95.

daemonSetVolumes:
  - name: varlog
    hostPath:
      path: /var/log
  # - name: proc
  #   hostPath:
  #     path: /proc
...
daemonSetVolumeMounts:
  - name: varlog
    mountPath: /var/log
  # - name: proc
  #   mountPath: /proc

Results:

root@fluent-bit-pb4gx:/# gcc -o test test.c && ./test  
file descriptor is 3
ret=13, tmp=/proc/28/fd/3
s=97, err=0
s=97, err=0

So for the tail plugin to work properly, we can’t have other plugins that need /proc to be mounted. It would be nice if we could mount /proc at a different path, such as /host_proc, to avoid this issue. Maybe a temporary workaround is to run separate fluent-bit pods to collect the different metrics?
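The /host_proc idea could be sketched as below, in the same values layout as the snippet above. The volume name hostproc and the read-only flag are assumptions for illustration. Whether a given plugin can be pointed at /host_proc instead of /proc is not established in this thread, so treat this as a starting point rather than a verified fix.

```yaml
daemonSetVolumes:
  - name: varlog
    hostPath:
      path: /var/log
  # Assumption: expose the host's procfs under a separate volume
  - name: hostproc
    hostPath:
      path: /proc
daemonSetVolumeMounts:
  - name: varlog
    mountPath: /var/log
  # Mounted away from /proc so the container keeps its own procfs,
  # which the readlink on /proc/<pid>/fd/<fd> relies on
  - name: hostproc
    mountPath: /host_proc
    readOnly: true
```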

It’s hard to imagine how such an important bug could exist for so long. The main feature of the project just doesn’t work.

I don’t know anything about CephFS but it doesn’t surprise me that disabling inotify helps. I’ll check the error message on Monday to see if it’s related to it but I’m curious about the other users too, I wonder if they are using a similar setup.

I’m having this issue as well on 1.9.6. The problem is that I NEED the long lines to be processed.

Has this been resolved? This seems to be related to the inode not being removed during the file removal process, i.e. the inode entry is not removed from files_static and files_event (https://github.com/fluent/fluent-bit/blob/master/plugins/in_tail/tail_config.h#L129, https://github.com/fluent/fluent-bit/blob/master/plugins/in_tail/tail_file.c#1108).

I have the same issue https://github.com/fluent/helm-charts/issues/415:

kubectl logs  -n fluent-bit -f fluent-bit-6341f45a-j8h2x | grep -i volume-test
[2023/10/27 17:04:05] [debug] [input:tail:tail.0] inode=1062251 with offset=2026 appended as /var/log/containers/volume-test_exp_volume-test-2c25a3b1342924cffab9fffb48b2f0a971fcf9f10009e4c382a37bc09075134b.log
[2023/10/27 17:04:05] [debug] [input:tail:tail.0] scan_glob add(): /var/log/containers/volume-test_exp_volume-test-2c25a3b1342924cffab9fffb48b2f0a971fcf9f10009e4c382a37bc09075134b.log, inode 1062251
[2023/10/27 17:04:05] [debug] [input:tail:tail.0] inode=1062251 file=/var/log/containers/volume-test_exp_volume-test-2c25a3b1342924cffab9fffb48b2f0a971fcf9f10009e4c382a37bc09075134b.log promote to TAIL_EVENT
[2023/10/27 17:04:05] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1062251 watch_fd=23 name=/var/log/containers/volume-test_exp_volume-test-2c25a3b1342924cffab9fffb48b2f0a971fcf9f10009e4c382a37bc09075134b.log
[2023/10/27 17:05:04] [debug] [input:tail:tail.0] scan_blog add(): dismissed: /var/log/containers/volume-test_exp_volume-test-2c25a3b1342924cffab9fffb48b2f0a971fcf9f10009e4c382a37bc09075134b.log, inode 1062251
[2023/10/27 17:06:04] [debug] [input:tail:tail.0] scan_blog add(): dismissed: /var/log/containers/volume-test_exp_volume-test-2c25a3b1342924cffab9fffb48b2f0a971fcf9f10009e4c382a37bc09075134b.log, inode 1062251
[2023/10/27 17:07:04] [debug] [input:tail:tail.0] scan_blog add(): dismissed: /var/log/containers/volume-test_exp_volume-test-2c25a3b1342924cffab9fffb48b2f0a971fcf9f10009e4c382a37bc09075134b.log, inode 1062251
[2023/10/27 17:08:04] [debug] [input:tail:tail.0] scan_blog add(): dismissed: /var/log/containers/volume-test_exp_volume-test-2c25a3b1342924cffab9fffb48b2f0a971fcf9f10009e4c382a37bc09075134b.log, inode 1062251

CephFS is a network-mounted filesystem. AFAIK inotify will not work there because the kernel is not aware of filesystem changes in a directory, so it can’t notify any watching process; maybe that’s the issue. If other users are on a network filesystem, this could be the cause.
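For anyone who wants to test the inotify theory, a minimal sketch of the tail input with the inotify watcher turned off is shown below. It reuses the input-kubernetes.conf values from the report and adds the tail plugin's Inotify_Watcher option, which makes the plugin fall back to its stat-based watcher. Whether this helps on a particular filesystem is an assumption to verify, not a confirmed fix.

```
[INPUT]
    Name              tail
    Tag               kube.*
    Path              /var/log/containers/*.log
    Parser            docker
    DB                /var/log/flb_kube.db
    Mem_Buf_Limit     50MB
    Skip_Long_Lines   On
    Refresh_Interval  10
    # Assumption under test: use the stat-based watcher instead of inotify
    Inotify_Watcher   false
```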

@kc-dot-io I guess this would do the trick for you, but I really don’t like this idea; with a stable, fully functioning tool you would never have to do that. It was my understanding that Fluent Bit was a stable tool, but I realise that’s not the case. The logs are sooo verbose, yet the buffers keep filling up with no warning and lines get dismissed with no explanation whatsoever… The only answer in this issue from a staff member was on March 18th…

I think I’m going to have to get rid of fluent bit

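Fluent Bit version: 1.9.3.

Problem: when collecting logs with Fluent Bit's tail plugin, Fluent Bit stops collecting logs after running for a while.

Cause: when multiline.parser is used to merge log lines, cont_state may merge many log entries into a single line, which causes tail to stop collecting.

Approach: when merging, multiline.parser first uses start_state to decide whether a line is a start line, and then uses cont_state to decide whether the following lines should be merged. The start_state and cont_state rules should therefore be mutually exclusive.

Correct example:

rule "start_state" "/^(\d{4})(.)/" "cont"
rule "cont" "/^(?!\d{4})(.)/" "cont"

Incorrect example:

rule "start_state" "/^(\d{4})(.)/" "cont"
rule "cont" "/(.)/" "cont"

So far I have not run into the problem of tail stopping collection again.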
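A sketch of mutually exclusive start and continuation rules inside a full multiline parser definition might look like the following. The parser name app_multiline, the flush_timeout value, and the exact regular expressions are assumptions for illustration; the patterns assume that every new record begins with four digits, such as a year.

```
[MULTILINE_PARSER]
    # Hypothetical parser name
    name          app_multiline
    type          regex
    flush_timeout 1000
    # A new record starts with four digits
    rule      "start_state"   "/^\d{4}.*/"        "cont"
    # A continuation line must NOT start with four digits, so it can
    # never swallow the first line of the next record
    rule      "cont"          "/^(?!\d{4}).*/"    "cont"
```

The tail input would then reference the parser by name through its multiline.parser option.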

Hi all, it finally worked for our project after replacing Docker with Containerd and by applying the following config: https://github.com/vmware-samples/vcenter-event-broker-appliance/tree/development/files/configs/fluentbit/templates