fluent-bit: [windows] Kubernetes filter on Windows not working

Fluent Bit is not loading the Kubernetes filter.

Config used

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
  labels:
    k8s-app: fluent-bit
data:
  # Configuration files: server, input, filters and output
  # ======================================================
  fluent-bit.conf: |-
    [SERVICE]
        Flush         1
        Log_Level     debug
        Daemon        off
        Parsers_File  parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020

    [INPUT]
        Name              tail
        Tag               kube.*
        Path              C:\\ProgramData\\Docker\\containers\\*\\*.log
        Parser            docker
        DB                C:\\flb_kube.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  10

    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc.cluster.local:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix     kube.ProgramData.Docker.containers
        Merge_Log           On
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On
        K8S-Logging.Exclude Off

    [OUTPUT]
        Name            es
        Match           *
        Host            ${FLUENT_ELASTICSEARCH_HOST}
        Port            ${FLUENT_ELASTICSEARCH_PORT}
        Logstash_Format On
        Replace_Dots    On
        Retry_Limit     False

  parsers.conf: |-
    [PARSER]
        Name   apache
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   apache2
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   apache_error
        Format regex
        Regex  ^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])?( \[client (?<client>[^\]]*)\])? (?<message>.*)$

    [PARSER]
        Name   nginx
        Format regex
        Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   json
        Format json
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name        docker
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On

    [PARSER]
        Name        syslog
        Format      regex
        Regex       ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
        Time_Key    time
        Time_Format %b %d %H:%M:%S

Guide Used

https://github.com/fluent/fluent-bit-kubernetes-logging

Expected behavior

Fluent Bit should filter the logs properly.

Logs for reference

Fluent Bit v1.4.2
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2020/04/07 15:00:57] [ info] Configuration:
[2020/04/07 15:00:57] [ info]  flush time     | 1.000000 seconds
[2020/04/07 15:00:57] [ info]  grace          | 5 seconds
[2020/04/07 15:00:57] [ info]  daemon         | 0
[2020/04/07 15:00:57] [ info] ___________
[2020/04/07 15:00:57] [ info]  inputs:
[2020/04/07 15:00:57] [ info]      tail
[2020/04/07 15:00:57] [ info] ___________
[2020/04/07 15:00:57] [ info]  filters:
[2020/04/07 15:00:57] [ info]      kubernetes.0
[2020/04/07 15:00:57] [ info] ___________
[2020/04/07 15:00:57] [ info]  outputs:
[2020/04/07 15:00:57] [ info]      es.0
[2020/04/07 15:00:57] [ info] ___________
[2020/04/07 15:00:57] [ info]  collectors:
[2020/04/07 15:00:57] [debug] [storage] [cio stream] new stream registered: tail.0
[2020/04/07 15:00:57] [ info] [storage] version=1.0.3, initializing...
[2020/04/07 15:00:57] [ info] [storage] in-memory
[2020/04/07 15:00:57] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/04/07 15:00:57] [ info] [engine] started (pid=10576)
[2020/04/07 15:00:57] [debug] [engine] coroutine stack size: 98302 bytes (96.0K)
[2020/04/07 15:00:58] [debug] [input:tail:tail.0] scanning path C:\\ProgramData\\Docker\\containers\\*\\*.log
[2020/04/07 15:00:58] [error] [sqldb] error=unrecognized token: "237001930390711��"
[2020/04/07 15:00:58] [debug] [input:tail:tail.0] add to scan queue C:\ProgramData\Docker\containers\710d649b86322548cbeee7e1d09d787b280a86ea530390ed9f1b3c42055fcbd6\710d649b86322548cbeee7e1d09d787b280a86ea530390ed9f1b3c42055fcbd6-json.log, offset=0
[2020/04/07 15:00:58] [error] [sqldb] error=unrecognized token: "365917469762065��"
[2020/04/07 15:00:58] [debug] [input:tail:tail.0] add to scan queue C:\ProgramData\Docker\containers\7fbc003699e6b2568f388dcd7963812f7d1cae210619d2c8e7f7b9aa8927e8b0\7fbc003699e6b2568f388dcd7963812f7d1cae210619d2c8e7f7b9aa8927e8b0-json.log, offset=0
[2020/04/07 15:00:58] [error] [sqldb] error=unrecognized token: "675539944143564��"
[2020/04/07 15:00:58] [debug] [input:tail:tail.0] add to scan queue C:\ProgramData\Docker\containers\d390fe73943c10afab770d3935cca8aeaae41d303574612909a63f88787c4752\d390fe73943c10afab770d3935cca8aeaae41d303574612909a63f88787c4752-json.log, offset=0
[2020/04/07 15:00:58] [error] [sqldb] error=unrecognized token: "788129934827827��"
[2020/04/07 15:00:58] [debug] [input:tail:tail.0] add to scan queue C:\ProgramData\Docker\containers\fc03b27983c8b55dcb4609286bba9a068e18e5659ca90d8e2bb95d2265efd7d9\fc03b27983c8b55dcb4609286bba9a068e18e5659ca90d8e2bb95d2265efd7d9-json.log, offset=0
[2020/04/07 15:00:58] [debug] [input:tail:tail.0] 4 files found for 'C:\\ProgramData\\Docker\\containers\\*\\*.log'
[2020/04/07 15:00:58] [ info] [filter:kubernetes:kubernetes.0] https=1 host=kubernetes.default.svc.cluster.local port=443
[2020/04/07 15:00:58] [ info] [filter:kubernetes:kubernetes.0] local POD info OK
[2020/04/07 15:00:58] [ info] [filter:kubernetes:kubernetes.0] testing connectivity with API server...
[2020/04/07 15:00:58] [ warn] net_tcp_fd_connect: getaddrinfo(host='kubernetes.default.svc.cluster.local'): No such host is known.
[2020/04/07 15:00:58] [error] [filter:kubernetes:kubernetes.0] upstream connection error
[2020/04/07 15:00:58] [ warn] [filter:kubernetes:kubernetes.0] could not get meta for POD fluent-bit-gpkcs
[2020/04/07 15:00:58] [debug] [output:es:es.0] host=elasticsearch port=9200 uri=/_bulk index=fluent-bit type=flb_type
[2020/04/07 15:00:58] [debug] [router] match rule tail.0:es.0
[2020/04/07 15:00:58] [ info] [sp] stream processor started
[2020/04/07 15:00:58] [ warn] [filter:kubernetes:kubernetes.0] invalid pattern for given tag kube.C:\ProgramData\Docker\containers\710d649b86322548cbeee7e1d09d787b280a86ea530390ed9f1b3c42055fcbd6\710d649b86322548cbeee7e1d09d787b280a86ea530390ed9f1b3c42055fcbd6-json.log
[2020/04/07 15:00:59] [ warn] [filter:kubernetes:kubernetes.0] invalid pattern for given tag kube.C:\ProgramData\Docker\containers\710d649b86322548cbeee7e1d09d787b280a86ea530390ed9f1b3c42055fcbd6\710d649b86322548cbeee7e1d09d787b280a86ea530390ed9f1b3c42055fcbd6-json.log

Your Environment

k8s cluster v1.15.7 with a Windows node

Additional context

Let me know if more information is needed.

cc @fujimotos

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 28 (12 by maintainers)

Most upvoted comments

@andrew-lozoya (CC @pragadeeshraju) I noticed that the root issue here is the choice of the log file to watch.

Path C:\\ProgramData\\Docker\\containers\\*\\*-json.log

The log files in C:\\ProgramData\\ are raw data produced by Docker, and Fluent Bit should almost never read them directly.

Instead, you need to watch files in C:\var\log. This directory is maintained by Kubernetes and contains additional metadata on top of the Docker logs.

In short, add the following volume mappings to your Kubernetes YAML.


spec:
  containers:
    - name: fluent-bit
      ..
      volumeMounts:
      - mountPath: C:\k
        name: k
      - mountPath: C:\var\log
        name: varlog
      - mountPath: C:\ProgramData
        name: progdata
  volumes:
  - name: k
    hostPath:
      path: C:\k
  - name: varlog
    hostPath:
      path: C:\var\log
  - name: progdata
    hostPath:
      path: C:\ProgramData

And tweak fluent-bit.conf to change the log file to watch (you’ll need to remove the Kube_Tag_Prefix option from the filter config too).

[INPUT]
  Name              tail
  ...
  Path              C:\\var\\log\\*.log
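
For reference, the matching filter section might look like the sketch below. It is simply the [FILTER] block from the config in the report with the Kube_Tag_Prefix line dropped; every other value is copied unchanged from that config and has not been verified on a Windows node. It does not address the separate getaddrinfo warning in the log, which shows that the API server host name did not resolve.

```
[FILTER]
  # Copied from the reporter's config; the only change is that
  # Kube_Tag_Prefix has been removed, as described above.
  Name                kubernetes
  Match               kube.*
  Kube_URL            https://kubernetes.default.svc.cluster.local:443
  Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
  Merge_Log           On
  Merge_Log_Key       log_processed
  K8S-Logging.Parser  On
  K8S-Logging.Exclude Off
```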

Regarding this issue, I posted a patch to explain how to set up Fluent Bit on Kubernetes properly.

https://github.com/fluent/fluent-bit-docs/pull/353

This manual explains the basic logging concepts (such as the log file layout), so it should help you set things up.

Please feel free to tell me if anything is unclear.