fluent-bit: Hot reload finished but had "requested parser xxx not found" and other failures

Bug Report

Describe the bug

Tried 2.1.10 both in a container and on a local host machine; both show the same issue below.

Local run errors (log level: info): the fluent-bit configuration references a parsers configuration file, and that file contains several parser definitions. Triggering a hot reload produces error messages such as: parser 'extract_kubelet_log' is not registered; requested parser 'extractinfo' not found; Failed to connect to mdsd: dial unix /var/run/mdsd/default_djson.socket: connect: no such file or directory; and so on.

The fluent-bit hot-reload port also appears to stop responding after that:

curl -X GET localhost:2020/api/v2/reload
curl: (7) Failed to connect to localhost port 2020: Connection refused

After restarting with systemctl restart fluent-bit, however, fluent-bit works well.

To Reproduce

Container run failure logs (log level: debug): reloadlog.txt

  • Steps to reproduce the problem:
  1. Have the fluent-bit config reference a parsers config file.
  2. Send an HTTP POST request, curl -X POST -d '{}' localhost:2020/api/v2/reload, to reload the running fluent-bit service.
  3. Check the status of fluent-bit.
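
The steps above can be sketched as a few shell helpers. This is only a minimal sketch, assuming the HTTP server listens on localhost:2020 as in this report. The function names (trigger_reload, reload_ok, reload_port_alive) and the FB_RELOAD_URL variable are hypothetical, and the "status":0 check is based solely on the response body quoted further down in this thread.

```shell
#!/bin/sh
# Hypothetical helpers; host and port are the ones used in this report.
FB_RELOAD_URL="${FB_RELOAD_URL:-http://localhost:2020/api/v2/reload}"

# Step 2: ask the running service to reload; prints the response body.
trigger_reload() {
  curl -sS -X POST -d '{}' "$FB_RELOAD_URL"
}

# Succeeds only if a reload response body reports "status":0.
reload_ok() {
  case "$1" in
    *'"status":0}'* | *'"status":0,'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Step 3: succeeds if the reload endpoint still accepts connections
# (curl exits non-zero on "Connection refused").
reload_port_alive() {
  curl -sS -o /dev/null "$FB_RELOAD_URL"
}
```

A possible use is `body=$(trigger_reload) && reload_ok "$body" && reload_port_alive`. Note that, per the comments in this thread, a zero status does not by itself prove that the new configuration was applied.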

Expected behavior

Hot reload completes with no errors, and there is no need to restart fluent-bit with systemctl or a docker command.

Your Environment

  • Version used: 2.1.10
  • Configuration:
  • Environment name and version (e.g. Kubernetes? What version?):
  • Server type and version:
  • Operating System and version: Ubuntu 20.04.6 LTS
  • Filters and plugins: the configurations I use are uploaded below.

config.zip

About this issue

  • State: closed
  • Created 9 months ago
  • Comments: 16 (7 by maintainers)

Most upvoted comments

I would raise a separate issue explicitly with the details - it’s confusing to start other issues in comments and hard to say a PR resolves a comment vs an issue.

@patrick-stephens I have another finding from my tests, in which I deliberately introduced an indentation error in fluent-bit.conf or a syntax error in the referenced Lua file. The state of the fluent-bit process differs between ‘reload’ and ‘systemctl restart’. Is this expected behavior? A reload usually does not kill the process even when there are errors, whereas systemctl restart fails to start fluent-bit if there is any error.

Details:

  1. Deliberately introduced an indentation error in fluent-bit.conf

Reload: The reload POST request returned 200 OK and the response was {"reload":"done","status":0}. The output of systemctl status fluent-bit shows no errors during this reload, and the fluent-bit process stays alive. It appears to keep running with the old, cached fluent-bit.conf.

The error does appear in /var/log/syslog.

Restart: When we run systemctl restart fluent-bit, however, fluent-bit fails to start and the process is killed.

[screenshot: /var/log/syslog]

  2. Deliberately introduced a syntax error in the referenced Lua file

Reload:

The output of systemctl status fluent-bit shows the error [luajit] error loading script: //etc/fluent-bit/modify_user_record.lua:80: 'end' expected (to close 'for' at line 5) near '<eof>', as expected. The fluent-bit process still looks alive, but port 2020 no longer works.

This is consistent with /var/log/syslog.

Restart: Same as case 1, only with a different error message; the process is killed, as in case 1.
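
Since a reload can report success while the edited configuration is broken, one possible guard is to check the file before asking for a reload. The following is a minimal sketch, not a confirmed fix: the function name validated_reload and the variables FLUENT_BIT_BIN and FB_RELOAD_URL are hypothetical. It assumes that fluent-bit's --dry-run option exits non-zero for a configuration it cannot load; whether that check also catches errors inside a referenced Lua script has not been verified here.

```shell
#!/bin/sh
# Hypothetical wrapper: validate the config, then request a hot reload.
FLUENT_BIT_BIN="${FLUENT_BIT_BIN:-fluent-bit}"

validated_reload() {
  conf="$1"
  # Assumption: --dry-run returns a non-zero exit code on an invalid config.
  if ! "$FLUENT_BIT_BIN" --dry-run -c "$conf" >/dev/null 2>&1; then
    echo "config check failed for $conf; reload skipped" >&2
    return 1
  fi
  # Same request as in the reproduction steps of this report.
  curl -sS -X POST -d '{}' "${FB_RELOAD_URL:-http://localhost:2020/api/v2/reload}"
}
```

For example, `validated_reload /etc/fluent-bit/fluent-bit.conf` would only send the POST request when the dry run succeeds.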

I know it is in the same folder but please try with an absolute path.

@ym11369 please include the config inline to help viewing it. Can you try with an absolute rather than relative path as well for the parser config file?

@cosmo0920 I think we saw this previously with working-directory not being the default?
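
The absolute-path suggestion above can be checked mechanically. This is a minimal sketch under stated assumptions: the function name relative_parsers_files is hypothetical, it only inspects Parsers_File keys in a classic-format .conf file, and it does not follow @INCLUDE directives or handle YAML configs.

```shell
#!/bin/sh
# Print every Parsers_File value that is not an absolute path.
# Keys are compared case-insensitively; commented-out lines are ignored
# because their first field is "#" (or starts with "#").
relative_parsers_files() {
  awk 'tolower($1) == "parsers_file" && $2 !~ /^\// { print $2 }' "$1"
}
```

Running `relative_parsers_files /etc/fluent-bit/fluent-bit.conf` prints nothing when every parsers file is referenced by an absolute path; any path it does print is a candidate to rewrite before retrying the reload.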