fluent-bit: Linux packages not working for centos (aarch64 / arm64v8)
Cloned from #4007
Bug Report
Describe the bug
I’m getting an error when running Fluent Bit on CentOS 8 on ARM (aarch64).
To Reproduce
- Start a CentOS 7/8 or Red Hat Linux instance for arm64 (AWS, GCP, etc.). I tested on AWS with the CentOS Stream 8 aarch64 20210603 community AMI.
- Follow the installation process https://docs.fluentbit.io/manual/installation/linux/redhat-centos:
sudo yum install td-agent-bit
Expected behavior
Fluent Bit works.
Screenshots
Execution error:
/opt/td-agent-bit/bin/td-agent-bit -i cpu -t my_cpu -o stdout -m '*'
<jemalloc>: Unsupported system page size
<jemalloc>: Unsupported system page size
<jemalloc>: Unsupported system page size
FATAL: error reading `/proc/sys/crypto/fips_enabled' in libgcrypt: Cannot allocate memory
<jemalloc>: Unsupported system page size
Aborted (core dumped)

Service start up error:
$ service td-agent-bit status
td-agent-bit.service - TD Agent Bit
Loaded: loaded (/usr/lib/systemd/system/td-agent-bit.service; disabled; vendor preset: disabled)
Active: failed (Result: core-dump) since Tue 2021-11-02 18:21:44 UTC; 3s ago
Process: 10989 ExecStart=/opt/td-agent-bit/bin/td-agent-bit -c /etc/td-agent-bit/td-agent-bit.conf (code=dumped, signal=ABRT)
Main PID: 10989 (code=dumped, signal=ABRT)
Nov 02 18:21:44: td-agent-bit.service: Service RestartSec=100ms expired, scheduling restart.
Nov 02 18:21:44: td-agent-bit.service: Scheduled restart job, restart counter is at 5.
Nov 02 18:21:44: Stopped TD Agent Bit.
Nov 02 18:21:44: td-agent-bit.service: Start request repeated too quickly.
Nov 02 18:21:44: td-agent-bit.service: Failed with result 'core-dump'.
Nov 02 18:21:44: Failed to start TD Agent Bit.
Your Environment
- Version used: fluent-bit 1.8.9
- Configuration: default/standard
- Environment: AWS Linux CentOS 8
About this issue
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 35 (21 by maintainers)
One good aspect this picked up was that the default fluent-bit package was being built without jemalloc, so fixed two bugs for the price of 1, @noly!
@noly to keep you in the loop, the staging test workflow is in place now, so I’ll start extending tests to see if I can replicate, although probably after the holidays now.
I’ll pick this up and try to include it in the staging test updates.
The system page size appears to be 65536:
The next 1.8 release should also have it, basically the next release of either.
Did a quick test and it worked!! 👯
Amazing work everyone!! and Kudos to you @patrick-stephens!!!
@noly (and anyone else) can you test the packages from here (once completed) on your target to confirm? https://github.com/fluent/fluent-bit/actions/runs/1956875452
You can now control jemalloc configuration via the FLB_JEMALLOC_OPTIONS CMake variable. It defaults to --with-lg-quantum=3, so make sure to add that in any override.
I just changed -DFLB_JEMALLOC=On to -DFLB_JEMALLOC=Off in the dockerfiles/Dockerfile. Maybe this has worse performance, but it worked and it’s enough for me.
@ANBUZHIDAO, I would also be very interested in your ARM build.
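For anyone building from source, the two build-time workarounds mentioned in the comments above could look roughly like the sketch below. FLB_JEMALLOC, FLB_JEMALLOC_OPTIONS and --with-lg-quantum=3 come from this thread. The extra --with-lg-page=16 flag (2^16 = 65536) is an assumption about what a 64 KiB-page host might need, and the `..` source path assumes the command is run from a build/ directory inside the source tree. The script only prints the commands so it is safe to run anywhere.

```shell
#!/bin/sh
# Flags handed to jemalloc's configure script.
# --with-lg-quantum=3 is the default quoted in this thread and must be kept
# in any override; --with-lg-page=16 is an ASSUMPTION for 64 KiB pages.
JEMALLOC_OPTS="--with-lg-quantum=3 --with-lg-page=16"

# Option 1: keep jemalloc, overriding its configure flags.
CMAKE_WITH_JEMALLOC="cmake -DFLB_JEMALLOC=On -DFLB_JEMALLOC_OPTIONS=\"${JEMALLOC_OPTS}\" .."

# Option 2: build without jemalloc (the workaround reported to work above).
CMAKE_WITHOUT_JEMALLOC="cmake -DFLB_JEMALLOC=Off .."

# Print both candidates; run the one you want from the build directory.
echo "$CMAKE_WITH_JEMALLOC"
echo "$CMAKE_WITHOUT_JEMALLOC"
```

Option 2 trades jemalloc for the system allocator, which may cost some performance, as noted above.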
I’m facing the same error using the Docker image in an Oracle Linux VM with Oracle Kubernetes Engine using ARM instances. Inside any container the page size is reported to be 65536:
I would be fine to have a binary without jemalloc for the time being.
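To check whether a host is likely to hit this, a minimal sketch for reading the page size is shown below. jemalloc is compiled for a fixed maximum page size, so a binary built for 4 KiB pages aborts with "Unsupported system page size" on a 64 KiB-page kernel. The variable names are illustrative only; getconf PAGESIZE is the standard POSIX query.

```shell
#!/bin/sh
# Page size in bytes as reported by the running kernel.
# Affected hosts in this thread report 65536; many others report 4096.
page_size=$(getconf PAGESIZE)
echo "page size: ${page_size} bytes"

# Compute log2(page size) by repeated halving. This is the value that
# jemalloc's --with-lg-page configure option expects.
lg_page=0
n=$page_size
while [ "$n" -gt 1 ]; do
    n=$((n / 2))
    lg_page=$((lg_page + 1))
done
echo "log2(page size): ${lg_page}"
```

The same two commands can be run inside a container, since containers share the host kernel's page size.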